Add ModelExpress to CUDA image #2621

Open: JannikSt wants to merge 8 commits into main from feat/modelexpress-image
@JannikSt (Member) commented May 24, 2026

Summary

Adds ModelExpress support to the standard CUDA image build path.

  • adds a modelexpress optional extra pinned to modelexpress==0.3.0
  • installs that extra from Dockerfile.cuda during the existing locked uv sync
  • pins protobuf to the latest compatible 5.x range via the existing uv override mechanism because ModelExpress 0.3.0 requires protobuf<6
  • updates uv.lock with modelexpress and protobuf==5.29.6
  • builds and ships UCX 1.19.1 with CUDA + verbs support under /opt/ucx so NIXL can use a CUDA-capable UCX backend instead of distro libucx0
  • installs the RDMA runtime libraries needed by that UCX build

Notes

This keeps the hosted-rl image workflow unchanged: it can build this prime-rl branch into the normal ghcr.io/primeintellect-ai/hosted-rl/prime-rl:<tag> image.

The protobuf override is needed because the current env stack pulls in prime-sandboxes, whose published metadata currently resolves to protobuf 6.x, while ModelExpress's generated protobuf code is built for protobuf 5.x.
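
The protobuf constraint described above could be verified inside a built image or in CI. The following is a minimal sketch and not part of this PR: it assumes the only requirement is the `protobuf<6` bound stated in the summary, and the helper names `is_protobuf_compatible` and `installed_protobuf_version` are hypothetical.

```python
from importlib import metadata
from typing import Optional


def is_protobuf_compatible(version: str) -> bool:
    """Return True if `version` satisfies protobuf<6 (bound assumed from the PR summary)."""
    major = int(version.split(".")[0])
    return major < 6


def installed_protobuf_version() -> Optional[str]:
    """Return the installed protobuf version, or None if protobuf is not installed."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_protobuf_version()
    if version is None:
        print("protobuf is not installed")
    else:
        status = "compatible" if is_protobuf_compatible(version) else "too new"
        print(f"protobuf {version}: {status}")
```

Running this in the image after `uv sync` would show whether the resolved protobuf falls under the bound. It checks only the major version and says nothing about whether the generated ModelExpress code actually imports cleanly.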

The first Telus pilot image reached vLLM with load_format: mx but failed because NIXL could not load UCX (libucp.so.0 missing). Adding distro libucx0 let NIXL discover UCX, but exposed that the distro UCX is 1.18.1 and lacks CUDA support. This branch now builds UCX 1.19.1 from source with CUDA and verbs support.
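
The `libucp.so.0 missing` failure described above is the kind of problem a small load check can surface before vLLM starts. The sketch below is an illustration rather than part of this PR: the helper names `can_load` and `ucx_version` are hypothetical, and it assumes the dynamic loader can already find the library (for example via `LD_LIBRARY_PATH`). It reports whether UCX loads and which version it is, but it does not prove that the build has CUDA support.

```python
import ctypes
from typing import Optional


def can_load(library: str) -> bool:
    """Return True if the dynamic loader can open `library`."""
    try:
        ctypes.CDLL(library)
    except OSError:
        return False
    return True


def ucx_version(library: str = "libucp.so.0") -> Optional[str]:
    """Return the UCX version string, or None if the library or symbol is unavailable."""
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        return None
    # ucp_get_version_string() belongs to the UCP C API and returns a static C string.
    get_version = getattr(lib, "ucp_get_version_string", None)
    if get_version is None:
        return None
    get_version.restype = ctypes.c_char_p
    return get_version().decode()


if __name__ == "__main__":
    print("libucp.so.0 loadable:", can_load("libucp.so.0"))
    print("UCX version:", ucx_version())
```

A result of `None` from `ucx_version()` would correspond to the first pilot failure, while a 1.18.x version string would suggest the distro library is being picked up instead of the build under /opt/ucx.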

Validation

  • uv lock --check
  • git diff --check
  • hosted-rl image workflow completed for earlier branch revisions and published GHCR images
  • Telus pilot verified ModelExpress/vLLM wiring and exposed the UCX runtime requirements

No local Docker image build was run.


Note

Medium Risk
Touches container dependency resolution (protobuf overrides) and adds a non-trivial UCX source build plus RDMA libs, which affects runtime networking/GPU transfer paths for NIXL-backed inference.

Overview
Adds ModelExpress to the CUDA container build: a new modelexpress optional extra (modelexpress==0.3.0), included in the locked uv sync in Dockerfile.cuda, with uv.lock updated accordingly.

Resolves a protobuf version clash by overriding ModelExpress’s published metadata so resolution stays on a compatible 5.x–6.x band while the rest of the stack (e.g. prime-sandboxes) can still pull newer protobuf.

Ships a CUDA-capable UCX 1.19.1 built in the builder stage (verbs + CUDA, installed under /opt/ucx) and copies it into the runtime image with UCX_HOME and LD_LIBRARY_PATH, plus RDMA-related build/runtime packages—so NIXL/ModelExpress can load UCX with GPU support instead of relying on distro UCX that lacked CUDA.

Reviewed by Cursor Bugbot for commit 3fddc64. Bugbot is set up for automated code reviews on this repo.

@JannikSt JannikSt marked this pull request as ready for review May 25, 2026 23:52
@JannikSt (Member, Author) commented

@codex review

@chatgpt-codex-connector commented

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit ca96dcb.

Comment thread pyproject.toml Outdated
@JannikSt (Member, Author) commented

@codex review

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 944ec179f3

Comment thread Dockerfile.cuda Outdated
@JannikSt (Member, Author) commented

@codex review

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc1954bab3

Comment thread pyproject.toml Outdated
@JannikSt (Member, Author) commented

@codex review

@chatgpt-codex-connector commented

Codex Review: Didn't find any major issues. Bravo.
