ci: switch vLLM containers to upstream base images #7648
alec-flowers wants to merge 17 commits into main from
Conversation
Walkthrough

This pull request transitions vLLM container builds from custom-built images to upstream vllm/vllm-openai images with arch-specific runtime selection, replaces msgpack with msgspec for serialization, removes complex build scripts, and adds Ray node discovery patching while simplifying Docker image composition and dependency management.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 3
🧹 Nitpick comments (5)
examples/backends/vllm/launch/disagg_multimodal_epd.sh (1)

124-134: Hardcoded NIXL side-channel and ZMQ ports reduce portability.

The `VLLM_NIXL_SIDE_CHANNEL_PORT` values (20097-20099) and ZMQ endpoint ports (20080-20082) are static. If multiple test runs or instances execute concurrently (e.g., pytest-xdist), these will conflict. Consider making these configurable via environment variables similar to the `DYN_SYSTEM_PORT*` pattern, or using `alloc_port` from `launch_utils.sh` if available.

♻️ Example pattern for configurable NIXL/ZMQ ports
```diff
+# NIXL side-channel ports (override via environment variables)
+DYN_NIXL_PORT_ENCODE=${DYN_NIXL_PORT_ENCODE:-20097}
+DYN_NIXL_PORT_PREFILL=${DYN_NIXL_PORT_PREFILL:-20098}
+DYN_NIXL_PORT_DECODE=${DYN_NIXL_PORT_DECODE:-20099}
+
+# ZMQ KV event ports
+DYN_ZMQ_PORT_ENCODE=${DYN_ZMQ_PORT_ENCODE:-20080}
+DYN_ZMQ_PORT_PREFILL=${DYN_ZMQ_PORT_PREFILL:-20081}
+DYN_ZMQ_PORT_DECODE=${DYN_ZMQ_PORT_DECODE:-20082}
+
 echo "System ports: encode=${DYN_SYSTEM_PORT_ENCODE}, prefill=${DYN_SYSTEM_PORT_PREFILL}, decode=${DYN_SYSTEM_PORT_DECODE}"
+echo "NIXL ports: encode=${DYN_NIXL_PORT_ENCODE}, prefill=${DYN_NIXL_PORT_PREFILL}, decode=${DYN_NIXL_PORT_DECODE}"
-VLLM_NIXL_SIDE_CHANNEL_PORT=20097 DYN_SYSTEM_PORT=$DYN_SYSTEM_PORT_ENCODE ...
+VLLM_NIXL_SIDE_CHANNEL_PORT=$DYN_NIXL_PORT_ENCODE DYN_SYSTEM_PORT=$DYN_SYSTEM_PORT_ENCODE ...
```

Based on learnings: "Flag hard-coded portability-reducing constants in shell/scripts across the repository (e.g., fixed memory sizes, fixed memory fractions, static ports...). Prefer portable alternatives available in the repo: use alloc_port for dynamic ports..."
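Where a shell `alloc_port` helper is unavailable, the underlying mechanism can be sketched in Python. This is a hypothetical helper (not part of `launch_utils.sh`): binding to port 0 asks the OS for a free ephemeral port, which a launcher can then export into the worker environment.

```python
import socket

def alloc_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a free TCP port by binding to port 0.

    Note: the port is released when the socket closes, so a small race
    remains; launchers should still tolerate bind failures and retry.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((host, 0))
        return s.getsockname()[1]

if __name__ == "__main__":
    # e.g. export VLLM_NIXL_SIDE_CHANNEL_PORT=$(python3 alloc_port.py)
    print(alloc_free_port())
```

With this shape, concurrent runs each get a distinct port instead of colliding on a fixed 20097-20099 range.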
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@examples/backends/vllm/launch/disagg_multimodal_epd.sh` around lines 124-134: the script hardcodes NIXL side-channel ports (VLLM_NIXL_SIDE_CHANNEL_PORT=20097/20098/20099) and ZMQ endpoint ports (tcp://*:20080/20081/20082), which causes conflicts. Change these to be allocated/configurable by using alloc_port from launch_utils.sh or environment variables (e.g., VLLM_NIXL_SIDE_CHANNEL_PORT_ENCODE/PREFILL/DECODE and VLLM_ZMQ_PORT_ENCODE/PREFILL/DECODE) before launching each worker, then reference those variables in the VLLM_NIXL_SIDE_CHANNEL_PORT and --kv-events-config endpoint strings; ensure ports are exported/initialized early (call alloc_port where available) so concurrent runs pick different ports.

components/src/dynamo/vllm/handlers.py (1)
585-594: Duplicate `_NodeInfo` class and patch—consider removing.

This inline `_NodeInfo` class and `list_nodes` patch duplicate the module-level implementation at lines 77-93. Since the module-level patch runs at import time (before `scale_elastic_ep` is ever called), this method-level patch is redundant.

♻️ Suggested fix: Remove the duplicate inline patch
```diff
         try:
-            # TODO(upstream-vllm): remove this patch once vLLM fixes
-            # add_dp_placement_groups in vllm/v1/engine/utils.py to use ray.nodes()
-            # instead of ray.util.state.list_nodes().
-            #
-            # Patch ray.util.state.list_nodes to use the GCS API instead of the
-            # dashboard HTTP API (127.0.0.1:8265/api/v0/nodes). The dynamo image
-            # installs ray core only (not ray[default]), so the dashboard HTTP server
-            # starts in --minimal mode with the HTTP server disabled. vLLM's
-            # add_dp_placement_groups calls list_nodes() which requires that HTTP
-            # endpoint, causing scale_elastic_ep to fail with "Failed to connect to
-            # API server".
-            #
-            # ray.nodes() uses the GCS gRPC channel directly (no dashboard process
-            # needed) and returns the same information. Imported lazily so ray is not
-            # required at module load time (absent in non-elastic-EP deployments).
-            #
-            # Format mapping:
-            #   list_nodes() → objects with .node_ip and .node_id
-            #   ray.nodes()  → dicts with "NodeManagerAddress" and "NodeID"
-            import ray
-            import ray.util.state as _ray_util_state
-
-            class _NodeInfo:
-                __slots__ = ("node_id", "node_ip")
-
-                def __init__(self, d: dict) -> None:
-                    self.node_ip: str = d["NodeManagerAddress"]
-                    self.node_id: str = d["NodeID"]
-
-            _ray_util_state.list_nodes = lambda **kw: [
-                _NodeInfo(n) for n in ray.nodes() if n.get("Alive", False)
-            ]
             await self.engine_client.scale_elastic_ep(new_dp_size)
```

The TODO comment and documentation are valuable—keeping them in the module-level location (lines 59-75) preserves context while eliminating duplication.
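The shape of the module-level shim the comment defers to can be sketched standalone. Here `fake_ray_nodes` is a stand-in for `ray.nodes()` (the real patch assigns the lambda onto `ray.util.state.list_nodes`); only the dict-to-object mapping and the `Alive` filter are shown.

```python
class _NodeInfo:
    """Adapter: ray.nodes() dicts -> objects with .node_ip / .node_id,
    matching what list_nodes() callers expect."""

    __slots__ = ("node_id", "node_ip")

    def __init__(self, d: dict) -> None:
        self.node_ip: str = d["NodeManagerAddress"]
        self.node_id: str = d["NodeID"]


def fake_ray_nodes() -> list[dict]:
    # Stand-in for ray.nodes(); dead nodes must be filtered out.
    return [
        {"NodeManagerAddress": "10.0.0.1", "NodeID": "a" * 8, "Alive": True},
        {"NodeManagerAddress": "10.0.0.2", "NodeID": "b" * 8, "Alive": False},
    ]


# Same signature shape as the patch: accept and ignore list_nodes() kwargs.
list_nodes = lambda **kw: [
    _NodeInfo(n) for n in fake_ray_nodes() if n.get("Alive", False)
]
```

Because the adapter exposes the same attributes `add_dp_placement_groups` reads, the caller cannot tell the dashboard HTTP API was bypassed.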
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@components/src/dynamo/vllm/handlers.py` around lines 585-594: remove the duplicate inline _NodeInfo class and the local patch of _ray_util_state.list_nodes inside the scale_elastic_ep-related code path; delete the class _NodeInfo defined with __slots__ ("node_id", "node_ip") and the lambda assignment `_ray_util_state.list_nodes = lambda **kw: [_NodeInfo(n) for n in ray.nodes() if n.get("Alive", False)]`. Rely on the existing module-level _NodeInfo and the module-level patch of _ray_util_state.list_nodes (keep the TODO/documentation there) so there is a single canonical implementation.

components/src/dynamo/vllm/main.py (2)
62-64: Remove unused `TYPE_CHECKING` block.

This `if TYPE_CHECKING: pass` block is dead code—it does nothing at runtime or for static analysis. The `TYPE_CHECKING` import on line 10 is also unused since there are no type-only imports guarded by this block. If `OmniConfig` was removed and this block was intended to hold a type-only import, either add the appropriate import or remove both the block and the `TYPE_CHECKING` import.

♻️ Suggested fix
```diff
-from typing import TYPE_CHECKING, Any, Optional
+from typing import Any, Optional
```

And remove lines 62-64 entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/src/dynamo/vllm/main.py` around lines 62 - 64, Remove the dead TYPE_CHECKING guard and its unused import: delete the "if TYPE_CHECKING: pass" block and the corresponding "TYPE_CHECKING" import at the top of the file (or replace it with the intended type-only import if you actually need a guarded type import such as OmniConfig); ensure no remaining references to TYPE_CHECKING remain in the module.
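For contrast, a useful `TYPE_CHECKING` guard holds a real type-only import. A minimal self-contained sketch (using `decimal.Decimal` as a stand-in for a heavy or optional dependency such as `OmniConfig`):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers, never executed at runtime;
    # keeps the dependency out of module import time.
    from decimal import Decimal  # stand-in for a type-only import


def total(amounts: "list[Decimal]") -> str:
    # The annotation is a string, so Decimal need not exist at runtime.
    return str(sum(amounts))
```

A guard that contains only `pass`, by contrast, serves neither the checker nor the runtime, which is why the comment recommends deleting it.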
188-190: Restore type safety for the `config` parameter by using `Config | OmniConfig` instead of `Any`.

The `config` parameter accepts both `Config` (from vllm/args.py) and `OmniConfig` (from vllm/omni/args.py), as both classes have the required attributes (engine_args, namespace, component, endpoint, model). Using `Any` disables type checking. Define the union type under `TYPE_CHECKING` to avoid import overhead:

♻️ Suggested fix
```diff
 from typing import TYPE_CHECKING, Any, Optional

+if TYPE_CHECKING:
+    from .omni.args import OmniConfig

 def setup_metrics_collection(
-    config: Any, generate_endpoint: Endpoint, logger: logging.Logger
+    config: Config | OmniConfig, generate_endpoint: Endpoint, logger: logging.Logger
 ) -> None:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/src/dynamo/vllm/main.py` around lines 188-190: the config parameter in setup_metrics_collection should be typed as a union of vllm.args.Config and vllm.omni.args.OmniConfig instead of Any to restore type safety. Wrap imports in a TYPE_CHECKING block (from typing import TYPE_CHECKING) to avoid runtime import overhead, then annotate the function as def setup_metrics_collection(config: "Config | OmniConfig", generate_endpoint: Endpoint, logger: logging.Logger) -> None (or use Config | OmniConfig directly if using from __future__ import annotations), ensuring you reference the Config and OmniConfig symbols from vllm.args and vllm.omni.args respectively so static type checkers can validate access to engine_args, namespace, component, endpoint, and model.

container/compliance/resolve_base_image.py (1)
87-93: Code would silently coerce multi-platform strings; however, no current callers pass comma-delimited values to this action.

The `split("/")[-1]` normalization is problematic for inputs like `linux/amd64,linux/arm64` (would coerce to `arm64`), but examining all workflows shows that compliance-scan's `arch` input only receives single values: either explicit `linux/amd64` / `linux/arm64` or matrix-expanded `${{ matrix.arch }}` (which expands to individual values). Comma-delimited platforms appear elsewhere (e.g., passed to `init-dynamo-builder`), but not directly to compliance-scan.

That said, the code lacks the validation layer that other actions use (e.g., render.py for docker-remote-build). To be safe, consider explicitly rejecting invalid formats (e.g., reject `,` or enforce exact matches: `{amd64, arm64, linux/amd64, linux/arm64}`).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@container/compliance/resolve_base_image.py` around lines 87 - 93, The current normalization using arch = args.arch.split("/")[-1] silently coerces multi-platform strings; replace that behavior by validating args.arch exactly against the allowed set {"amd64", "arm64", "linux/amd64", "linux/arm64"} and explicitly reject any input containing commas or other unexpected formats; update the validation block that currently uses args.arch and arch to (1) check for a comma and exit with an error if present, and (2) check args.arch (not the split result) is one of the four exact allowed values, logging a clear error message and sys.exit(1) when it is not.
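The strict validation the comment asks for can be sketched as a small helper. This is a hypothetical function, not the actual `resolve_base_image.py` code; it raises instead of calling `sys.exit(1)` so it is easy to reuse and test.

```python
ALLOWED_ARCHES = {"amd64", "arm64", "linux/amd64", "linux/arm64"}


def validate_arch(value: str) -> str:
    """Return the short arch name, rejecting multi-platform strings."""
    if "," in value:
        # Reject comma-delimited platform lists outright instead of
        # silently coercing e.g. "linux/amd64,linux/arm64" to "arm64".
        raise ValueError(f"--arch must name a single platform, got {value!r}")
    if value not in ALLOWED_ARCHES:
        raise ValueError(
            f"--arch must be one of {sorted(ALLOWED_ARCHES)}, got {value!r}"
        )
    # Normalization is now safe: the input is a single known platform.
    return value.split("/")[-1]
```

A CLI wrapper would catch `ValueError`, log the message, and `sys.exit(1)`, matching the pattern the review prompt describes.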
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/src/dynamo/common/multimodal/embedding_transfer.py`:
- Line 16: The module imports msgspec (see the import msgspec in
embedding_transfer.py) but the vLLM runtime build installs the Dynamo wheels
with --no-deps so msgspec may not be present; fix by ensuring msgspec is
installed into the runtime image before wheel installation — either install
requirements.common.txt (which pins msgspec) or explicitly pip install msgspec
prior to installing the Dynamo wheels (or add msgspec as a dependency in the
wheel build), so any imports of msgspec in functions/classes under
multimodal/embedding_transfer.py succeed.
In `@container/render.py`:
- Around line 99-103: The CLI currently allows --device choices like cpu/xpu in
parse_args while validate_args enforces vllm only supports "cuda", causing late
failure; update the contract so failures surface earlier by either restricting
parse_args choices when framework is vllm or making the validation
framework-specific and clearer: modify parse_args to conditionally limit
--device choices to ["cuda"] when the selected framework/back-end is "vllm" (or,
if you prefer, change validate_args to raise a framework-specific, descriptive
error mentioning "vllm" and "--device" so callers fail fast), and reference the
validate_args and parse_args functions plus the "--device" flag and "vllm"
backend when implementing the change.
In `@pyproject.toml`:
- Around line 53-54: The comment above the dependency is incorrect — the extra
"vllm" does include vllm-omni==0.16.0, so either remove or correct the comment
in pyproject.toml (referencing the "vllm-omni==0.16.0" line and the "vllm"
extra) to state the accurate behavior, or if the note pertains only to the
container/runtime image, move that note out of pyproject.toml into the container
docs; update the comment text accordingly to avoid the contradiction.
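The msgspec runtime-dependency concern above (wheels installed with `--no-deps` may silently miss msgspec) could be caught early with an import guard. A hypothetical fail-fast helper, not existing repo code:

```python
import importlib.util


def require_module(name: str, hint: str = "") -> None:
    """Raise a descriptive ImportError if `name` is not installed.

    Useful when wheels are installed with --no-deps and a transitive
    dependency such as msgspec may be missing from the runtime image.
    """
    if importlib.util.find_spec(name) is None:
        msg = f"required module {name!r} is not installed"
        if hint:
            msg += f"; {hint}"
        raise ImportError(msg)


# Example usage at module import time in embedding_transfer.py:
# require_module("msgspec", "install requirements.common.txt before the Dynamo wheels")
```

This turns a late, cryptic `ModuleNotFoundError` inside a request path into an actionable error at startup.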
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8891f7b0-f61a-4038-874c-efa3164a1090
📒 Files selected for processing (26)

- .github/actions/compliance-scan/action.yml
- components/src/dynamo/common/multimodal/embedding_transfer.py
- components/src/dynamo/trtllm/publisher.py
- components/src/dynamo/vllm/handlers.py
- components/src/dynamo/vllm/main.py
- container/Dockerfile.template
- container/README.md
- container/compliance/README.md
- container/compliance/resolve_base_image.py
- container/context.yaml
- container/deps/README.md
- container/deps/requirements.common.txt
- container/deps/requirements.vllm.txt
- container/deps/vllm/install_vllm.sh
- container/render.py
- container/templates/args.Dockerfile
- container/templates/vllm_framework.Dockerfile
- container/templates/vllm_runtime.Dockerfile
- container/templates/wheel_builder.Dockerfile
- deploy/sanity_check.py
- examples/backends/vllm/launch/disagg_multimodal_e_pd.sh
- examples/backends/vllm/launch/disagg_multimodal_epd.sh
- lib/gpu_memory_service/setup.py
- pyproject.toml
- tests/fault_tolerance/deploy/container/Dockerfile.local_vllm
- tests/fault_tolerance/deploy/container/build_from_local_vllm.sh
💤 Files with no reviewable changes (8)
- container/deps/requirements.vllm.txt
- container/deps/README.md
- lib/gpu_memory_service/setup.py
- tests/fault_tolerance/deploy/container/build_from_local_vllm.sh
- tests/fault_tolerance/deploy/container/Dockerfile.local_vllm
- container/templates/vllm_framework.Dockerfile
- container/deps/vllm/install_vllm.sh
- container/deps/requirements.common.txt
Force-pushed 994a545 to 505010e
Force-pushed 2863195 to 18bc7b0
Force-pushed 18bc7b0 to 032bc69
Force-pushed 032bc69 to c97b10a
Force-pushed c97b10a to 2a6cdbe
Force-pushed 2a6cdbe to 4e26964
Force-pushed 4e26964 to 0522768
Summary

- Switch the `vllm` container path to upstream multi-arch `vllm/vllm-openai` base images and keep the runtime lean by layering only Dynamo-owned wheels with `--no-deps`
- `--framework` option and broad-copying `examples/` into the vLLM image instead of adding new shared pytest plumbing

Validation

- `python3 container/render.py --framework=vllm --target=runtime --platform=linux/amd64 --cuda-version=12.9 --output-short-filename`
- `docker build -f container/rendered.Dockerfile -t dynamo:vllm-runtime-local .`
- `docker build -f container/Dockerfile.test --build-arg BASE_IMAGE=dynamo:vllm-runtime-local -t dynamo:vllm-test-local .`
- `docker run --rm -w /workspace --network host dynamo:vllm-test-local sh -lc 'pytest -v --collect-only -m "pre_merge and vllm and gpu_0"'`
- `python3 -m py_compile components/src/dynamo/profiler/tests/unit/test_replay_aic_parity.py lib/bindings/python/tests/replay/replay_utils.py tests/conftest.py tests/deploy/conftest.py tests/fault_tolerance/deploy/conftest.py tests/fault_tolerance/deploy/test_deployment.py`
- `git diff --check`