
ci: switch vLLM containers to upstream base images #7648

Open
alec-flowers wants to merge 17 commits into main from feat/DYN-2204-vllm-upstream-base-pattern-refresh

Conversation

@alec-flowers
Contributor

@alec-flowers alec-flowers commented Mar 26, 2026

Summary

  • Switch the vLLM container path to upstream multi-arch vllm/vllm-openai base images, keeping the runtime lean by layering only Dynamo-owned wheels with --no-deps.
  • Simplify the vLLM image-selection/compliance path now that upstream tags are multi-arch, removing this PR's extra architecture-specific tag handling.
  • Move the AIC replay parity coverage out of the vLLM runtime lane into profiler-owned planner coverage.
  • Keep the fault-tolerance/deploy collection fix small by restoring the deploy-local --framework option and broad-copying examples/ into the vLLM image instead of adding new shared pytest plumbing.
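
The wheel-layering pattern described above can be sketched as a Dockerfile fragment. This is an illustrative sketch, not the PR's actual template — the base image tag and wheel paths are assumptions:

```dockerfile
# Start from the upstream multi-arch vLLM image (tag is illustrative)
FROM vllm/vllm-openai:latest

# Layer only Dynamo-owned wheels; --no-deps keeps the runtime lean and
# prevents pip from re-resolving (and potentially upgrading) the
# dependencies already pinned inside the upstream image
COPY dist/*.whl /tmp/wheels/
RUN pip install --no-deps /tmp/wheels/*.whl && rm -rf /tmp/wheels
```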

Validation

  • python3 container/render.py --framework=vllm --target=runtime --platform=linux/amd64 --cuda-version=12.9 --output-short-filename
  • docker build -f container/rendered.Dockerfile -t dynamo:vllm-runtime-local .
  • docker build -f container/Dockerfile.test --build-arg BASE_IMAGE=dynamo:vllm-runtime-local -t dynamo:vllm-test-local .
  • docker run --rm -w /workspace --network host dynamo:vllm-test-local sh -lc 'pytest -v --collect-only -m "pre_merge and vllm and gpu_0"'
  • python3 -m py_compile components/src/dynamo/profiler/tests/unit/test_replay_aic_parity.py lib/bindings/python/tests/replay/replay_utils.py tests/conftest.py tests/deploy/conftest.py tests/fault_tolerance/deploy/conftest.py tests/fault_tolerance/deploy/test_deployment.py
  • git diff --check

@alec-flowers alec-flowers requested a review from a team March 26, 2026 03:26
@alec-flowers alec-flowers requested review from a team as code owners March 26, 2026 03:26
@github-actions github-actions Bot added the documentation (Improvements or additions to documentation), backend::vllm (Relates to the vllm backend), backend::trtllm (Relates to the trtllm backend), container, and actions labels Mar 26, 2026
@alec-flowers alec-flowers changed the title Switch vLLM containers to upstream base images ci: switch vLLM containers to upstream base images Mar 26, 2026
@github-actions github-actions Bot added the ci Issues/PRs that reference CI build/test label Mar 26, 2026
@coderabbitai
Contributor

coderabbitai Bot commented Mar 26, 2026

Walkthrough

This pull request transitions vLLM container builds from custom-built images to upstream vllm/vllm-openai images with arch-specific runtime selection, replaces msgpack with msgspec for serialization, removes complex build scripts, and adds Ray node discovery patching while simplifying Docker image composition and dependency management.

Changes

  • Container Runtime & Multi-Arch Support (.github/actions/compliance-scan/action.yml, container/Dockerfile.template, container/render.py, container/context.yaml): Base image resolution now incorporates an architecture flag; vLLM runtime selection is consolidated to the vllm/vllm-openai image; non-CUDA device support is removed from vLLM validation; context.yaml is restructured with arch-specific runtime image tags and drops legacy build parameters (vllm_ref, xpu, cpu, flashinf_ref, etc.).
  • Serialization Migration (components/src/dynamo/common/multimodal/embedding_transfer.py, components/src/dynamo/trtllm/publisher.py, container/deps/requirements.common.txt, lib/gpu_memory_service/setup.py, pyproject.toml): Replaced msgpack with msgspec for ZMQ event and NIXL notification serialization; removed the msgpack dependency from requirements and pyproject configurations.
  • Build Template Refactoring (container/templates/args.Dockerfile, container/templates/vllm_runtime.Dockerfile, container/templates/wheel_builder.Dockerfile): Simplified Docker argument declarations with multi-arch support for vLLM; consolidated runtime stage logic; added an hwloc build and a NIXL plugin disable option; removed device-specific (xpu/cpu) branching and extensive vLLM parameterization.
  • Build Script & Framework Stage Cleanup (container/templates/vllm_framework.Dockerfile, container/deps/vllm/install_vllm.sh, container/deps/README.md, container/deps/requirements.vllm.txt): Removed the entire vllm framework Docker stage and its install script; eliminated the vLLM-specific Python requirements file and its documentation reference.
  • vLLM Source Code Updates (components/src/dynamo/vllm/handlers.py, components/src/dynamo/vllm/main.py): Added a Ray node discovery patch with optional dependency handling; broadened the metrics config parameter type from a union to Any.
  • Example & Runtime Configuration (examples/backends/vllm/launch/disagg_multimodal_e_pd.sh, examples/backends/vllm/launch/disagg_multimodal_epd.sh, deploy/sanity_check.py): Added UCX transport constraints and per-worker system port configuration for disaggregated multimodal pipelines; updated vLLM version detection to upstream package paths.
  • Documentation & Compliance (container/README.md, container/compliance/README.md, container/compliance/resolve_base_image.py): Updated documentation to reflect the upstream vllm/vllm-openai runtime and architecture-specific base image selection; extended resolve_base_image.py with an arch parameter and runtime image tag mapping.
  • Test Infrastructure Removal (tests/fault_tolerance/deploy/container/Dockerfile.local_vllm, tests/fault_tolerance/deploy/container/build_from_local_vllm.sh): Removed local vLLM Docker build test utilities and the associated build script.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 50.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Description check — ❓ Inconclusive: The PR description covers the main changes (upstream base images, lean runtime, architecture-aware selection) and includes validation commands, but does not follow the template structure with Overview, Details, Where to start, and Related Issues sections. Resolution: restructure the description to match the template: add an Overview section, organize Details by component/change, specify reviewer starting points, and include Related Issues with GitHub issue numbers using action keywords.
✅ Passed checks (1 passed)
  • Title check — ✅ Passed: The title clearly and concisely summarizes the main change: switching vLLM containers to use upstream base images instead of local source builds.


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (5)
examples/backends/vllm/launch/disagg_multimodal_epd.sh (1)

124-134: Hardcoded NIXL side-channel and ZMQ ports reduce portability.

The VLLM_NIXL_SIDE_CHANNEL_PORT values (20097-20099) and ZMQ endpoint ports (20080-20082) are static. If multiple test runs or instances execute concurrently (e.g., pytest-xdist), these will conflict.

Consider making these configurable via environment variables similar to the DYN_SYSTEM_PORT* pattern, or using alloc_port from launch_utils.sh if available.

♻️ Example pattern for configurable NIXL/ZMQ ports
+# NIXL side-channel ports (override via environment variables)
+DYN_NIXL_PORT_ENCODE=${DYN_NIXL_PORT_ENCODE:-20097}
+DYN_NIXL_PORT_PREFILL=${DYN_NIXL_PORT_PREFILL:-20098}
+DYN_NIXL_PORT_DECODE=${DYN_NIXL_PORT_DECODE:-20099}
+
+# ZMQ KV event ports
+DYN_ZMQ_PORT_ENCODE=${DYN_ZMQ_PORT_ENCODE:-20080}
+DYN_ZMQ_PORT_PREFILL=${DYN_ZMQ_PORT_PREFILL:-20081}
+DYN_ZMQ_PORT_DECODE=${DYN_ZMQ_PORT_DECODE:-20082}
+
 echo "System ports: encode=${DYN_SYSTEM_PORT_ENCODE}, prefill=${DYN_SYSTEM_PORT_PREFILL}, decode=${DYN_SYSTEM_PORT_DECODE}"
+echo "NIXL ports: encode=${DYN_NIXL_PORT_ENCODE}, prefill=${DYN_NIXL_PORT_PREFILL}, decode=${DYN_NIXL_PORT_DECODE}"
 
-VLLM_NIXL_SIDE_CHANNEL_PORT=20097 DYN_SYSTEM_PORT=$DYN_SYSTEM_PORT_ENCODE ...
+VLLM_NIXL_SIDE_CHANNEL_PORT=$DYN_NIXL_PORT_ENCODE DYN_SYSTEM_PORT=$DYN_SYSTEM_PORT_ENCODE ...

Based on learnings: "Flag hard-coded portability-reducing constants in shell/scripts across the repository (e.g., fixed memory sizes, fixed memory fractions, static ports...). Prefer portable alternatives available in the repo: use alloc_port for dynamic ports..."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/backends/vllm/launch/disagg_multimodal_epd.sh` around lines 124 -
134, The script hardcodes NIXL side-channel ports
(VLLM_NIXL_SIDE_CHANNEL_PORT=20097/20098/20099) and ZMQ endpoint ports
(tcp://*:20080/20081/20082) which causes conflicts; change these to be
allocated/configurable by using alloc_port from launch_utils.sh or environment
variables (e.g., VLLM_NIXL_SIDE_CHANNEL_PORT_ENCODE/PREFILL/DECODE and
VLLM_ZMQ_PORT_ENCODE/PREFILL/DECODE) before launching each worker and then
reference those variables in the VLLM_NIXL_SIDE_CHANNEL_PORT and
--kv-events-config endpoint strings; ensure ports are exported/initialized early
(call alloc_port where available) so concurrent runs pick different ports.
components/src/dynamo/vllm/handlers.py (1)

585-594: Duplicate _NodeInfo class and patch—consider removing.

This inline _NodeInfo class and list_nodes patch duplicate the module-level implementation at lines 77-93. Since the module-level patch runs at import time (before scale_elastic_ep is ever called), this method-level patch is redundant.

♻️ Suggested fix: Remove the duplicate inline patch
         try:
-            # TODO(upstream-vllm): remove this patch once vLLM fixes
-            # add_dp_placement_groups in vllm/v1/engine/utils.py to use ray.nodes()
-            # instead of ray.util.state.list_nodes().
-            #
-            # Patch ray.util.state.list_nodes to use the GCS API instead of the
-            # dashboard HTTP API (127.0.0.1:8265/api/v0/nodes). The dynamo image
-            # installs ray core only (not ray[default]), so the dashboard HTTP server
-            # starts in --minimal mode with the HTTP server disabled. vLLM's
-            # add_dp_placement_groups calls list_nodes() which requires that HTTP
-            # endpoint, causing scale_elastic_ep to fail with "Failed to connect to
-            # API server".
-            #
-            # ray.nodes() uses the GCS gRPC channel directly (no dashboard process
-            # needed) and returns the same information. Imported lazily so ray is not
-            # required at module load time (absent in non-elastic-EP deployments).
-            #
-            # Format mapping:
-            #   list_nodes() → objects with .node_ip and .node_id
-            #   ray.nodes()  → dicts with "NodeManagerAddress" and "NodeID"
-            import ray
-            import ray.util.state as _ray_util_state
-
-            class _NodeInfo:
-                __slots__ = ("node_id", "node_ip")
-
-                def __init__(self, d: dict) -> None:
-                    self.node_ip: str = d["NodeManagerAddress"]
-                    self.node_id: str = d["NodeID"]
-
-            _ray_util_state.list_nodes = lambda **kw: [
-                _NodeInfo(n) for n in ray.nodes() if n.get("Alive", False)
-            ]
-
             await self.engine_client.scale_elastic_ep(new_dp_size)

The TODO comment and documentation are valuable—keeping them in the module-level location (lines 59-75) preserves context while eliminating duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/src/dynamo/vllm/handlers.py` around lines 585 - 594, Remove the
duplicate inline _NodeInfo class and the local patch of
_ray_util_state.list_nodes inside the scale_elastic_ep-related code path: delete
the class _NodeInfo defined with __slots__ ("node_id","node_ip") and the lambda
assignment _ray_util_state.list_nodes = lambda **kw: [ _NodeInfo(n) for n in
ray.nodes() if n.get("Alive", False) ]; rely on the existing module-level
_NodeInfo and the module-level patch of _ray_util_state.list_nodes (keep the
TODO/documentation there) so there is a single canonical implementation.
components/src/dynamo/vllm/main.py (2)

62-64: Remove unused TYPE_CHECKING block.

This if TYPE_CHECKING: pass block is dead code—it does nothing at runtime or for static analysis. The TYPE_CHECKING import on line 10 is also unused since there are no type-only imports guarded by this block.

If OmniConfig was removed and this block was intended to hold a type-only import, either add the appropriate import or remove both the block and the TYPE_CHECKING import.

♻️ Suggested fix
-from typing import TYPE_CHECKING, Any, Optional
+from typing import Any, Optional

And remove lines 62-64 entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/src/dynamo/vllm/main.py` around lines 62 - 64, Remove the dead
TYPE_CHECKING guard and its unused import: delete the "if TYPE_CHECKING: pass"
block and the corresponding "TYPE_CHECKING" import at the top of the file (or
replace it with the intended type-only import if you actually need a guarded
type import such as OmniConfig); ensure no remaining references to TYPE_CHECKING
remain in the module.

188-190: Restore type safety for config parameter by using Config | OmniConfig instead of Any.

The config parameter accepts both Config (from vllm/args.py) and OmniConfig (from vllm/omni/args.py), as both classes have the required attributes (engine_args, namespace, component, endpoint, model). Using Any disables type checking. Define the union type using TYPE_CHECKING to avoid import overhead:

♻️ Suggested fix
 from typing import TYPE_CHECKING, Any, Optional
 
+if TYPE_CHECKING:
+    from .omni.args import OmniConfig
 
 def setup_metrics_collection(
-    config: Any, generate_endpoint: Endpoint, logger: logging.Logger
+    config: Config | OmniConfig, generate_endpoint: Endpoint, logger: logging.Logger
 ) -> None:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/src/dynamo/vllm/main.py` around lines 188 - 190, The config
parameter in setup_metrics_collection should be typed as a union of
vllm.args.Config and vllm.omni.args.OmniConfig instead of Any to restore type
safety; wrap imports in a TYPE_CHECKING block (from typing import TYPE_CHECKING)
to avoid runtime import overhead, then annotate the function as def
setup_metrics_collection(config: "Config | OmniConfig", generate_endpoint:
Endpoint, logger: logging.Logger) -> None (or use Config | OmniConfig directly
if using from __future__ import annotations), ensuring you reference the Config
and OmniConfig symbols from vllm.args and vllm.omni.args respectively so static
type checkers can validate access to engine_args, namespace, component,
endpoint, and model.
container/compliance/resolve_base_image.py (1)

87-93: Code would silently coerce multi-platform strings; however, no current callers pass comma-delimited values to this action.

The split("/")[-1] normalization is problematic for inputs like linux/amd64,linux/arm64 (would coerce to arm64), but examining all workflows shows that compliance-scan's arch input only receives single values: either explicit linux/amd64/linux/arm64 or matrix-expanded ${{ matrix.arch }} (which expands to individual values). Comma-delimited platforms appear elsewhere (e.g., passed to init-dynamo-builder), but not directly to compliance-scan.

That said, the code lacks the validation layer that other actions use (e.g., render.py for docker-remote-build). To be safe, consider explicitly rejecting invalid formats (e.g., reject , or enforce exact matches: {amd64, arm64, linux/amd64, linux/arm64}).
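
A minimal sketch of the stricter validation this comment suggests — the function name and error messages are illustrative, not the script's actual code:

```python
import sys

ALLOWED_ARCHES = {"amd64", "arm64", "linux/amd64", "linux/arm64"}

def normalize_arch(raw: str) -> str:
    """Validate the arch input against an exact allow-list instead of
    silently coercing arbitrary strings with split("/")[-1]."""
    if "," in raw:
        # Reject multi-platform values like "linux/amd64,linux/arm64"
        print(f"error: multi-platform value not supported: {raw!r}", file=sys.stderr)
        sys.exit(1)
    if raw not in ALLOWED_ARCHES:
        print(f"error: unknown arch {raw!r}; expected one of {sorted(ALLOWED_ARCHES)}", file=sys.stderr)
        sys.exit(1)
    # Safe now: the value contains at most one "/"
    return raw.split("/")[-1]
```

With this, `normalize_arch("linux/arm64")` returns `"arm64"`, while a comma-delimited platform list exits with an error instead of being coerced to the last entry.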

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@container/compliance/resolve_base_image.py` around lines 87 - 93, The current
normalization using arch = args.arch.split("/")[-1] silently coerces
multi-platform strings; replace that behavior by validating args.arch exactly
against the allowed set {"amd64", "arm64", "linux/amd64", "linux/arm64"} and
explicitly reject any input containing commas or other unexpected formats;
update the validation block that currently uses args.arch and arch to (1) check
for a comma and exit with an error if present, and (2) check args.arch (not the
split result) is one of the four exact allowed values, logging a clear error
message and sys.exit(1) when it is not.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/src/dynamo/common/multimodal/embedding_transfer.py`:
- Line 16: The module imports msgspec (see the import msgspec in
embedding_transfer.py) but the vLLM runtime build installs the Dynamo wheels
with --no-deps so msgspec may not be present; fix by ensuring msgspec is
installed into the runtime image before wheel installation — either install
requirements.common.txt (which pins msgspec) or explicitly pip install msgspec
prior to installing the Dynamo wheels (or add msgspec as a dependency in the
wheel build), so any imports of msgspec in functions/classes under
multimodal/embedding_transfer.py succeed.

In `@container/render.py`:
- Around line 99-103: The CLI currently allows --device choices like cpu/xpu in
parse_args while validate_args enforces vllm only supports "cuda", causing late
failure; update the contract so failures surface earlier by either restricting
parse_args choices when framework is vllm or making the validation
framework-specific and clearer: modify parse_args to conditionally limit
--device choices to ["cuda"] when the selected framework/back-end is "vllm" (or,
if you prefer, change validate_args to raise a framework-specific, descriptive
error mentioning "vllm" and "--device" so callers fail fast), and reference the
validate_args and parse_args functions plus the "--device" flag and "vllm"
backend when implementing the change.

In `@pyproject.toml`:
- Around line 53-54: The comment above the dependency is incorrect — the extra
"vllm" does include vllm-omni==0.16.0, so either remove or correct the comment
in pyproject.toml (referencing the "vllm-omni==0.16.0" line and the "vllm"
extra) to state the accurate behavior, or if the note pertains only to the
container/runtime image, move that note out of pyproject.toml into the container
docs; update the comment text accordingly to avoid the contradiction.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8891f7b0-f61a-4038-874c-efa3164a1090

📥 Commits

Reviewing files that changed from the base of the PR and between a58bcc3 and bc844f4.

📒 Files selected for processing (26)
  • .github/actions/compliance-scan/action.yml
  • components/src/dynamo/common/multimodal/embedding_transfer.py
  • components/src/dynamo/trtllm/publisher.py
  • components/src/dynamo/vllm/handlers.py
  • components/src/dynamo/vllm/main.py
  • container/Dockerfile.template
  • container/README.md
  • container/compliance/README.md
  • container/compliance/resolve_base_image.py
  • container/context.yaml
  • container/deps/README.md
  • container/deps/requirements.common.txt
  • container/deps/requirements.vllm.txt
  • container/deps/vllm/install_vllm.sh
  • container/render.py
  • container/templates/args.Dockerfile
  • container/templates/vllm_framework.Dockerfile
  • container/templates/vllm_runtime.Dockerfile
  • container/templates/wheel_builder.Dockerfile
  • deploy/sanity_check.py
  • examples/backends/vllm/launch/disagg_multimodal_e_pd.sh
  • examples/backends/vllm/launch/disagg_multimodal_epd.sh
  • lib/gpu_memory_service/setup.py
  • pyproject.toml
  • tests/fault_tolerance/deploy/container/Dockerfile.local_vllm
  • tests/fault_tolerance/deploy/container/build_from_local_vllm.sh
💤 Files with no reviewable changes (8)
  • container/deps/requirements.vllm.txt
  • container/deps/README.md
  • lib/gpu_memory_service/setup.py
  • tests/fault_tolerance/deploy/container/build_from_local_vllm.sh
  • tests/fault_tolerance/deploy/container/Dockerfile.local_vllm
  • container/templates/vllm_framework.Dockerfile
  • container/deps/vllm/install_vllm.sh
  • container/deps/requirements.common.txt

@alec-flowers alec-flowers force-pushed the feat/DYN-2204-vllm-upstream-base-pattern-refresh branch from 2863195 to 18bc7b0 Compare April 21, 2026 22:50
@alec-flowers alec-flowers force-pushed the feat/DYN-2204-vllm-upstream-base-pattern-refresh branch from 18bc7b0 to 032bc69 Compare April 21, 2026 23:28
@alec-flowers alec-flowers force-pushed the feat/DYN-2204-vllm-upstream-base-pattern-refresh branch from 032bc69 to c97b10a Compare April 21, 2026 23:34
@alec-flowers alec-flowers force-pushed the feat/DYN-2204-vllm-upstream-base-pattern-refresh branch from c97b10a to 2a6cdbe Compare April 21, 2026 23:43
@alec-flowers alec-flowers force-pushed the feat/DYN-2204-vllm-upstream-base-pattern-refresh branch from 2a6cdbe to 4e26964 Compare April 22, 2026 00:16
@alec-flowers alec-flowers force-pushed the feat/DYN-2204-vllm-upstream-base-pattern-refresh branch from 4e26964 to 0522768 Compare April 22, 2026 00:26

Labels

actions, backend::trtllm (Relates to the trtllm backend), backend::vllm (Relates to the vllm backend), ci (Issues/PRs that reference CI build/test), container, documentation (Improvements or additions to documentation), multimodal, planner, size/XXL, xpu

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant