fix(trtllm): propagate max_seq_len from extra-engine-args to MDC registration by AsadShahid04 · Pull Request #7653 · ai-dynamo/dynamo

AsadShahid04 · 2026-03-26T04:41:26Z

When --extra-engine-args overrides max_seq_len (and other engine config fields), propagate these values back to config so they are correctly reflected in the ModelDeploymentCard (MDC) registration.

This ensures:

context_length in MDC matches the engine's actual max_seq_len limit
Prompt length validation correctly rejects over-length requests (HTTP 400)
Instead of allowing them through and crashing in the engine (HTTP 500)

Also propagates max_batch_size and max_num_tokens for consistency.

Fixes #6987

Summary by CodeRabbit

Bug Fixes
- Fixed configuration override handling to ensure engine limits are correctly propagated when custom configuration options are provided.

…stration When --extra-engine-args overrides max_seq_len (and other engine config fields), propagate these values back to config so they are correctly reflected in the ModelDeploymentCard (MDC) registration. This ensures: - context_length in MDC matches the engine's actual max_seq_len limit - Prompt length validation correctly rejects over-length requests (HTTP 400) - Instead of allowing them through and crashing in the engine (HTTP 500) Also propagates max_batch_size and max_num_tokens for consistency. Fixes ai-dynamo#6987 Signed-off-by: Asad Shahid <asad.shahid@berkeley.edu>

copy-pr-bot · 2026-03-26T04:41:30Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2026-03-26T04:41:34Z

👋 Hi AsadShahid04! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors.Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

coderabbitai · 2026-03-26T04:46:41Z

Walkthrough

The code now parses override values from config.extra_engine_args via update_llm_args_with_extra_options and propagates specific fields (max_seq_len, max_batch_size, max_num_tokens) back to the config object with informational logging. This ensures the config values used for ModelDeploymentCard registration reflect the effective engine limits.

Changes

Cohort / File(s)	Summary
Engine Args Propagation `components/src/dynamo/trtllm/workers/llm_worker.py`	Added logic to parse and propagate override values from extra engine arguments back to `config` fields (`max_seq_len`, `max_batch_size`, `max_num_tokens`) with informational logging, occurring before other config overrides are applied.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly identifies the specific fix: propagating max_seq_len from extra-engine-args to MDC registration, accurately reflecting the main change in the PR.
Description check	✅ Passed	The description covers the problem, solution, and impact. It includes the necessary sections and context for understanding the changes, though it could be slightly more detailed about implementation specifics.
Linked Issues check	✅ Passed	The PR directly addresses issue `#6987` by implementing the suggested fix: propagating max_seq_len (and max_batch_size, max_num_tokens) from extra-engine-args back to config for MDC registration.
Out of Scope Changes check	✅ Passed	All changes are scoped to the identified problem: propagating engine config overrides to the config object before MDC registration, with no extraneous modifications.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

components/src/dynamo/trtllm/workers/llm_worker.py (1)

221-230: ⚠️ Potential issue | 🟠 Major

Consider propagating values from override_engine_args as well.

The PR correctly propagates max_seq_len, max_batch_size, and max_num_tokens from extra_engine_args back to config. However, override_engine_args (applied at lines 222-230) can also modify these same values via deep_update, but no propagation occurs afterward.

If a user sets max_seq_len via --override-engine-args instead of --extra-engine-args, the MDC would still register the wrong context length, causing the same 500-instead-of-400 bug described in issue #6987.

Proposed fix to also propagate after override_engine_args

             deep_update(arg_map, overrides)
+
+            # Propagate any overridden values back to config for MDC registration
+            if "max_seq_len" in overrides:
+                config.max_seq_len = arg_map["max_seq_len"]
+                logging.info(
+                    f"Updated config.max_seq_len from override-engine-args: {config.max_seq_len}"
+                )
+            if "max_batch_size" in overrides:
+                config.max_batch_size = arg_map["max_batch_size"]
+                logging.info(
+                    f"Updated config.max_batch_size from override-engine-args: {config.max_batch_size}"
+                )
+            if "max_num_tokens" in overrides:
+                config.max_num_tokens = arg_map["max_num_tokens"]
+                logging.info(
+                    f"Updated config.max_num_tokens from override-engine-args: {config.max_num_tokens}"
+                )
         except json.JSONDecodeError as e:

Based on learnings: the --override-engine-args flag uses deep_update to merge overrides into the engine arg map, so users may reasonably expect to set these limits via either flag.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@components/src/dynamo/trtllm/workers/llm_worker.py` around lines 221 - 230,
The override_engine_args block applies deep_update to arg_map but doesn't
propagate updated limits back to the config, so values set via
--override-engine-args (e.g., max_seq_len, max_batch_size, max_num_tokens) are
ignored; after successfully loading and applying overrides in the
override_engine_args branch (where config.override_engine_args, json.loads, and
deep_update(arg_map, overrides) are used), reuse the same propagation logic that
was applied after extra_engine_args to copy relevant values from arg_map into
config (or call the helper that updates config from arg_map), ensuring
max_seq_len, max_batch_size, max_num_tokens (and any other limit fields) are
written back to config so MDC sees the overridden values.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/src/dynamo/trtllm/workers/llm_worker.py`:
- Line 202: Remove the trailing whitespace on the blank line in llm_worker.py
that is causing black to fail; open the module (llm_worker.py) and delete the
extra spaces on the empty line around the LLMWorker module/class area so the
file contains a truly blank line, then re-run black or your formatter to confirm
the whitespace is gone.

---

Outside diff comments:
In `@components/src/dynamo/trtllm/workers/llm_worker.py`:
- Around line 221-230: The override_engine_args block applies deep_update to
arg_map but doesn't propagate updated limits back to the config, so values set
via --override-engine-args (e.g., max_seq_len, max_batch_size, max_num_tokens)
are ignored; after successfully loading and applying overrides in the
override_engine_args branch (where config.override_engine_args, json.loads, and
deep_update(arg_map, overrides) are used), reuse the same propagation logic that
was applied after extra_engine_args to copy relevant values from arg_map into
config (or call the helper that updates config from arg_map), ensuring
max_seq_len, max_batch_size, max_num_tokens (and any other limit fields) are
written back to config so MDC sees the overridden values.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 55b844de-b8be-4e6f-8f65-f703e3df10c5

📥 Commits

Reviewing files that changed from the base of the PR and between a58bcc3 and 5bb3bd4.

📒 Files selected for processing (1)

components/src/dynamo/trtllm/workers/llm_worker.py

Remove trailing whitespace from blank line inside the extra_engine_args block that caused black formatter to modify the file. Signed-off-by: Asad Shahid <asad.shahid@berkeley.edu>

…registration Also propagate max_seq_len, max_batch_size, and max_num_tokens from override_engine_args back to config after deep_update, matching the existing propagation done for extra_engine_args. This ensures MDC sees the correct values regardless of which flag the user sets. Signed-off-by: Asad Shahid <asad.shahid@berkeley.edu>

AsadShahid04 · 2026-03-27T01:11:05Z

Addressed the CodeRabbit suggestion regarding override_engine_args: added the same propagation logic (for max_seq_len, max_batch_size, max_num_tokens) after the deep_update call in the override_engine_args block. This ensures MDC registration sees the correct values whether the user sets limits via --extra-engine-args or --override-engine-args. Fix in commit 6b040ba.

AsadShahid04 requested a review from a team as a code owner March 26, 2026 04:41

AsadShahid04 requested a review from a team March 26, 2026 04:41

pull-request-size Bot added the size/S label Mar 26, 2026

github-actions Bot added external-contribution Pull request is from an external contributor fix backend::trtllm Relates to the trtllm backend labels Mar 26, 2026

coderabbitai Bot reviewed Mar 26, 2026

View reviewed changes

Comment thread components/src/dynamo/trtllm/workers/llm_worker.py Outdated

AsadShahid04 added 2 commits March 27, 2026 00:54

style: fix trailing whitespace to pass black pre-commit hook

bfc7cc8

Remove trailing whitespace from blank line inside the extra_engine_args block that caused black formatter to modify the file. Signed-off-by: Asad Shahid <asad.shahid@berkeley.edu>

pull-request-size Bot added size/M and removed size/S labels Mar 27, 2026

Merge branch 'main' into fix/inf-issue-6987

89a6165

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(trtllm): propagate max_seq_len from extra-engine-args to MDC registration#7653

fix(trtllm): propagate max_seq_len from extra-engine-args to MDC registration#7653
AsadShahid04 wants to merge 4 commits intoai-dynamo:mainfrom
AsadShahid04:fix/inf-issue-6987

AsadShahid04 commented Mar 26, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

copy-pr-bot Bot commented Mar 26, 2026

Uh oh!

github-actions Bot commented Mar 26, 2026

Uh oh!

coderabbitai Bot commented Mar 26, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

AsadShahid04 commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

AsadShahid04 commented Mar 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

copy-pr-bot Bot commented Mar 26, 2026

Uh oh!

github-actions Bot commented Mar 26, 2026

Uh oh!

coderabbitai Bot commented Mar 26, 2026

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

AsadShahid04 commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

AsadShahid04 commented Mar 26, 2026 •

edited by coderabbitai Bot

Loading