feat(client): RENDERERS_MAX_PROMPT_LEN env override for pre-flight cap by snimu · Pull Request #62 · PrimeIntellect-ai/renderers

snimu · 2026-05-25T09:58:34Z

Adds an env-var escape hatch for the pre-flight overflow check in _resolve_max_prompt_len. When RENDERERS_MAX_PROMPT_LEN is set to a positive integer, that value is returned directly and /v1/models is not queried.

Motivation: routers/gateways whose /v1/models handler is broken (observed with vllm-router v0.1.22 under --intra-node-data-parallel-size > 1) silently disable the pre-flight via the cached-None path, which lets overlong prompts reach the engine and crash the orchestrator with a raw ValueError. Operators who know the real cap can now set the env var to restore pre-flight without touching the broken endpoint.

Invalid values (non-integer, <= 0) are logged and ignored, falling back to the existing auto-discovery path.

Note

Low Risk
Optional env override on client-side pre-flight only; invalid values are ignored with fallback to existing discovery.

Overview
Adds RENDERERS_MAX_PROMPT_LEN so operators can pin the client pre-flight context cap without calling GET /v1/models. When set to a positive integer, _resolve_max_prompt_len returns that value immediately and skips engine model-card discovery—useful when /v1/models is broken but the real max_model_len is known.

Invalid values (non-integer or ≤ 0) are logged and ignored; behavior falls back to the existing auto-discovery and cache path. Tests cover override winning over the model card, skipping /v1/models, and invalid env falling back to discovery.

^{Reviewed by Cursor Bugbot for commit 5f653df.}

Note

Add `RENDERERS_MAX_PROMPT_LEN` env var override for pre-flight prompt length cap

Adds _max_prompt_len_from_env in renderers/client.py that reads RENDERERS_MAX_PROMPT_LEN and parses it as a positive integer, logging a warning and returning None for invalid values.
Updates _resolve_max_prompt_len to return the env var value immediately when set, skipping the /v1/models HTTP request and the in-memory cache lookup.
When the env var is unset or invalid, behavior is unchanged: check the cache, then query /v1/models.

^{Macroscope summarized 5f653df.}

Adds an env-var escape hatch for the pre-flight overflow check in `_resolve_max_prompt_len`. When `RENDERERS_MAX_PROMPT_LEN` is set to a positive integer, that value is returned directly and `/v1/models` is not queried.

Motivation: routers/gateways whose `/v1/models` handler is broken (observed with vllm-router v0.1.22 under `--intra-node-data-parallel-size` > 1) silently disable the pre-flight via the cached-`None` path, which lets overlong prompts reach the engine and crash the orchestrator with a raw `ValueError`. Operators who know the real cap can now set the env var to restore pre-flight without touching the broken endpoint.

Invalid values (non-integer, <= 0) are logged and ignored, falling back to the existing auto-discovery path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-05-25T10:03:23Z

    """
+    override = _max_prompt_len_from_env()
+    if override is not None:
+        return override


Invalid env var logs warning on every call

Medium Severity

_max_prompt_len_from_env() is called on every invocation of _resolve_max_prompt_len, which runs on every generate() call. When the env var is set to an invalid value (e.g. a typo like "4096O"), the warning is logged on every single call — potentially thousands of times per second — even though the auto-discovery fallback result is properly cached. The auto-discovery path deliberately caches failures to avoid "retry on every call," but the env var parse has no such caching, creating unbounded log spam for a simple operator typo.

Additional Locations (1)

renderers/client.py#L86-L98
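
One possible way to address the log spam described in this comment is to remember the last invalid string that was warned about and stay silent on repeats. This is a sketch under assumptions, not a fix proposed in the PR: the function name, the module-level `_last_warned_raw` state, and the logger name are all hypothetical.

```python
import logging
import os
from typing import Optional

logger = logging.getLogger("renderers.client")  # logger name is an assumption

_last_warned_raw: Optional[str] = None  # hypothetical module-level state


def max_prompt_len_from_env_warn_once() -> Optional[int]:
    """Parse the override; warn once per distinct invalid value."""
    global _last_warned_raw
    raw = os.environ.get("RENDERERS_MAX_PROMPT_LEN")
    if raw is None:
        return None
    try:
        value = int(raw)
    except ValueError:
        value = 0  # fold "not an integer" into the invalid branch below
    if value > 0:
        return value
    # Invalid: log only the first time this exact string is seen, so a typo
    # costs one warning instead of one per generate() call.
    if raw != _last_warned_raw:
        logger.warning(
            "Ignoring RENDERERS_MAX_PROMPT_LEN=%r: expected a positive integer",
            raw,
        )
        _last_warned_raw = raw
    return None
```

Keying the suppression on the raw string rather than a simple boolean means a later change to a different invalid value still produces a fresh warning, while a valid value is re-read on every call.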

macroscopeapp · 2026-05-25T10:05:09Z

Approvability

Verdict: Approved

Small, self-contained feature adding an optional env var override for max prompt length. Existing behavior unchanged when env var is unset. The unresolved comment about log spam on invalid env values is a minor polish issue, not a blocking concern.

cursor Bot reviewed May 25, 2026

macroscopeapp Bot approved these changes May 25, 2026

snimu closed this May 26, 2026

snimu wants to merge 1 commit into main from sebastian/preflight-max-prompt-len-env-override
