Consume the typed RendererConfig surface by hallerite · Pull Request #2635 · PrimeIntellect-ai/prime-rl

hallerite · 2026-05-25T22:04:25Z

Summary

Switches the orchestrator and SFT trainer to consume the typed
renderers.RendererConfig discriminated union (rendered in
PrimeIntellect-ai/renderers#60) and the matching typed
vf.ClientConfig.renderer_config field
(PrimeIntellect-ai/verifiers#1467).

configs.shared.RendererConfig composes renderers.RendererConfig as a single typed settings field. The flat name / tool_parser / reasoning_parser / preserve_* fields are gone — pydantic dispatches on the settings.name discriminator.
setup_inference_pool / setup_clients / StaticInferencePool / ElasticInferencePool collapse a six-arg renderer surface to one renderer_config arg.
Orchestrator and SFT trainer call sites become create_renderer(tokenizer, config.renderer.settings).
Cross-env / unmapped-model validators now inspect renderer.settings.

TOML before / after

Before:

[orchestrator.renderer]
name = "qwen3.5"
tool_parser = "qwen3.5"
preserve_all_thinking = true

After:

[orchestrator.renderer.settings]
name = "qwen3.5"
enable_thinking = false
add_vision_id = true
preserve_all_thinking = true

Bogus combinations (e.g. add_vision_id under name = "qwen3") raise pydantic.ValidationError at config-load time instead of --dry-run.

Stacking

Pins deps/renderers submodule and the prime-rl-configs renderers dep to a git SHA on the renderers PR's head while it's in review. When that lands and a new dev release ships, this PR bumps to the PyPI release.

Depends on: Typed renderer configs renderers#60
Depends on: Consume the typed RendererConfig surface verifiers#1467
Supersedes: Use sampling chat template kwargs for renderer RL #2605 (which wired chat_template_kwargs via sampling.extra_body)

Validation

Static-only at the moment — the renderers / verifiers branches haven't shipped to PyPI, so the prime-rl venv can't be re-synced to run pytest against the new typed surface yet.

uv run ruff check and uv run ruff format — clean (pre-commit applied one auto-format)
Updated unit tests construct typed configs; will run on CI after the renderers + verifiers deps resolve

Test plan

CI runs once both upstream PRs land
Manual: round-trip a TOML config through OrchestratorConfig and confirm config.renderer.settings deserializes to the expected variant
Manual: trigger the unmapped-model validator and confirm the new error message references renderer.settings.name

Note

Medium Risk
Changes how rollout tokenization is selected (TITO vs MITO) and config validation for VLMs and auto-renderer resolution; misconfigured TOML could fail at load time or pick the wrong client until upstream renderers/verifiers deps align.

Overview
Replaces the orchestrator.use_renderer boolean with orchestrator.renderer: a typed renderers.RendererConfig discriminated union (default AutoRendererConfig) when using the renderer/TITO path, or None for MITO. The duplicate flat RendererConfig in configs.shared is removed in favor of the renderers package; create_renderer and verifiers clients now take a single renderer_config instead of scattered **name / parsers / preserve_*` fields.

orchestrator.pool_size moves to the orchestrator root (no longer under [orchestrator.renderer]). Validators and docs are updated accordingly (VLM requires a non-None renderer, SFT forces renderer=None, pool_size only when a renderer is set). A custom serializer writes renderer = "None" so MITO configs round-trip through saved orch.toml. Sample configs, multimodal docs, prime-rl-configs renderers pin, and unit tests follow the new shape.

^{Reviewed by Cursor Bugbot for commit 310a2fa. Bugbot is set up for automated code reviews on this repo. Configure here.}

Reshapes the orchestrator / SFT renderer plumbing around ``renderers.RendererConfig`` (the typed discriminated union) instead of flat strings + booleans. - ``configs.shared.RendererConfig`` now composes ``renderers.RendererConfig`` as a single ``settings`` field; ``pool_size`` stays as a separate prime-rl-side concern. The flat ``name`` / ``tool_parser`` / ``reasoning_parser`` / ``preserve_*`` fields are gone. - ``setup_inference_pool`` / ``setup_clients`` / ``StaticInferencePool`` / ``ElasticInferencePool`` collapse their six-arg renderer surface to a single ``renderer_config: RendererConfig | None`` plus the orthogonal ``renderer_model_name`` / ``renderer_pool_size`` fields. - Orchestrator + SFT trainer call sites pass the typed config straight through to ``create_renderer(tokenizer, config.renderer.settings)``. - Cross-env / unmapped-model validators rewritten to inspect ``renderer.settings`` (discriminated-union shape) instead of the flat field names. - Tests updated to construct typed configs (``Qwen3VLRendererConfig()``, etc.) and assert against the new call shapes. Pins ``deps/renderers`` and the ``prime-rl-configs`` ``renderers`` dep to the renderers PR's tip (``4c9099debbc9bc4fcc3eb6a118bbfa5c364ec395``) until it merges and a dev release ships. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The typed-config refactor landed on renderers main as ``c86c50b`` (tag ``renderers-v0.1.8.dev27``). Switching: - ``deps/renderers`` submodule pointer → ``c86c50b`` - ``prime-rl-configs`` git URL pin → ``@renderers-v0.1.8.dev27`` Same commit, more readable, and stable to anyone reading the diff after the feature branch is deleted. PyPI dev release for the same version is pending a publisher-config fix; once it lands the git URL pin can collapse to a plain ``renderers>=0.1.8.dev27`` PyPI pin.

The new RendererConfig accepts only `settings` (typed `renderers.RendererConfig` discriminated union) and `pool_size`, so flat `name`/`reasoning_parser`/`preserve_*` keys under `[orchestrator.renderer]` now fail pydantic validation. Move those keys under `[orchestrator.renderer.settings]`. `pool_size` stays in the outer block. Two configs that only set `preserve_all_thinking` gain an explicit `name = "auto"` since the discriminator is required. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per Mika's review: drop prime-rl's RendererConfig wrapper that nested the renderers' discriminated union under a ``.settings`` field. The wrapper bought a parallel ``pool_size`` attribute but cost a clunky two-level TOML layout and made every call site dereference ``.settings``. Use ``renderers.RendererConfig`` (the typed discriminated union) as the field type directly. Pool size moves up one level to ``orchestrator.renderer_pool_size``. ``use_renderer: bool`` is gone — ``renderer: RendererConfig | None`` with ``None`` for MITO replaces it (consistent with how SFT, optimizer, and other optional sub-configs express absence). TOML callers spell MITO with ``renderer = "None"``, which ``prime_pydantic_config.BaseConfig._none_str_to_none`` coerces back to Python None. Net effect on TOMLs is a wash with the previous commit: 26 in-repo configs lose the ``.settings`` indirection they gained yesterday, and one (multimodal) splits ``pool_size`` out of the renderer block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The slim ``prime-rl-configs`` wheel (installed standalone, outside the workspace) now resolves ``renderers`` from PyPI instead of the git tag URL it used while the dev release was waiting on PyPI trusted-publisher config. The full ``prime-rl`` workspace still pulls renderers from the ``deps/renderers`` submodule, so this only affects downstream slim consumers. Re-locks to refresh the lockfile against the same workspace state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The slim ``prime-rl-configs`` install path (configs only, no GPU deps) asserts that importing the config classes doesn't pull in ``transformers``. ``from renderers import RendererConfig`` at the top of ``orchestrator.py`` / ``sft.py`` broke that: ``renderers/__init__`` eagerly imports every concrete renderer module (DefaultRenderer, Qwen3Renderer, …), each of which imports ``transformers.tokenization_utils.PreTrainedTokenizer``. Defer the renderers import via: - ``TYPE_CHECKING``-guarded ``from renderers import RendererConfig`` so static type checkers still see the typed discriminated union. - ``renderer: Any`` field annotation at runtime, so no resolution happens at class definition. - ``model_validator(mode="before") _resolve_renderer`` that lazy-imports ``renderers``, fills the orchestrator default (``AutoRendererConfig()``), and coerces dict input through the typed union via ``TypeAdapter``. SFT's default stays ``None``. End-user semantics unchanged; class import is now transformers-free. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dev28 lazy-loads concrete renderer classes via PEP 562 ``__getattr__`` (renderers#64), so ``from renderers import RendererConfig`` no longer drags ``transformers`` into ``sys.modules``. The slim-install CI gate is satisfied without the prior workaround. Drop the ``Any`` annotation and the ``_resolve_renderer`` model-validator that lazy-resolved the discriminated union from ``OrchestratorConfig`` and ``SFTConfig``. The fields are typed ``RendererConfig | None`` again, and pydantic does the full discriminator validation at class definition. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- deps/renderers c86c50b -> 2ec28a8 — picks up the PEP 562 ``__getattr__`` so the workspace install path also lazy-loads concrete renderer classes. - deps/verifiers 04651be -> d107442 — adds the typed ``ClientConfig.renderer_config`` field that ``setup_clients`` forwards. Fixes the unit-test ``test_setup_clients_assigns_renderer_and_dp_rank_headers`` AttributeError on ``renderer_config``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two cleanups on the typed renderer surface: 1. Per Samsja: drop the ``RendererConfig as RendererSettings`` alias in ``client.py`` / ``elastic.py``. The alias was a workaround for the former local ``RendererConfig`` wrapper class; with that gone, we just import ``RendererConfig`` from renderers directly. 2. Per Cursor bugbot: ``OrchestratorConfig.renderer = None`` (MITO) was silently dropped by ``model_dump(exclude_none=True)``, so any saved ``control/orch.toml`` would reload as the default ``AutoRendererConfig()`` (TITO). Add a wrapping ``model_serializer`` that re-injects ``renderer = "None"`` (string) when the field is ``None``, matching what ``BaseConfig._none_str_to_none`` accepts on reload. MITO and TITO now round-trip through TOML. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 2072afe. Configure here.}

cursor · 2026-05-26T00:20:26Z


-    use_renderer: bool = True
-    """Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``)."""
-


Missing breaking config changelog

Medium Severity

This PR removes orchestrator.use_renderer, moves renderer pool sizing to orchestrator.renderer_pool_size, and replaces the flat [orchestrator.renderer] schema with the typed renderers.RendererConfig union (MITO via renderer = "None"). Those are breaking config changes, but CHANGELOG.md was not updated with a migration entry.

^{Triggered by project rule: BugBot Instructions}

^{Reviewed by Cursor Bugbot for commit 2072afe. Configure here.}

mikasenghaas · 2026-05-26T00:22:35Z

-    renderer: RendererConfig = RendererConfig()
-    """Client-side renderer configuration. Only consumed when ``use_renderer=true``."""
+    renderer: RendererConfig | None = Field(default_factory=AutoRendererConfig)
+    """Typed renderer config (``renderers.RendererConfig`` discriminated


can we make this docstring a bit more concise

mikasenghaas · 2026-05-26T00:23:26Z

+    @model_serializer(mode="wrap")
+    def _preserve_mito_renderer(self, handler: SerializerFunctionWrapHandler) -> dict[str, Any]:
+        """Emit ``renderer = "None"`` (string) when MITO so
+        ``model_dump(exclude_none=True)`` round-trips: dumped TOML has
+        ``renderer = "None"``, and on reload
+        ``BaseConfig._none_str_to_none`` coerces it back to ``None``.
+        Without this, a MITO orchestrator config saved to
+        ``control/orch.toml`` would lose the renderer key entirely and
+        reload as the default ``AutoRendererConfig()`` (TITO)."""
+        result = handler(self)
+        if self.renderer is None:
+            result["renderer"] = "None"
+        return result


hmm this is code smell from the pydantic-config side. can merge for now but can you put an issue to fix this one

…cstring - renderer field uses direct AutoRendererConfig() default (frozen, no factory needed) - renderer_pool_size -> pool_size on OrchestratorConfig; cascade through client.py, elastic.py, orchestrator.py call sites and the test; vf.ClientConfig kwarg name (verifiers-side) stays renderer_pool_size - shorter docstring on the renderer field Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

hallerite and others added 2 commits May 25, 2026 22:03

cursor Bot reviewed May 25, 2026

View reviewed changes

Comment thread packages/prime-rl-configs/src/prime_rl/configs/shared.py

mikasenghaas reviewed May 25, 2026

View reviewed changes

Comment thread packages/prime-rl-configs/src/prime_rl/configs/orchestrator.py Outdated

Comment thread packages/prime-rl-configs/src/prime_rl/configs/shared.py Outdated

cursor Bot reviewed May 25, 2026

View reviewed changes

Comment thread tests/unit/train/test_runs.py

hallerite and others added 6 commits May 25, 2026 23:21

Merge branch 'main' into feat/typed-renderer-config

351e1f4

Merge branch 'main' into feat/typed-renderer-config

74f1376

samsja reviewed May 26, 2026

View reviewed changes

Comment thread src/prime_rl/utils/client.py Outdated

cursor Bot reviewed May 26, 2026

View reviewed changes

Comment thread src/prime_rl/orchestrator/orchestrator.py

cursor Bot reviewed May 26, 2026

View reviewed changes

mikasenghaas reviewed May 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consume the typed RendererConfig surface#2635

Consume the typed RendererConfig surface#2635
hallerite wants to merge 12 commits into
mainfrom
feat/typed-renderer-config

hallerite commented May 25, 2026 •

edited by cursor Bot

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot May 26, 2026

Uh oh!

Uh oh!

Uh oh!

mikasenghaas May 26, 2026

Uh oh!

mikasenghaas May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants


		use_renderer: bool = True
		"""Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``)."""

Conversation

hallerite commented May 25, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

TOML before / after

Stacking

Validation

Test plan

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 26, 2026

Choose a reason for hiding this comment

Missing breaking config changelog

Uh oh!

Uh oh!

Uh oh!

mikasenghaas May 26, 2026

Choose a reason for hiding this comment

Uh oh!

mikasenghaas May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

hallerite commented May 25, 2026 •

edited by cursor Bot

Loading