Skip to content

Consume the typed RendererConfig surface#2635

Open
hallerite wants to merge 12 commits into
mainfrom
feat/typed-renderer-config
Open

Consume the typed RendererConfig surface#2635
hallerite wants to merge 12 commits into
mainfrom
feat/typed-renderer-config

Conversation

@hallerite
Copy link
Copy Markdown
Member

@hallerite hallerite commented May 25, 2026

Summary

Switches the orchestrator and SFT trainer to consume the typed
renderers.RendererConfig discriminated union (rendered in
PrimeIntellect-ai/renderers#60) and the matching typed
vf.ClientConfig.renderer_config field
(PrimeIntellect-ai/verifiers#1467).

  • configs.shared.RendererConfig composes renderers.RendererConfig as a single typed settings field. The flat name / tool_parser / reasoning_parser / preserve_* fields are gone — pydantic dispatches on the settings.name discriminator.
  • setup_inference_pool / setup_clients / StaticInferencePool / ElasticInferencePool collapse a six-arg renderer surface to one renderer_config arg.
  • Orchestrator and SFT trainer call sites become create_renderer(tokenizer, config.renderer.settings).
  • Cross-env / unmapped-model validators now inspect renderer.settings.

TOML before / after

Before:

[orchestrator.renderer]
name = "qwen3.5"
tool_parser = "qwen3.5"
preserve_all_thinking = true

After:

[orchestrator.renderer.settings]
name = "qwen3.5"
enable_thinking = false
add_vision_id = true
preserve_all_thinking = true

Bogus combinations (e.g. add_vision_id under name = "qwen3") raise pydantic.ValidationError at config-load time instead of --dry-run.

Stacking

Pins deps/renderers submodule and the prime-rl-configs renderers dep to a git SHA on the renderers PR's head while it's in review. When that lands and a new dev release ships, this PR bumps to the PyPI release.

Validation

Static-only at the moment — the renderers / verifiers branches haven't shipped to PyPI, so the prime-rl venv can't be re-synced to run pytest against the new typed surface yet.

  • uv run ruff check and uv run ruff format — clean (pre-commit applied one auto-format)
  • Updated unit tests construct typed configs; will run on CI after the renderers + verifiers deps resolve

Test plan

  • CI runs once both upstream PRs land
  • Manual: round-trip a TOML config through OrchestratorConfig and confirm config.renderer.settings deserializes to the expected variant
  • Manual: trigger the unmapped-model validator and confirm the new error message references renderer.settings.name

Note

Medium Risk
Changes how rollout tokenization is selected (TITO vs MITO) and config validation for VLMs and auto-renderer resolution; misconfigured TOML could fail at load time or pick the wrong client until upstream renderers/verifiers deps align.

Overview
Replaces the orchestrator.use_renderer boolean with orchestrator.renderer: a typed renderers.RendererConfig discriminated union (default AutoRendererConfig) when using the renderer/TITO path, or None for MITO. The duplicate flat RendererConfig in configs.shared is removed in favor of the renderers package; create_renderer and verifiers clients now take a single renderer_config instead of scattered **name / parsers / preserve_*` fields.

orchestrator.pool_size moves to the orchestrator root (no longer under [orchestrator.renderer]). Validators and docs are updated accordingly (VLM requires a non-None renderer, SFT forces renderer=None, pool_size only when a renderer is set). A custom serializer writes renderer = "None" so MITO configs round-trip through saved orch.toml. Sample configs, multimodal docs, prime-rl-configs renderers pin, and unit tests follow the new shape.

Reviewed by Cursor Bugbot for commit 310a2fa. Bugbot is set up for automated code reviews on this repo. Configure here.

hallerite and others added 2 commits May 25, 2026 22:03
Reshapes the orchestrator / SFT renderer plumbing around
``renderers.RendererConfig`` (the typed discriminated union) instead of
flat strings + booleans.

- ``configs.shared.RendererConfig`` now composes ``renderers.RendererConfig``
  as a single ``settings`` field; ``pool_size`` stays as a separate
  prime-rl-side concern. The flat ``name`` / ``tool_parser`` /
  ``reasoning_parser`` / ``preserve_*`` fields are gone.
- ``setup_inference_pool`` / ``setup_clients`` / ``StaticInferencePool``
  / ``ElasticInferencePool`` collapse their six-arg renderer surface to
  a single ``renderer_config: RendererConfig | None`` plus the orthogonal
  ``renderer_model_name`` / ``renderer_pool_size`` fields.
- Orchestrator + SFT trainer call sites pass the typed config straight
  through to ``create_renderer(tokenizer, config.renderer.settings)``.
- Cross-env / unmapped-model validators rewritten to inspect
  ``renderer.settings`` (discriminated-union shape) instead of the flat
  field names.
- Tests updated to construct typed configs (``Qwen3VLRendererConfig()``,
  etc.) and assert against the new call shapes.

Pins ``deps/renderers`` and the ``prime-rl-configs`` ``renderers`` dep to
the renderers PR's tip
(``4c9099debbc9bc4fcc3eb6a118bbfa5c364ec395``) until it merges and a dev
release ships.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The typed-config refactor landed on renderers main as ``c86c50b``
(tag ``renderers-v0.1.8.dev27``). Switching:

- ``deps/renderers`` submodule pointer → ``c86c50b``
- ``prime-rl-configs`` git URL pin → ``@renderers-v0.1.8.dev27``

Same commit, more readable, and stable to anyone reading the diff
after the feature branch is deleted. PyPI dev release for the same
version is pending a publisher-config fix; once it lands the git URL
pin can collapse to a plain ``renderers>=0.1.8.dev27`` PyPI pin.
Comment thread packages/prime-rl-configs/src/prime_rl/configs/shared.py
The new RendererConfig accepts only `settings` (typed
`renderers.RendererConfig` discriminated union) and `pool_size`, so
flat `name`/`reasoning_parser`/`preserve_*` keys under
`[orchestrator.renderer]` now fail pydantic validation. Move those
keys under `[orchestrator.renderer.settings]`. `pool_size` stays in
the outer block. Two configs that only set `preserve_all_thinking`
gain an explicit `name = "auto"` since the discriminator is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/prime-rl-configs/src/prime_rl/configs/orchestrator.py Outdated
Comment thread packages/prime-rl-configs/src/prime_rl/configs/shared.py Outdated
Per Mika's review: drop prime-rl's RendererConfig wrapper that nested
the renderers' discriminated union under a ``.settings`` field. The
wrapper bought a parallel ``pool_size`` attribute but cost a clunky
two-level TOML layout and made every call site dereference ``.settings``.

Use ``renderers.RendererConfig`` (the typed discriminated union) as
the field type directly. Pool size moves up one level to
``orchestrator.renderer_pool_size``. ``use_renderer: bool`` is gone —
``renderer: RendererConfig | None`` with ``None`` for MITO replaces it
(consistent with how SFT, optimizer, and other optional sub-configs
express absence). TOML callers spell MITO with ``renderer = "None"``,
which ``prime_pydantic_config.BaseConfig._none_str_to_none`` coerces
back to Python None.

Net effect on TOMLs is a wash with the previous commit: 26 in-repo
configs lose the ``.settings`` indirection they gained yesterday, and
one (multimodal) splits ``pool_size`` out of the renderer block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread tests/unit/train/test_runs.py
hallerite and others added 6 commits May 25, 2026 23:21
The slim ``prime-rl-configs`` wheel (installed standalone, outside the
workspace) now resolves ``renderers`` from PyPI instead of the git tag
URL it used while the dev release was waiting on PyPI trusted-publisher
config. The full ``prime-rl`` workspace still pulls renderers from the
``deps/renderers`` submodule, so this only affects downstream slim
consumers.

Re-locks to refresh the lockfile against the same workspace state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The slim ``prime-rl-configs`` install path (configs only, no GPU
deps) asserts that importing the config classes doesn't pull in
``transformers``. ``from renderers import RendererConfig`` at the top
of ``orchestrator.py`` / ``sft.py`` broke that: ``renderers/__init__``
eagerly imports every concrete renderer module (DefaultRenderer,
Qwen3Renderer, …), each of which imports
``transformers.tokenization_utils.PreTrainedTokenizer``.

Defer the renderers import via:
  - ``TYPE_CHECKING``-guarded ``from renderers import RendererConfig``
    so static type checkers still see the typed discriminated union.
  - ``renderer: Any`` field annotation at runtime, so no resolution
    happens at class definition.
  - ``model_validator(mode="before") _resolve_renderer`` that
    lazy-imports ``renderers``, fills the orchestrator default
    (``AutoRendererConfig()``), and coerces dict input through the
    typed union via ``TypeAdapter``. SFT's default stays ``None``.

End-user semantics unchanged; class import is now transformers-free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dev28 lazy-loads concrete renderer classes via PEP 562
``__getattr__`` (renderers#64), so ``from renderers import
RendererConfig`` no longer drags ``transformers`` into ``sys.modules``.
The slim-install CI gate is satisfied without the prior workaround.

Drop the ``Any`` annotation and the ``_resolve_renderer``
model-validator that lazy-resolved the discriminated union from
``OrchestratorConfig`` and ``SFTConfig``. The fields are typed
``RendererConfig | None`` again, and pydantic does the full
discriminator validation at class definition.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- deps/renderers c86c50b -> 2ec28a8 — picks up the PEP 562
  ``__getattr__`` so the workspace install path also lazy-loads
  concrete renderer classes.
- deps/verifiers 04651be -> d107442 — adds the typed
  ``ClientConfig.renderer_config`` field that ``setup_clients``
  forwards. Fixes the unit-test
  ``test_setup_clients_assigns_renderer_and_dp_rank_headers``
  AttributeError on ``renderer_config``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/prime_rl/utils/client.py Outdated
Comment thread src/prime_rl/orchestrator/orchestrator.py
Two cleanups on the typed renderer surface:

1. Per Samsja: drop the ``RendererConfig as RendererSettings`` alias
   in ``client.py`` / ``elastic.py``. The alias was a workaround for the
   former local ``RendererConfig`` wrapper class; with that gone, we
   just import ``RendererConfig`` from renderers directly.

2. Per Cursor bugbot: ``OrchestratorConfig.renderer = None`` (MITO) was
   silently dropped by ``model_dump(exclude_none=True)``, so any saved
   ``control/orch.toml`` would reload as the default
   ``AutoRendererConfig()`` (TITO). Add a wrapping ``model_serializer``
   that re-injects ``renderer = "None"`` (string) when the field is
   ``None``, matching what ``BaseConfig._none_str_to_none`` accepts on
   reload. MITO and TITO now round-trip through TOML.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 2072afe. Configure here.


use_renderer: bool = True
"""Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``)."""

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing breaking config changelog

Medium Severity

This PR removes orchestrator.use_renderer, moves renderer pool sizing to orchestrator.renderer_pool_size, and replaces the flat [orchestrator.renderer] schema with the typed renderers.RendererConfig union (MITO via renderer = "None"). Those are breaking config changes, but CHANGELOG.md was not updated with a migration entry.

Fix in Cursor Fix in Web

Triggered by project rule: BugBot Instructions

Reviewed by Cursor Bugbot for commit 2072afe. Configure here.

Comment thread packages/prime-rl-configs/src/prime_rl/configs/orchestrator.py Outdated
Comment thread packages/prime-rl-configs/src/prime_rl/configs/orchestrator.py Outdated
renderer: RendererConfig = RendererConfig()
"""Client-side renderer configuration. Only consumed when ``use_renderer=true``."""
renderer: RendererConfig | None = Field(default_factory=AutoRendererConfig)
"""Typed renderer config (``renderers.RendererConfig`` discriminated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this docstring a bit more concise

Comment on lines +587 to +599
@model_serializer(mode="wrap")
def _preserve_mito_renderer(self, handler: SerializerFunctionWrapHandler) -> dict[str, Any]:
"""Emit ``renderer = "None"`` (string) when MITO so
``model_dump(exclude_none=True)`` round-trips: dumped TOML has
``renderer = "None"``, and on reload
``BaseConfig._none_str_to_none`` coerces it back to ``None``.
Without this, a MITO orchestrator config saved to
``control/orch.toml`` would lose the renderer key entirely and
reload as the default ``AutoRendererConfig()`` (TITO)."""
result = handler(self)
if self.renderer is None:
result["renderer"] = "None"
return result
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm this is code smell from the pydantic-config side. can merge for now but can you put an issue to fix this one

…cstring

- renderer field uses direct AutoRendererConfig() default (frozen, no
  factory needed)
- renderer_pool_size -> pool_size on OrchestratorConfig; cascade through
  client.py, elastic.py, orchestrator.py call sites and the test;
  vf.ClientConfig kwarg name (verifiers-side) stays renderer_pool_size
- shorter docstring on the renderer field

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants