feat: add Llama-3 renderer for Llama-3.2-1B/3B-Instruct by hallerite · Pull Request #9 · PrimeIntellect-ai/renderers

hallerite · 2026-05-07T17:39:20Z

Summary

Hand-coded Llama3Renderer for Meta's Llama-3.x chat template, plus matching parse_llama_3 parser. Initial scope: Llama-3.2-1B-Instruct and Llama-3.2-3B-Instruct (auto-routed via MODEL_RENDERER_MAP). No version bump.

How tests work without a Meta-license HF token

MODEL_RENDERER_MAP registers the canonical meta-llama/... paths so production callers auto-route. Tests load the tokenizer via the unrestricted unsloth/Llama-3.2-{1B,3B}-Instruct mirror — the chat-template SHA matches Meta's bit-for-bit and the underlying tiktoken-BPE files are identical. CI doesn't need an HF_TOKEN with Meta license access.

Implementation notes

No <think> / reasoning channel — Llama-3 doesn't ship one. preserve_*_thinking constructor flags raise NotImplementedError if set (matches DefaultRenderer's contract for the same case).
<|begin_of_text|> (BOS) is emitted at the start of every render; system block is always emitted with the fixed Cutting Knowledge Date / Today Date preamble even when no system message is supplied.
date_string is a constructor kwarg, defaulting to "26 Jul 2024" (the chat template's strftime fallback) so output stays deterministic. Override per-instance for production runs that want today's date.
tools_in_user_message defaults to True (matches chat template). Tools + JSON signatures inject into the first user message; pass False to flip to system-block mode. Both modes parity-tested.
Single tool call per assistant message (chat template raises otherwise). Tool calls render as a JSON blob {"name": "...", "parameters": ...} inside the assistant body. Tool responses render under role ipython regardless of source role; mirrors the chat template's content | tojson branch — including the Jinja quirk that strings are iterable, so plain-string tool content gets JSON-quoted.
parse_llama_3 detects the JSON tool-call body shape with a strict starts-with-{ + parses-as-dict-with-name check; malformed JSON falls through to content rather than dropping silently.
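The no-tools path described in these notes can be sketched at the text level. This is an illustrative approximation, not the PR's code: the function name `render_llama3_text` is hypothetical, the real renderer emits token IDs rather than a string, and the "December 2023" cutoff line plus the `<|start_header_id|>`, `<|end_header_id|>`, and `<|eot_id|>` markers are assumed from the public Llama-3.2 chat template rather than taken from this PR. `json.dumps` stands in for Jinja's `tojson`, whose escaping may differ. Tool injection and assistant tool-call rendering are left out.

```python
import json

BOS = "<|begin_of_text|>"
START_HEADER = "<|start_header_id|>"  # assumed marker, not quoted in the PR
END_HEADER = "<|end_header_id|>"      # assumed marker, not quoted in the PR
EOT = "<|eot_id|>"                    # assumed marker, not quoted in the PR


def render_llama3_text(messages, date_string="26 Jul 2024", add_generation_prompt=False):
    """Hypothetical text-level sketch of the no-tools rendering path."""
    messages = list(messages)
    system_text = ""
    if messages and messages[0]["role"] == "system":
        system_text = messages.pop(0)["content"].strip()

    # BOS and the system block are emitted even when no system message exists.
    out = [BOS, f"{START_HEADER}system{END_HEADER}\n\n"]
    out.append("Cutting Knowledge Date: December 2023\n")  # assumed template constant
    out.append(f"Today Date: {date_string}\n\n")
    out.append(system_text + EOT)

    for msg in messages:
        role, content = msg["role"], msg["content"]
        if role in ("tool", "ipython"):
            # Tool responses render under ipython and are JSON-encoded,
            # so a plain string comes out wrapped in quotes.
            role, content = "ipython", json.dumps(content)
        else:
            content = content.strip()
        out.append(f"{START_HEADER}{role}{END_HEADER}\n\n{content}{EOT}")

    if add_generation_prompt:
        out.append(f"{START_HEADER}assistant{END_HEADER}\n\n")
    return "".join(out)
```

Pinning `date_string` keeps repeated renders identical, which is what makes a byte comparison against `apply_chat_template` meaningful.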

Tests

47 dedicated tests in tests/test_llama_3.py:

MODEL_RENDERER_MAP shape + factory routing
Constructor contract (default date, preserve_*_thinking rejection, tools_in_user_message toggle)
Byte parity vs apply_chat_template across 11 conversation shapes (system + user, user-only, multi-turn, gen prompt, whitespace trimming, custom date, tools-in-user, tools-in-system, tool call round-trip, dict tool response, multiple-tool-calls rejection)
parse_response (plain, tool call, malformed JSON fallthrough)
Bridge contract (extends prev verbatim, matches fresh render, rejects assistant in extension, synthesises close on truncation)

Test plan

pytest tests/test_llama_3.py — 47 cases pass on both 1B and 3B mirrors
Full suite (pytest tests/ --ignore=tests/test_client.py) — 947 pass, 48 skipped, 1 xfailed (no regressions)
Pre-commit hooks (ruff check + format) clean
Maintainer with Meta-license HF_TOKEN can verify meta-llama/Llama-3.2-1B-Instruct parity directly (the unsloth mirror has been bit-verified, but a once-off canonical run is good defense in depth)

🤖 Generated with Claude Code

Note

Medium Risk
Adds new model-specific rendering/parsing and auto-routing for meta-llama/Llama-3.2-* which can change prompt/token generation and tool-call handling for those models. Risk is mitigated by extensive parity tests but any template mismatch would affect downstream training/inference correctness.

Overview
Adds a new hand-coded Llama3Renderer implementing Meta Llama-3.2 Instruct chat-template rendering (including deterministic date_string, tool injection mode, tool-call/response formatting, stop tokens, and a bridge_to_next_turn fast-path).

Wires the renderer into exports and auto-detection by extending MODEL_RENDERER_MAP/RENDERER_REGISTRY with a new llama-3 entry for meta-llama/Llama-3.2-1B/3B-Instruct, and adds parse_llama_3 to interpret Llama-3 JSON tool-call completions.
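The routing described above might look roughly like the following. This is a sketch under assumptions: the names `MODEL_RENDERER_MAP_SKETCH`, `REGISTRY_KEYS_SKETCH`, and `resolve_renderer_key` are hypothetical, the map is assumed to hold registry keys as values, and raising on an unknown model is a simplification (the library may fall back to another renderer instead).

```python
# Hypothetical stand-ins; the real MODEL_RENDERER_MAP / RENDERER_REGISTRY may differ in shape.
MODEL_RENDERER_MAP_SKETCH = {
    "meta-llama/Llama-3.2-1B-Instruct": "llama-3",
    "meta-llama/Llama-3.2-3B-Instruct": "llama-3",
}
REGISTRY_KEYS_SKETCH = {"llama-3"}


def resolve_renderer_key(model_name: str, renderer: str = "auto") -> str:
    """Return the registry key to use for a model, or raise if none applies."""
    if renderer == "auto":
        # Auto-detection: look the canonical model path up in the map.
        if model_name not in MODEL_RENDERER_MAP_SKETCH:
            raise ValueError(f"no renderer mapped for {model_name!r}")
        return MODEL_RENDERER_MAP_SKETCH[model_name]
    # Explicit selection: the key must exist in the registry.
    if renderer not in REGISTRY_KEYS_SKETCH:
        raise ValueError(f"unknown renderer {renderer!r}")
    return renderer
```

Under this sketch an unmapped path such as an unsloth mirror does not auto-route, so a caller would pass the renderer key explicitly.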

Introduces a dedicated tests/test_llama_3.py suite that byte-compares renderer output against tokenizer.apply_chat_template (using unsloth/... mirrors) and covers tools, parsing, and bridge behavior.

^{Reviewed by Cursor Bugbot for commit c5d2aa5. Bugbot is set up for automated code reviews on this repo.}

Note

Add Llama3Renderer for Llama-3.2-1B/3B-Instruct chat template rendering

Adds renderers/llama_3.py with Llama3Renderer, a deterministic renderer for Llama-3.x Instruct models implementing token/ID rendering, tool-call emission, ipython tool responses, response parsing, stop tokens, and turn bridging.
Maps meta-llama/Llama-3.2-1B-Instruct and meta-llama/Llama-3.2-3B-Instruct in MODEL_RENDERER_MAP so create_renderer(..., renderer='auto') routes to the new renderer; renderer='llama-3' also resolves via the registry.
Tool calls are emitted as a single JSON object {"name": ..., "parameters": ...}; multiple tool calls per assistant message raise an error. parse_llama_3 in renderers/parsing.py parses this back into ParsedResponse.tool_calls, falling back to raw content for non-JSON output.
preserve_*_thinking flags raise NotImplementedError; the default date_string is hardcoded to '26 Jul 2024'.
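The parsing behavior summarized above (JSON object with a name becomes a tool call, anything else stays as content) can be sketched as follows. The names `ParsedSketch` and `parse_llama3_sketch` are hypothetical stand-ins for `ParsedResponse` and `parse_llama_3`, and the shape of each tool-call entry (`name` / `arguments`) is an assumption, not the library's actual schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ParsedSketch:
    """Hypothetical stand-in for ParsedResponse."""
    content: str = ""
    tool_calls: list = field(default_factory=list)


def parse_llama3_sketch(text: str) -> ParsedSketch:
    """Treat the body as a tool call only if it is a JSON object with a string name."""
    body = text.strip()
    if body.startswith("{"):
        try:
            obj = json.loads(body)
        except json.JSONDecodeError:
            obj = None  # malformed JSON: fall through to plain content
        if isinstance(obj, dict) and isinstance(obj.get("name"), str):
            call = {"name": obj["name"], "arguments": obj.get("parameters", {})}
            return ParsedSketch(tool_calls=[call])
    # Not a tool call: keep the original text instead of dropping it.
    return ParsedSketch(content=text)
```

Falling through to content means a truncated or malformed completion still reaches the caller for inspection.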

^{Macroscope summarized c5d2aa5.}

Hand-coded Llama3Renderer mirroring Meta's Llama-3.x chat template. Initial scope: Llama-3.2-1B-Instruct and Llama-3.2-3B-Instruct (and the unrestricted unsloth/... mirrors with byte-identical chat templates).

MODEL_RENDERER_MAP routes the canonical meta-llama paths; tests load via the unsloth mirrors so CI doesn't need an HF_TOKEN with Meta license access.

Implementation notes:

* No <think> / reasoning channel: preserve_*_thinking constructor flags raise NotImplementedError if set (matches DefaultRenderer's contract for the same case).
* <|begin_of_text|> (BOS) is emitted at the start of every render. The system block is emitted UNCONDITIONALLY with a fixed "Cutting Knowledge Date / Today Date" preamble even when no system message is supplied. date_string is a constructor kwarg pinned at "26 Jul 2024" by default (matches the chat template's strftime fallback); override per instance for production runs that want today's date.
* tools_in_user_message defaults to True. Tools + JSON signatures inject into the first user message; pass False at construction to flip to system-block mode. Both modes parity-tested.
* Single tool call per assistant message (chat template raises otherwise). Tool calls render as a JSON blob inside the assistant body. Tool responses render under role ipython regardless of source role; mirrors the chat template's content|tojson branch including the Jinja quirk that strings are iterable so plain-string tool content gets JSON-quoted.
* parse_llama_3 detects the JSON tool-call body shape with a strict check; malformed JSON falls through to content.

47 dedicated tests covering map shape, constructor contract, byte parity across 11 conversation shapes (including tool calls, multi-turn, custom date, tools-in-system mode), parse_response, and bridge contract. Full suite: 947 passed, 48 skipped, 1 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Resolve conflicts in renderers/__init__.py and renderers/base.py:

- Add LagunaXS2Renderer (origin/main) alongside Llama3Renderer (PR).
- Rename Llama-3 registry key from "llama_3" to "llama-3" to match origin/main's hyphenated convention (also applied to deepseek-v3, kimi-k2, kimi-k2.5, nemotron-3, gpt-oss). Update the matching MODEL_RENDERER_MAP entries and tests/test_llama_3.py assertions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-05-20T13:36:53Z

+        emit_special(self._end_header, -1)
+        emit_text("\n\n", -1)
+
+        return previous_ids + ext


Bridge returns list not RenderedTokens

High Severity

Llama3Renderer.bridge_to_next_turn returns a bare list[int], while the Renderer protocol and every other hand-coded renderer return RenderedTokens | None with tokens in .token_ids. Callers such as RendererPool and tests/test_bridge.py use bridged.token_ids, which raises AttributeError on a list.
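A minimal illustration of the mismatch the review describes, and of one possible fix. `RenderedTokensSketch` is a hypothetical stand-in: the real `RenderedTokens` may carry additional fields and a different constructor, so the wrapping call here is an assumption, not the library's API.

```python
from dataclasses import dataclass


@dataclass
class RenderedTokensSketch:
    """Hypothetical stand-in for RenderedTokens; only token_ids is modeled."""
    token_ids: list


def bridge_returning_list(previous_ids, ext):
    # Shape flagged by the review: a bare list has no .token_ids attribute.
    return previous_ids + ext


def bridge_returning_wrapper(previous_ids, ext):
    # Possible fix: wrap the extended sequence so callers can read .token_ids.
    return RenderedTokensSketch(token_ids=previous_ids + ext)
```

A caller that reads `bridged.token_ids` works with the second function and raises AttributeError with the first.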

macroscopeapp · 2026-05-20T13:43:58Z

Approvability

Verdict: Needs human review

This PR introduces a new Llama-3 renderer (~400 lines of new code), which constitutes a new feature/capability requiring human review. Additionally, an unresolved high-severity review comment identifies a protocol violation in bridge_to_next_turn that would cause runtime errors.

hallerite and others added 2 commits May 7, 2026 17:38

hallerite marked this pull request as draft May 20, 2026 13:33

hallerite marked this pull request as ready for review May 20, 2026 13:33

cursor Bot reviewed May 20, 2026

hallerite wants to merge 2 commits into main from feat/llama-3-renderer
