feat: add Llama-3 renderer for Llama-3.2-1B/3B-Instruct #9
Hand-coded Llama3Renderer mirroring Meta's Llama-3.x chat template. Initial scope: Llama-3.2-1B-Instruct and Llama-3.2-3B-Instruct (and the unrestricted unsloth/... mirrors with byte-identical chat templates). MODEL_RENDERER_MAP routes the canonical meta-llama paths; tests load via the unsloth mirrors so CI doesn't need an HF_TOKEN with Meta license access.

Implementation notes:

* No <think> / reasoning channel: preserve_*_thinking constructor flags raise NotImplementedError if set (matches DefaultRenderer's contract for the same case).
* <|begin_of_text|> (BOS) is emitted at the start of every render. The system block is emitted UNCONDITIONALLY with a fixed "Cutting Knowledge Date / Today Date" preamble even when no system message is supplied. date_string is a constructor kwarg pinned at "26 Jul 2024" by default (matches the chat template's strftime fallback); override per instance for production runs that want today's date.
* tools_in_user_message defaults to True. Tools + JSON signatures inject into the first user message; pass False at construction to flip to system-block mode. Both modes parity-tested.
* Single tool call per assistant message (chat template raises otherwise). Tool calls render as a JSON blob inside the assistant body. Tool responses render under role ipython regardless of source role; mirrors the chat template's content|tojson branch, including the Jinja quirk that strings are iterable so plain-string tool content gets JSON-quoted.
* parse_llama_3 detects the JSON tool-call body shape with a strict check; malformed JSON falls through to content.

47 dedicated tests covering map shape, constructor contract, byte parity across 11 conversation shapes (including tool calls, multi-turn, custom date, tools-in-system mode), parse_response, and bridge contract.

Full suite: 947 passed, 48 skipped, 1 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
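The unconditional system block described in the commit message above can be sketched at the string level. This is an illustration only: the PR's renderer emits token IDs, and the function name `render_prompt_text`, the message-dict shape, and the exact preamble wording (including the "December 2023" cutoff line) are assumptions based on the publicly distributed Llama-3.2 chat template, not code from this PR. Tools are left out entirely.

```python
BOS = "<|begin_of_text|>"
START_HEADER = "<|start_header_id|>"
END_HEADER = "<|end_header_id|>"
EOT = "<|eot_id|>"


def render_prompt_text(messages, date_string="26 Jul 2024", add_generation_prompt=True):
    """Hypothetical string-level sketch of the prompt layout (no tool support)."""
    messages = list(messages)

    # Pull out an optional leading system message; it may be absent.
    system_text = ""
    if messages and messages[0]["role"] == "system":
        system_text = messages.pop(0)["content"].strip()

    # BOS and the system block are emitted even when system_text is empty.
    parts = [BOS, f"{START_HEADER}system{END_HEADER}\n\n"]
    parts.append("Cutting Knowledge Date: December 2023\n")  # assumed wording
    parts.append(f"Today Date: {date_string}\n\n")
    parts.append(system_text + EOT)

    # Remaining turns: header, trimmed content, end-of-turn marker.
    for message in messages:
        parts.append(f"{START_HEADER}{message['role']}{END_HEADER}\n\n")
        parts.append(message["content"].strip() + EOT)

    # Open an assistant turn for generation.
    if add_generation_prompt:
        parts.append(f"{START_HEADER}assistant{END_HEADER}\n\n")
    return "".join(parts)
```

Pinning `date_string` keeps the output reproducible, which is what makes byte-parity tests against a reference template possible; a production caller would pass the current date instead.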
Resolve conflicts in renderers/__init__.py and renderers/base.py:

- Add LagunaXS2Renderer (origin/main) alongside Llama3Renderer (PR).
- Rename Llama-3 registry key from "llama_3" to "llama-3" to match origin/main's hyphenated convention (also applied to deepseek-v3, kimi-k2, kimi-k2.5, nemotron-3, gpt-oss). Update the matching MODEL_RENDERER_MAP entries and tests/test_llama_3.py assertions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit c5d2aa5. Configure here.
    emit_special(self._end_header, -1)
    emit_text("\n\n", -1)

    return previous_ids + ext
Bridge returns list not RenderedTokens
High Severity
Llama3Renderer.bridge_to_next_turn returns a bare list[int], while the Renderer protocol and every other hand-coded renderer return RenderedTokens | None with tokens in .token_ids. Callers such as RendererPool and tests/test_bridge.py use bridged.token_ids, which raises AttributeError on a list.
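A minimal sketch of the mismatch the review describes, and of one possible fix. The `RenderedTokens` class below is a hypothetical stand-in with only the `.token_ids` field named in the review (the real class may carry more), and `bridge_buggy` / `bridge_fixed` are illustrative names rather than functions from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenderedTokens:
    """Hypothetical stand-in; only .token_ids is taken from the review text."""
    token_ids: list[int]


def bridge_buggy(previous_ids: list[int], ext: list[int]) -> list[int]:
    # Shape reported by the review: a bare list comes back,
    # so callers reading .token_ids hit AttributeError.
    return previous_ids + ext


def bridge_fixed(previous_ids: list[int], ext: list[int]) -> Optional[RenderedTokens]:
    # Wrap the extended sequence so callers can read .token_ids.
    return RenderedTokens(token_ids=previous_ids + ext)
```

With the wrapped return value, a caller written against the protocol (`bridged.token_ids`) behaves the same for this renderer as for the others.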
Approvability
Verdict: Needs human review
This PR introduces a new Llama-3 renderer (~400 lines of new code), which constitutes a new feature/capability requiring human review. Additionally, an unresolved high-severity review comment identifies a protocol violation.


Summary
Hand-coded `Llama3Renderer` for Meta's Llama-3.x chat template, plus matching `parse_llama_3` parser. Initial scope: Llama-3.2-1B-Instruct and Llama-3.2-3B-Instruct (auto-routed via `MODEL_RENDERER_MAP`). No version bump.
How tests work without a Meta-license HF token
`MODEL_RENDERER_MAP` registers the canonical `meta-llama/...` paths so production callers auto-route. Tests load the tokenizer via the unrestricted `unsloth/Llama-3.2-{1B,3B}-Instruct` mirror: the chat-template SHA matches Meta's bit-for-bit and the underlying tiktoken-BPE files are identical. CI doesn't need an HF_TOKEN with Meta license access.
Implementation notes
- No `<think>` / reasoning channel: Llama-3 doesn't ship one. `preserve_*_thinking` constructor flags raise `NotImplementedError` if set (matches `DefaultRenderer`'s contract for the same case).
- `<|begin_of_text|>` (BOS) is emitted at the start of every render; the system block is always emitted with the fixed `Cutting Knowledge Date / Today Date` preamble even when no system message is supplied.
- `date_string` is a constructor kwarg, defaulting to `"26 Jul 2024"` (the chat template's `strftime` fallback) so output stays deterministic. Override per-instance for production runs that want today's date.
- `tools_in_user_message` defaults to `True` (matches chat template). Tools + JSON signatures inject into the first user message; pass `False` to flip to system-block mode. Both modes parity-tested.
- Single tool call per assistant message (the chat template raises otherwise). Tool calls render as a JSON blob `{"name": "...", "parameters": ...}` inside the assistant body. Tool responses render under role `ipython` regardless of source role; this mirrors the chat template's `content | tojson` branch, including the Jinja quirk that strings are iterable, so plain-string tool content gets JSON-quoted.
- `parse_llama_3` detects the JSON tool-call body shape with a strict starts-with-`{` + parses-as-dict-with-`name` check; malformed JSON falls through to `content` rather than dropping silently.
Tests
47 dedicated tests in `tests/test_llama_3.py`:
- `MODEL_RENDERER_MAP` shape + factory routing
- Constructor contract (`preserve_*_thinking` rejection, `tools_in_user_message` toggle)
- Byte parity against `apply_chat_template` across 11 conversation shapes (system + user, user-only, multi-turn, gen prompt, whitespace trimming, custom date, tools-in-user, tools-in-system, tool call round-trip, dict tool response, multiple-tool-calls rejection)
- `parse_response` (plain, tool call, malformed JSON fallthrough)
Test plan
- `pytest tests/test_llama_3.py`: 47 cases pass on both 1B and 3B mirrors
- Full suite (`pytest tests/ --ignore=tests/test_client.py`): 947 pass, 48 skipped, 1 xfailed (no regressions)
- Verify `meta-llama/Llama-3.2-1B-Instruct` parity directly (the unsloth mirror has been bit-verified, but a once-off canonical run is good defense in depth)
🤖 Generated with Claude Code
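The strict tool-call check listed in the implementation notes (body starts with `{`, parses as a dict, has a `name` key, otherwise fall through to content) might look roughly like the sketch below. The function name `parse_tool_call_or_content` and the `(content, tool_call)` return shape are assumptions made for illustration; the PR's `parse_llama_3` returns its own response type.

```python
import json


def parse_tool_call_or_content(text):
    """Hypothetical sketch of the strict check; returns (content, tool_call)."""
    body = text.strip()
    if body.startswith("{"):
        try:
            obj = json.loads(body)
        except json.JSONDecodeError:
            obj = None  # malformed JSON: treated as plain content below
        if isinstance(obj, dict) and "name" in obj:
            call = {"name": obj["name"], "arguments": obj.get("parameters", {})}
            return None, call
    # Plain text, malformed JSON, or JSON without "name" is kept as content.
    return text, None
```

Returning the original text on failure, rather than an empty result, keeps a half-formed completion visible to the caller instead of dropping it.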
Note
Medium Risk
Adds new model-specific rendering/parsing and auto-routing for `meta-llama/Llama-3.2-*`, which can change prompt/token generation and tool-call handling for those models. Risk is mitigated by extensive parity tests, but any template mismatch would affect downstream training/inference correctness.
Overview
Adds a new hand-coded `Llama3Renderer` implementing Meta Llama-3.2 Instruct chat-template rendering (including deterministic `date_string`, tool injection mode, tool-call/response formatting, stop tokens, and a `bridge_to_next_turn` fast-path).
Wires the renderer into exports and auto-detection by extending `MODEL_RENDERER_MAP`/`RENDERER_REGISTRY` with a new `llama-3` entry for `meta-llama/Llama-3.2-1B/3B-Instruct`, and adds `parse_llama_3` to interpret Llama-3 JSON tool-call completions.
Introduces a dedicated `tests/test_llama_3.py` suite that byte-compares renderer output against `tokenizer.apply_chat_template` (using `unsloth/...` mirrors) and covers tools, parsing, and bridge behavior.
Reviewed by Cursor Bugbot for commit c5d2aa5.
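The tool-response formatting mentioned above hinges on a template quirk: the chat template JSON-encodes content that is a mapping or any iterable, and strings count as iterable. A rough Python rendering of that branch follows; `render_tool_response_body` is a hypothetical name, and `json.dumps` stands in for the template's `tojson` filter, whose escaping details may differ.

```python
import json
from collections.abc import Iterable, Mapping


def render_tool_response_body(content):
    """Hypothetical sketch of the ipython-role body formatting."""
    # Mirrors "is mapping or is iterable": str is iterable, so a plain
    # string is JSON-quoted rather than emitted verbatim.
    if isinstance(content, (Mapping, Iterable)):
        return json.dumps(content, ensure_ascii=False)
    # Non-iterable scalars (e.g. numbers) are emitted as-is.
    return str(content)
```

For example, the string `22C` comes out wrapped in double quotes, while a dict comes out as a JSON object.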
Note
Add Llama3Renderer for Llama-3.2-1B/3B-Instruct chat template rendering
- Adds `Llama3Renderer`, a deterministic renderer for Llama-3.x Instruct models implementing token/ID rendering, tool-call emission, ipython tool responses, response parsing, stop tokens, and turn bridging.
- Registers `meta-llama/Llama-3.2-1B-Instruct` and `meta-llama/Llama-3.2-3B-Instruct` in `MODEL_RENDERER_MAP` so `create_renderer(..., renderer='auto')` routes to the new renderer; `renderer='llama-3'` also resolves via the registry.
- Tool calls are serialized as a single JSON object `{"name": ..., "parameters": ...}`; multiple tool calls per assistant message raise an error. `parse_llama_3` in renderers/parsing.py parses this back into `ParsedResponse.tool_calls`, falling back to raw content for non-JSON output.
- `preserve_*_thinking` flags raise `NotImplementedError`; the default `date_string` is hardcoded to `'26 Jul 2024'`.
Macroscope summarized c5d2aa5.
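The single-tool-call rule in the summary above can be illustrated with a short sketch. The function name `render_tool_call_body`, the input dict shape (`name` / `arguments`), and the choice of `ValueError` are assumptions for illustration; the PR only states that multiple tool calls raise an error and that the body is a JSON object with `name` and `parameters`.

```python
import json


def render_tool_call_body(tool_calls):
    """Hypothetical sketch: serialize exactly one tool call as a JSON object."""
    if len(tool_calls) != 1:
        # The summary says multiple calls per assistant message raise an error;
        # the exception type here is an assumption.
        raise ValueError("expected exactly one tool call per assistant message")
    call = tool_calls[0]
    # Arguments are emitted under the "parameters" key in the rendered body.
    return json.dumps({"name": call["name"], "parameters": call["arguments"]})
```

This is the inverse of the parsing direction: a body produced this way starts with `{` and carries a `name` key, so it satisfies the strict check that `parse_llama_3` is described as applying.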