fix(parsing): accept bare-string args for union schemas with string #63
Merged
For schemas like ``anyOf: [{"type": "string"}, {"type": "boolean"}]``
the top-level ``type`` key is absent, so ``_coerce_arg_value`` fell
through the pure-string short-circuit and tried ``json.loads`` on the
bare text. The text-fallback path was correct (right value), but the
function flagged ``used_json_fallback=True`` — causing
``parse_qwen35`` to stamp ``ToolCallParseStatus.INVALID_JSON``.
Downstream, ``verifiers/clients/renderer_client.py`` silently dropped
all non-OK tool calls, so any tool with a ``str | bool`` (or similar)
parameter where the model emitted a string was effectively erased —
the env saw no ``tool_calls`` and ended the rollout via the
``no_tools_called`` stop predicate.
Detect ``anyOf``/``oneOf`` branches containing ``"string"`` and treat
the json-failure → text fallback as expected (not flagged) under those
schemas. Matches the docstring's stated intent — "the string branch
wins as fallback" — which the original implementation didn't enforce.
Tests: extend ``test_tool_arg_type_preservation.py`` with a
parametrized case set for union-with-string schemas, asserting both
``status == OK`` and value preservation across the existing XML-style
parsers.
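The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: ``has_string_branch`` and ``coerce_arg_value`` are hypothetical stand-ins, and the ``(value, used_json_fallback)`` return shape is assumed from the PR description.

```python
import json
from typing import Any


def has_string_branch(param_schema: dict) -> bool:
    """True if a top-level anyOf/oneOf lists a {"type": "string"} branch."""
    branches = param_schema.get("anyOf") or param_schema.get("oneOf") or []
    return any(isinstance(b, dict) and b.get("type") == "string" for b in branches)


def coerce_arg_value(text: str, param_schema: dict) -> tuple[Any, bool]:
    """Return (value, used_json_fallback) for one raw argument string."""
    declared = param_schema.get("type")
    # Pure-string short-circuit: never attempt JSON decoding.
    if declared == "string" or declared == ["string"]:
        return text, False
    try:
        return json.loads(text), False
    except json.JSONDecodeError:
        # Falling back to the raw text is expected when the union allows a
        # string, so the flag is raised only when no string branch exists.
        return text, not has_string_branch(param_schema)


schema = {"anyOf": [{"type": "string"}, {"type": "boolean"}]}
print(coerce_arg_value("LIS", schema))   # ('LIS', False)
print(coerce_arg_value("true", schema))  # (True, False)
```

With a schema that has no string branch, such as ``{"type": "integer"}``, the same bare text still comes back flagged, so genuinely malformed arguments keep their ``INVALID_JSON`` signal.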
hallerite
approved these changes
May 26, 2026
Approvability verdict: Approved. This is a straightforward bug fix that corrects false INVALID_JSON flags when parsing tool arguments with union schemas (anyOf/oneOf) containing string types. The change is small, well-documented, and includes comprehensive tests validating the fix.
hallerite
added a commit
that referenced
this pull request
May 26, 2026
Folds in #63's bare-string-for-union-schemas fix and refactors _coerce_arg_value to use vLLM's current shared helpers (extract_types_from_schema + coerce_to_schema_type from vllm/tool_parsers/utils.py, landed in vLLM #43025 on 2026-05-19), replacing this branch's earlier single-type ladder.
Why: #52 was originally written against pre-refactor vLLM and would have regressed #63 (anyOf schemas routed to "object" → json.loads on bare strings flagged INVALID_JSON). vLLM's new shape recursively flattens anyOf/oneOf/allOf into a type set and walks a priority-ordered ladder (null > integer > number > boolean > object > array > string) where string is the always-succeeding terminal, which absorbs #63's case as a side effect.
Behavior changes relative to the pre-merge branch:
- Union schemas (anyOf/oneOf/allOf) recursively flatten: Union[str,X] now accepts bare strings without flagging (the #63 win).
- No top-level "null" short-circuit. Null coercion only fires when "null" is in the schema's type set (Optional[X] → anyOf [X, null] or type: ["X", "null"]). String-typed "null" stays a string.
- "yes" / non-boolean text for a bool-typed param returns raw text + INVALID flag (was: False + INVALID).
- Number branch demotes whole floats to int regardless of source shape (1e3 → 1000, 1.0 → 1), matching vLLM's val.is_integer() rule.
- No ast.literal_eval anywhere, since vLLM doesn't use it. Python-literal dicts ('k':1) no longer parse for object params.
- "binary" dropped as a bool alias (not in vLLM's _TYPE_ALIASES).
Renderers-specific deviation kept: used_fallback flag returned alongside the value, propagated to ToolCallParseStatus.INVALID_JSON for the verifier / RL-loss signal. vLLM has no such signal.
Tests updated for new behavior; full suite green (1775+99=1874 passed).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
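To make the flatten-then-ladder shape concrete, here is a rough sketch written from the description above. It is not vLLM's implementation: ``flatten_types``, ``try_rung`` and ``coerce_with_ladder`` are hypothetical names, the per-rung parsing rules are simplified assumptions, and a schema that names no type at all is outside the sketch's scope.

```python
import json
from typing import Any

# Priority order quoted from the commit message; string is the terminal rung.
LADDER = ("null", "integer", "number", "boolean", "object", "array", "string")


def flatten_types(schema: dict) -> set[str]:
    """Recursively collect every type named in a schema, through anyOf/oneOf/allOf."""
    found: set[str] = set()
    declared = schema.get("type")
    if isinstance(declared, str):
        found.add(declared)
    elif isinstance(declared, list):
        found.update(t for t in declared if isinstance(t, str))
    for key in ("anyOf", "oneOf", "allOf"):
        for branch in schema.get(key) or []:
            if isinstance(branch, dict):
                found |= flatten_types(branch)
    return found


def try_rung(text: str, rung: str) -> tuple[bool, Any]:
    """Attempt a single rung and return (matched, value)."""
    if rung == "null":
        return text == "null", None
    if rung == "boolean":
        lowered = text.lower()
        return lowered in ("true", "false"), lowered == "true"
    if rung in ("integer", "number"):
        try:
            val = int(text) if rung == "integer" else float(text)
        except ValueError:
            return False, None
        # Whole floats are demoted to int, per the is_integer() rule above.
        if rung == "number" and val.is_integer():
            val = int(val)
        return True, val
    if rung in ("object", "array"):
        try:
            val = json.loads(text)
        except json.JSONDecodeError:
            return False, None
        return isinstance(val, dict if rung == "object" else list), val
    return True, text  # the string rung always succeeds


def coerce_with_ladder(text: str, schema: dict) -> tuple[Any, bool]:
    """Return (value, used_fallback) by walking the ladder over the flattened types."""
    types = flatten_types(schema)
    for rung in LADDER:
        if rung in types:
            matched, val = try_rung(text, rung)
            if matched:
                return val, False
    # No declared rung accepted the text: return it raw and flag the fallback.
    return text, True
```

Under these assumptions, ``coerce_with_ladder("LIS", {"anyOf": [{"type": "string"}, {"type": "boolean"}]})`` returns ``("LIS", False)`` because the boolean rung declines and the string rung absorbs the text, while ``coerce_with_ladder("yes", {"type": "boolean"})`` returns the raw text with the flag set.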
Summary
Fix ``_coerce_arg_value`` to recognize ``anyOf``/``oneOf`` schemas with a string branch, so XML-style parsers (Qwen3.5, Qwen3.6, GLM, MiniMax, Laguna) don't wrongly flag bare-string args as ``INVALID_JSON``.
The bug
``form_input`` and similar tools have parameters typed as ``Union[str, bool]`` (or ``Optional[str]``), which Pydantic serialises to:
    {"anyOf": [{"type": "string"}, {"type": "boolean"}]}
The top-level ``"type"`` key is absent. So ``_coerce_arg_value``:
1. Falls through the pure-string short-circuit (no top-level ``"string"`` type).
2. Calls ``json.loads("LIS")`` → ``JSONDecodeError``.
3. Returns ``(text, True)``, flagging the fallback as malformed JSON.
``parse_qwen35`` then ORs ``used_fallback`` across all params into ``any_json_fallback`` and stamps the call ``ToolCallParseStatus.INVALID_JSON``. Downstream, ``verifiers/clients/renderer_client.py`` silently dropped all non-OK tool calls (lines 604-609 on main), so any ``form_input(value="LIS", ...)`` was effectively erased before reaching the env. The env saw no ``tool_calls`` and terminated the rollout via the ``no_tools_called`` stop predicate.
Empirically (mini-browse-apps multi-turn eval, Qwen3.6-35B-A3B, n=20):
- ``verifiers`` filter relaxation: 0.55, basically at MITO parity (``/v1/chat/completions`` got 0.60 on the same slice; 4 of MITO's failures were unrelated vLLM mm-cache crashes)
The fix
Detect union (``anyOf``/``oneOf``) branches containing ``"string"`` and treat the ``json.loads``-fails → text-fallback path as expected behavior under those schemas, not malformed. That matches the function's docstring intent ("the string branch wins as fallback"), which the original code didn't actually enforce.
The pure-string short-circuit (``type: "string"`` and ``type: ["string"]``) is unchanged.
Test plan
Extended ``tests/test_tool_arg_type_preservation.py`` with one new parametrized test, ``test_union_with_string_emits_ok_status``, covering three schema shapes:
- ``anyOf: [string, boolean]``
- ``anyOf: [string, null]``
- ``oneOf: [string, integer]``
Each case round-trips a bare-string arg through the existing XML-style parsers (Qwen3.5, GLM-5, MiniMax-M2.5, Laguna-XS.2) plus the two JSON-format controls (Qwen3, Kimi K2.5), and asserts both ``status == OK`` and value preservation. All 18 parametrized cases pass.
The matching ``verifiers`` change (stop silently dropping non-OK ``ParsedToolCall`` in ``renderer_client.py``) is a separate PR; together they restore full multi-turn parity with the ``/v1/chat/completions`` route.
Note
Fix ``_coerce_arg_value`` to accept bare strings for union schemas containing a string branch
When a parameter schema uses ``anyOf`` or ``oneOf`` and includes a ``string`` type, bare string arguments failed JSON parsing and were incorrectly marked with ``used_json_fallback=True``, causing them to be treated as malformed input.
- ``_coerce_arg_value`` now detects a ``string`` branch and returns the raw string with ``used_json_fallback=False`` on JSON parse failure.
- A new parametrized test covers ``anyOf`` and ``oneOf`` unions with string plus boolean/null/integer branches, asserting ``OK`` status and unchanged argument values.
Macroscope summarized 9c30dc0.
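How a single flagged parameter erased a whole call, as described under "The bug", can be sketched as below. This is an illustrative model only: the ``ToolCallParseStatus`` and ``ParsedToolCall`` names come from the PR, but their fields here, along with ``build_call``, ``keep_ok_only`` and the ``submit`` parameter, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ToolCallParseStatus(Enum):
    # Only the two members named in this PR; the real enum may have more.
    OK = "ok"
    INVALID_JSON = "invalid_json"


@dataclass
class ParsedToolCall:
    name: str
    arguments: dict
    status: ToolCallParseStatus


def build_call(name: str, coerced: dict) -> ParsedToolCall:
    """Build a call from a mapping of param name -> (value, used_fallback)."""
    # One flagged parameter is enough to mark the whole call as INVALID_JSON.
    any_json_fallback = any(flag for _, flag in coerced.values())
    status = (
        ToolCallParseStatus.INVALID_JSON
        if any_json_fallback
        else ToolCallParseStatus.OK
    )
    return ParsedToolCall(name, {k: v for k, (v, _) in coerced.items()}, status)


def keep_ok_only(calls: list) -> list:
    """Model of the downstream filter that dropped every non-OK call."""
    return [c for c in calls if c.status is ToolCallParseStatus.OK]


# Before the fix: the bare string "LIS" arrives flagged, so the call vanishes.
before = build_call("form_input", {"value": ("LIS", True), "submit": (True, False)})
print(keep_ok_only([before]))  # []

# After the fix: the same value arrives unflagged and the call survives.
after = build_call("form_input", {"value": ("LIS", False), "submit": (True, False)})
print(len(keep_ok_only([after])))  # 1
```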
Note by @hallerite:
A few factual additions from a reviewer pass:
- Nemotron3 also uses ``parse_qwen35`` (see ``renderers/nemotron3.py:33``), so the affected set is Qwen3.5, Qwen3.6, Nemotron3, GLM-4.5/5, Laguna-XS.2, MiniMax-M2.
- Unaffected: ``parse_qwen3`` (Qwen3, Qwen3-VL), ``parse_deepseek_v3``, ``parse_kimi_k2``/``parse_kimi_k2_section``, ``parse_gpt_oss``. They ``json.loads`` the entire args block, so unions work natively.
- vLLM's ``qwen3xml_tool_parser._get_param_type`` (``vllm/tool_parsers/qwen3xml_tool_parser.py:993-1009``) defaults a missing ``type`` to ``"string"`` and never calls ``json.loads`` for that branch, so it never flags this case, but it also can't recover ``int``/``bool`` from a ``Union[str, int]``. This PR still tries ``json.loads`` first and only suppresses the fallback flag when a string branch exists.
- The scan covers ``anyOf``/``oneOf`` one level deep, so nested unions won't be detected. Pydantic v2 flattens these.
- ``{"type": ["string", "boolean"]}`` isn't handled; the existing short-circuit only catches single-element ``["string"]``. Pydantic doesn't emit this shape.
- ``param_schema.get("anyOf") or param_schema.get("oneOf") or []`` short-circuits: a schema with both keys would ignore ``oneOf``.
Note
Medium Risk
Changes shared argument coercion used by multiple XML tool parsers; behavior is narrower (fewer false INVALID_JSON) but affects how non-OK tool calls are classified downstream.
Overview
Fixes ``_coerce_arg_value`` in ``renderers/parsing.py`` so XML-style tool parsers no longer mark bare-string argument values as malformed JSON when the parameter schema is a union that includes ``string`` (``anyOf``/``oneOf``), not only when the top-level ``type`` is ``"string"``.
After ``json.loads`` fails, the helper now scans union branches for a ``string`` type and sets ``used_json_fallback`` only when a string outcome is not allowed. That stops parsers (Qwen3.5-style XML, GLM, MiniMax, Laguna, etc.) from aggregating a false ``INVALID_JSON`` status on otherwise valid calls, e.g. Pydantic ``str | bool`` tools where values like ``"LIS"`` are emitted verbatim in ``<arg_value>`` tags.
``tests/test_tool_arg_type_preservation.py`` adds ``test_union_with_string_emits_ok_status`` (three union shapes × existing renderer matrix) asserting ``ToolCallParseStatus.OK`` and preserved argument values for a bare string.
Reviewed by Cursor Bugbot for commit 9c30dc0.
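The limits listed in the reviewer note can be checked with a small experiment. The sketch below assumes the scan is built around the expression quoted in that note; ``has_string_branch`` is a hypothetical name for it, not the repository's function.

```python
def has_string_branch(param_schema: dict) -> bool:
    """One-level scan using the expression quoted in the reviewer note."""
    branches = param_schema.get("anyOf") or param_schema.get("oneOf") or []
    return any(isinstance(b, dict) and b.get("type") == "string" for b in branches)


# Flat union, the shape Pydantic emits: detected.
flat = {"anyOf": [{"type": "string"}, {"type": "boolean"}]}

# String branch nested one level deeper: missed by the one-level scan.
nested = {"anyOf": [{"oneOf": [{"type": "string"}]}, {"type": "integer"}]}

# List-valued type: not a union key, so the scan never looks at it.
type_list = {"type": ["string", "boolean"]}

# Both keys present: the non-empty anyOf wins and oneOf is ignored.
both = {"anyOf": [{"type": "integer"}], "oneOf": [{"type": "string"}]}

results = {
    "flat": has_string_branch(flat),
    "nested": has_string_branch(nested),
    "type_list": has_string_branch(type_list),
    "both": has_string_branch(both),
}
print(results)
# {'flat': True, 'nested': False, 'type_list': False, 'both': False}
```

Only the flat shape is recognised, which is consistent with the note's point that the fix targets what Pydantic actually emits and leaves the other shapes flagged.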