Commit 3537932

feat(utilities): add dereference_json_schema for inlining $defs/$ref
Pydantic's model_json_schema() emits $defs + $ref for nested models; some MCP clients don't resolve those during tools/list discovery, leaving fields with nested types unusable. dereference_json_schema is an opt-in helper that inlines internal $defs references while leaving external refs and self-referential definitions intact.

The function lives in mcp.server.mcpserver.utilities.json_schema. No existing call sites are changed — this is purely additive. Callers who want fully-flat schemas can post-process the output of model_json_schema() with one extra call.

Covers: empty schemas, single inline, multiple refs to same def, arrays of nested models, anyOf, transitive resolution, sibling-key merging, external/unknown ref preservation, direct self-reference, mutual recursion. 21 tests; existing func_metadata + tool_manager tests still pass.
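To make the motivation concrete, here is a small stdlib-only sketch of what a non-resolving client would see. Both schemas are written by hand: `nested` approximates the shape Pydantic emits for a model with one nested field (the exact output may differ between Pydantic versions), and `flat` is the same schema with the definition inlined manually. `naive_field_type` is a hypothetical stand-in for a client that never follows `$ref`; it is not part of any MCP client or of this commit.

```python
from typing import Any, Optional

# Hand-written approximation of a schema with a nested model (assumption:
# real Pydantic output may carry extra keys such as "title").
nested: dict[str, Any] = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "home": {"$ref": "#/$defs/Address"},
    },
    "required": ["name", "home"],
    "$defs": {
        "Address": {
            "type": "object",
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
            },
            "required": ["street", "city"],
        }
    },
}

# The same schema with the Address definition inlined by hand.
flat: dict[str, Any] = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "home": {
            "type": "object",
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
            },
            "required": ["street", "city"],
        },
    },
    "required": ["name", "home"],
}


def naive_field_type(schema: dict[str, Any], field: str) -> Optional[str]:
    """Hypothetical client logic: read "type" directly, never follow $ref."""
    return schema["properties"][field].get("type")


print(naive_field_type(nested, "home"))  # None: the type hides behind $ref
print(naive_field_type(flat, "home"))    # object
```

With the nested form, the client has no type information for `home`; with the inlined form it does, which is the gap the helper is meant to close.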
1 parent 161834d commit 3537932

2 files changed

Lines changed: 503 additions & 0 deletions

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
+"""JSON Schema post-processing utilities.
+
+Pydantic's :meth:`pydantic.BaseModel.model_json_schema` emits a ``$defs``
+block with named definitions for nested models, and ``$ref`` pointers
+at each use site. The output is fully valid JSON Schema, but some MCP
+clients don't resolve internal references when discovering tools via
+``tools/list``. For those clients, fields whose types are nested models
+become unusable.
+
+This module provides :func:`dereference_json_schema`, an opt-in helper
+that inlines ``$defs`` references into the schema body. It's safe to
+apply at any point in the schema lifecycle (the function does not
+mutate its input) and conservative about edge cases — external
+references and self-referential definitions are preserved unchanged.
+"""
+
+from __future__ import annotations
+
+import copy
+from typing import Any, cast
+
+_INTERNAL_DEFS_PREFIX = "#/$defs/"
+
+
+def dereference_json_schema(schema: dict[str, Any]) -> dict[str, Any]:
+    """Return ``schema`` with internal ``$defs`` references inlined.
+
+    Replaces each ``{"$ref": "#/$defs/<name>"}`` node with the body of
+    the corresponding ``$defs`` entry, recursing into nested objects
+    and arrays. Self-referential definitions (a model whose schema
+    contains a ``$ref`` back to itself, directly or transitively) are
+    preserved as ``$ref`` at the cycle boundary; the ``$defs`` entries
+    they depend on are retained so the output schema remains valid.
+
+    Only internal ``#/$defs/`` references are resolved. External
+    references (URLs, pointers into other documents) are preserved
+    verbatim. Sibling keys alongside a ``$ref`` (allowed in JSON Schema
+    2020-12 and emitted by Pydantic for some types) are merged into
+    the resolved object, with sibling values taking precedence.
+
+    Args:
+        schema: A JSON Schema dict. Not mutated.
+
+    Returns:
+        A new schema dict with internal refs inlined. The top-level
+        ``$defs`` block is removed when fully dereferenced, and
+        retained (containing only the cycle definitions) otherwise.
+
+    Example::
+
+        from pydantic import BaseModel
+        from mcp.server.mcpserver.utilities.json_schema import (
+            dereference_json_schema,
+        )
+
+        class Address(BaseModel):
+            street: str
+            city: str
+
+        class Person(BaseModel):
+            name: str
+            home: Address
+
+        flat = dereference_json_schema(Person.model_json_schema())
+        # flat["properties"]["home"] is now the full Address schema
+        # rather than {"$ref": "#/$defs/Address"}.
+    """
+    schema = copy.deepcopy(schema)
+    defs: dict[str, Any] = schema.pop("$defs", {}) or {}
+    if not defs:
+        return schema
+
+    cycle_roots: set[str] = set()
+
+    def _expand(node: Any, resolving: tuple[str, ...]) -> Any:
+        if isinstance(node, dict):
+            node_dict = cast("dict[str, Any]", node)
+            ref = node_dict.get("$ref")
+            if isinstance(ref, str) and ref.startswith(_INTERNAL_DEFS_PREFIX):
+                name = ref[len(_INTERNAL_DEFS_PREFIX) :]
+                if name not in defs:
+                    # Unknown ref — preserve verbatim so the schema stays
+                    # honest about what wasn't resolved.
+                    return node_dict
+                if name in resolving:
+                    # Cycle: leave the $ref in place at the boundary and
+                    # remember to keep the corresponding definition in
+                    # the output's $defs.
+                    cycle_roots.add(name)
+                    return node_dict
+                resolved = _expand(defs[name], (*resolving, name))
+                siblings: dict[str, Any] = {
+                    k: v for k, v in node_dict.items() if k != "$ref"
+                }
+                if not siblings:
+                    return resolved
+                if isinstance(resolved, dict):
+                    resolved_dict = cast("dict[str, Any]", resolved)
+                    merged: dict[str, Any] = dict(resolved_dict)
+                    for k, v in siblings.items():
+                        merged[k] = _expand(v, resolving)
+                    return merged
+                # $ref pointed at a non-dict (shouldn't happen with
+                # well-formed schemas, but stay defensive).
+                return node_dict
+            expanded_children: dict[str, Any] = {
+                k: _expand(v, resolving) for k, v in node_dict.items()
+            }
+            return expanded_children
+        if isinstance(node, list):
+            node_list = cast("list[Any]", node)
+            return [_expand(item, resolving) for item in node_list]
+        return node
+
+    expanded = _expand(schema, ())
+    assert isinstance(expanded, dict)
+    result: dict[str, Any] = cast("dict[str, Any]", expanded)
+
+    if cycle_roots:
+        # Re-expand each cycle root with itself in the resolving set so
+        # the inner $ref stays at the boundary while other nested refs
+        # in the definition body are inlined.
+        result["$defs"] = {
+            name: _expand(defs[name], (name,)) for name in cycle_roots
+        }
+
+    return result
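
The cycle, sibling-key, and external-ref behaviour described in the docstring can be tried without installing the `mcp` package using the sketch below. `deref` is a condensed, stdlib-only restatement of the function in this diff (type casts dropped, cycle roots iterated in sorted order), written here for illustration only; the `tree` schema and the `https://example.com/...` URL are made-up inputs, not taken from the commit's tests.

```python
from __future__ import annotations

import copy
from typing import Any

PREFIX = "#/$defs/"


def deref(schema: dict[str, Any]) -> dict[str, Any]:
    """Condensed restatement of dereference_json_schema, for illustration."""
    schema = copy.deepcopy(schema)  # never mutate the caller's dict
    defs: dict[str, Any] = schema.pop("$defs", {}) or {}
    if not defs:
        return schema
    cycle_roots: set[str] = set()

    def expand(node: Any, resolving: tuple[str, ...]) -> Any:
        if isinstance(node, list):
            return [expand(item, resolving) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if not (isinstance(ref, str) and ref.startswith(PREFIX)):
            # No internal ref here (external refs land in this branch too).
            return {k: expand(v, resolving) for k, v in node.items()}
        name = ref[len(PREFIX):]
        if name not in defs:
            return node  # unknown internal ref: kept verbatim
        if name in resolving:
            cycle_roots.add(name)  # cycle: keep the $ref, remember the def
            return node
        resolved = expand(defs[name], (*resolving, name))
        siblings = {k: v for k, v in node.items() if k != "$ref"}
        if not siblings:
            return resolved
        if not isinstance(resolved, dict):
            return node
        # Sibling keys win over keys from the resolved definition.
        return {**resolved, **{k: expand(v, resolving) for k, v in siblings.items()}}

    result = expand(schema, ())
    if cycle_roots:
        result["$defs"] = {n: expand(defs[n], (n,)) for n in sorted(cycle_roots)}
    return result


# Made-up input: a self-referential Node, a sibling key next to a $ref,
# and an external reference.
tree = {
    "type": "object",
    "properties": {
        "root": {"$ref": "#/$defs/Node", "description": "Top of the tree"},
        "source": {"$ref": "https://example.com/schemas/source.json"},
    },
    "$defs": {
        "Node": {
            "type": "object",
            "description": "A tree node",
            "properties": {
                "value": {"type": "integer"},
                "children": {"type": "array", "items": {"$ref": "#/$defs/Node"}},
            },
        }
    },
}

flat = deref(tree)
root = flat["properties"]["root"]
print(root["description"])                      # Top of the tree
print(root["properties"]["children"]["items"])  # {'$ref': '#/$defs/Node'}
print(flat["properties"]["source"])             # external ref kept as-is
print(sorted(flat["$defs"]))                    # ['Node']
```

`Node` is inlined once at `root`, the sibling `description` overrides the one from the definition, and the inner `$ref` stays in place with `Node` retained under `$defs` so the reference still resolves.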
