fix: prevent InlineSchema leak into record content + defensive guard by Muizzkolapo · Pull Request #631 · Muizzkolapo/agent-actions

Muizzkolapo · 2026-05-24T10:27:23Z

Summary

Root cause prevention + multi-layer defense for a production bug where LLMs echo JSON Schema definitions back as response content, causing RecordContextError crashes in downstream actions.

Prevention (root cause)

_extract_ollama_schema now strips title from the format param sent to client.chat(). The title: "InlineSchema" key leaked framework metadata into LLM context and triggered schema-echo behavior. Ollama's format param only needs structural keys (type, properties, required, additionalProperties).
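A minimal sketch of the prevention step, assuming the compiled schema is a plain dict and that only the top-level `title` key needs to go. The helper name `strip_title` and the sample schema are hypothetical; the real logic lives in `_extract_ollama_schema` and may differ.

```python
from typing import Any


def strip_title(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a shallow copy of `schema` without its top-level `title` key.

    Hypothetical stand-in for the stripping done in `_extract_ollama_schema`.
    """
    return {key: value for key, value in schema.items() if key != "title"}


# Illustrative compiled schema carrying the framework label.
compiled = {
    "title": "InlineSchema",
    "type": "object",
    "properties": {"summary": {"type": "string"}},
    "required": ["summary"],
    "additionalProperties": False,
}

# Only structural keys remain; this dict is what would be passed as `format`.
format_param = strip_title(compiled)
```

The copy leaves the compiled schema untouched, so other code that still relies on `title` is not affected.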

Detection (defense in depth)

Online path: _reject_schema_echo_items in _validate_llm_output_schema runs unconditionally and replaces echoes with _parse_error dicts so that reprompt can retry.
Batch processing: the schema-echo check in _process_successful_result prevents corrupted content from reaching records.
Batch reprompt: detect_parse_error now calls is_schema_echo, so batch reprompt retries schema echoes on the same cycle.
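For illustration, the shared detection helpers could look roughly like the sketch below. The heuristic (a dict with `"type": "object"` and a dict-valued `properties`) and the exact shape of the error dict are assumptions; this PR description does not spell out the real bodies of `is_schema_echo()` and `make_schema_echo_error()`.

```python
import json
from typing import Any


def is_schema_echo(item: Any) -> bool:
    """Assumed heuristic: the item looks like a JSON Schema object definition."""
    return (
        isinstance(item, dict)
        and item.get("type") == "object"
        and isinstance(item.get("properties"), dict)
    )


def make_schema_echo_error(item: dict[str, Any]) -> dict[str, Any]:
    """Build a `_parse_error` dict; key names other than `_parse_error` are illustrative."""
    return {
        "_parse_error": "LLM echoed the schema definition instead of conforming data",
        # json.dumps keeps the raw response as valid JSON, unlike str().
        "raw_response": json.dumps(item),
    }
```

A heuristic like this can misfire on a legitimate record that happens to contain `type` and `properties` fields, which is one reason the generic zero-overlap guard is a useful complement.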

Guard (already-corrupted records)

Zero-overlap guard in scope_builder.py: if a dependency namespace has zero field overlap with the declared observe fields, it is wrapped as SKIPPED_NAMESPACE. This catches content corruption generically, not just InlineSchema.

Changes

| File | Change |
| --- | --- |
| `agent_actions/llm/providers/ollama/client.py` | Strip `title` from format param in `_extract_ollama_schema` |
| `agent_actions/utils/schema_echo.py` | Shared `is_schema_echo()` + `make_schema_echo_error()` |
| `agent_actions/processing/helpers.py` | `_reject_schema_echo_items` in `_validate_llm_output_schema` |
| `agent_actions/llm/batch/processing/batch_result_strategy.py` | Schema-echo check in `_process_successful_result` |
| `agent_actions/processing/evaluation/strategies/validation.py` | Schema-echo detection in `detect_parse_error` |
| `agent_actions/prompt/context/scope_builder.py` | Zero-overlap guard in `DependencyNamespaceBuilder.build` |
| 5 test files | 36 new tests |

Test plan

36 new tests (prevention, detection, reprompt integration, zero-overlap guard)
7343 total tests pass, 2 skipped (pre-existing)
ruff check + format clean
Manual: run inline-schema workflow on Ollama to confirm no echo

🤖 Generated with Claude Code

…paths

When an LLM echoes the JSON Schema definition back as its response (e.g. {"title": "InlineSchema", "type": "object", "properties": {...}}) instead of conforming data, the echoed schema was stored as record content, causing RecordContextError crashes in downstream actions.

Add _is_schema_echo() detection and _reject_schema_echo_items() filter in helpers.py (online path) and batch_result_strategy.py (batch path). Schema-echo responses are replaced with _parse_error dicts so reprompt can retry. Detection runs unconditionally, not gated by skip_schema_validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add safety-net detection in DependencyNamespaceBuilder.build() for corrupted namespaces that contain compiled JSON Schema definitions instead of actual action output. Corrupted namespaces are wrapped as SKIPPED_NAMESPACE with a warning, preventing RecordContextError crashes in downstream observe resolution. Guards already-corrupted records in the database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Move is_schema_echo() and make_schema_echo_error() to agent_actions/utils/schema_echo.py to avoid circular imports and enable reuse across helpers.py, batch_result_strategy.py, and scope_builder.py
- scope_builder.py calls shared is_schema_echo() instead of inlining the detection logic (prevents drift)
- Use json.dumps() instead of str() for raw_response serialization
- Avoid list allocation in happy path (_reject_schema_echo_items scans first, copies only when an echo is found)
- Deduplicate _parse_error dict construction via make_schema_echo_error()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
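The scan-first, copy-on-echo idea can be sketched as follows. This is an assumption-laden illustration, not the actual `_reject_schema_echo_items`: the function name `reject_items`, its predicate and error-factory parameters, and the sample helpers are all hypothetical.

```python
from typing import Any, Callable


def reject_items(
    items: list[Any],
    is_bad: Callable[[Any], bool],
    make_error: Callable[[Any], Any],
) -> list[Any]:
    """Replace bad items with error dicts, allocating a new list only if needed."""
    # Scan first: find the index of the first offending item, if any.
    for index, item in enumerate(items):
        if is_bad(item):
            break
    else:
        # Happy path: nothing to reject, return the original list unchanged.
        return items

    # Copy only from here on: keep the clean prefix, rewrite the rest.
    cleaned = list(items[:index])
    for item in items[index:]:
        cleaned.append(make_error(item) if is_bad(item) else item)
    return cleaned


def looks_like_schema(item: Any) -> bool:
    # Simplified stand-in for the shared schema-echo check.
    return isinstance(item, dict) and isinstance(item.get("properties"), dict)


def to_error(item: Any) -> dict[str, str]:
    # Simplified stand-in for the shared error constructor.
    return {"_parse_error": "schema echo"}
```

Returning the same list object on the happy path means callers pay no allocation cost for the common case where no echo is present.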

The defensive guard in scope_builder.py now checks whether the namespace has any overlap with declared observe fields instead of string-matching title == "InlineSchema". This catches all forms of content corruption (schema-echo from any schema name, garbage data, wrong action output) generically.

Guard fires when allowed_fields is a non-empty list and set(dep_data.keys()) & set(allowed_fields) is empty. Wildcard observe (action.*) sets allowed_fields=None and bypasses the guard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
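The firing condition described above can be expressed as a small predicate. This is a sketch only: the function name `has_zero_overlap` is hypothetical, and how the builder then wraps the namespace as SKIPPED_NAMESPACE is not shown because the PR text does not describe it.

```python
from typing import Any, Optional


def has_zero_overlap(
    dep_data: dict[str, Any],
    allowed_fields: Optional[list[str]],
) -> bool:
    """Return True when the guard should fire for this dependency namespace."""
    # None means wildcard observe (action.*); an empty list declares nothing.
    # In both cases the guard is bypassed.
    if not allowed_fields:
        return False
    # Fire only when no declared observe field appears in the namespace.
    return not (set(dep_data.keys()) & set(allowed_fields))
```

For example, a namespace holding an echoed schema (`title`, `type`, `properties`) has no overlap with declared fields such as `summary`, so the guard fires, while a namespace with at least one declared field passes through.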

detect_parse_error() now checks for schema-echo content (via is_schema_echo) so the batch reprompt loop retries on the same cycle instead of only catching the echo during post-processing. This closes the gap where batch schema echoes were converted to error records but never triggered a retry.

Also simplified batch test assertions for clarity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The compiled Ollama schema includes "title": "InlineSchema" (from vendor_compilation.py) which leaks framework metadata into the LLM context. Ollama's format parameter only uses structural keys (type, properties, required, additionalProperties); title is not a structural constraint and can trigger schema-echo behavior where the model returns the schema definition itself instead of conforming data.

_extract_ollama_schema now strips title before passing the schema to client.chat(format=...). This prevents the root cause rather than just detecting the symptom.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

MessageBuilder._strip_schema_metadata() removes name, title, and description from schemas before injecting them into prompt text. These keys are framework labels (e.g. "InlineSchema") that leak implementation details into LLM context and can trigger schema-echo.

Applied at all three prompt-injection points:
- SchemaInjection.PROMPT (Ollama Cloud online + batch)
- SchemaInjection.INLINE_FULL (unused but protected)
- SchemaInjection.INLINE_FULL_LIST (Gemini)

API-parameter paths (OpenAI, Anthropic, Groq) are unaffected: those vendors require name/title for structured output enforcement and their API-level constraints prevent echoing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
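A rough sketch of the metadata stripping, assuming it filters top-level keys only and stores the key set once at class level. The class name `MessageBuilderSketch` and the public method name are hypothetical; whether the real `_strip_schema_metadata()` also descends into nested schemas is not stated here.

```python
from typing import Any


class MessageBuilderSketch:
    """Hypothetical stand-in for MessageBuilder, showing only the stripping step."""

    # Class-level frozenset: built once, not on every call.
    _META_KEYS = frozenset({"name", "title", "description"})

    @classmethod
    def strip_schema_metadata(cls, schema: dict[str, Any]) -> dict[str, Any]:
        """Return a shallow copy of `schema` without framework label keys."""
        return {
            key: value
            for key, value in schema.items()
            if key not in cls._META_KEYS
        }


# Illustrative schema as it might appear before prompt injection.
schema = {
    "name": "InlineSchema",
    "title": "InlineSchema",
    "description": "Compiled inline schema",
    "type": "object",
    "properties": {"title": {"type": "string"}},
}
prompt_schema = MessageBuilderSketch.strip_schema_metadata(schema)
```

Filtering only the top level keeps a user-defined property that happens to be called `title` intact, as the `properties` entry in the example shows.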


Muizzkolapo and others added 8 commits May 24, 2026 11:26

refactor: promote _META_KEYS to class-level frozenset

df8ef2a

Avoids recreating the set on every _strip_schema_metadata call. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Muizzkolapo merged commit 1938035 into main May 24, 2026
5 checks passed

github-actions Bot locked and limited conversation to collaborators May 24, 2026

Muizzkolapo merged 8 commits into main from fix/inline-schema-stored-as-content
