What
Sub-agents (specialists) have no limit on tool calls. During QA, Thoughts Analyzer made ~150 tool calls — reading every single note instead of searching selectively. Need a configurable limit with graceful degradation.
Solution: Soft Landing (three phases)
1. Silent phase (0–50% of limit)
Agent works normally, unaware of any limit.
2. Countdown phase (50–100%)
After each tool response, append a line: "N tool calls remaining". The agent naturally starts prioritizing — it sees the resource is finite.
3. Forced response (at limit)
Use tool_choice: { type: "tool", name: "respond" } in the API request. The agent is forced to synthesize an answer from what it gathered.
Why tool_choice, not removing tools: Changing the tools array invalidates the prompt cache. All accumulated tool responses would be re-sent to Anthropic as a cache write (1.25x cost) instead of a cache read (0.1x cost). Using tool_choice keeps tools in place, cache stays valid, one cheap final request.
Claude Code comparison
Claude Code uses a hard cutoff: agent hits limit → immediately stops → parent gets NO response. All gathered information is lost. Our approach forces the agent to answer with what it has.
Configuration
Add tool_calls_limit to specialist front matter:
---
name: thoughts-analyzer
description: "Searches through brainstorm notes..."
tool_calls_limit: 30
---
Implementation
- Parse
tool_calls_limit from specialist YAML front matter
- Track tool call count per sub-agent session
- At 50%+: inject countdown text into tool responses
- At 100%: set
tool_choice parameter on next API request
Related
References
Full exploration: thoughts/shared/notes/2026-04-03/qa-brainstorm-event-loop-mailbox.md (Brainstorm: Sub-agent Tool Call Limits — Soft Landing)
What
Sub-agents (specialists) have no limit on tool calls. During QA, Thoughts Analyzer made ~150 tool calls — reading every single note instead of searching selectively. Need a configurable limit with graceful degradation.
Solution: Soft Landing (three phases)
1. Silent phase (0–50% of limit)
Agent works normally, unaware of any limit.
2. Countdown phase (50–100%)
After each tool response, append a line:
"N tool calls remaining". The agent naturally starts prioritizing — it sees the resource is finite.3. Forced response (at limit)
Use
tool_choice: { type: "tool", name: "respond" }in the API request. The agent is forced to synthesize an answer from what it gathered.Why tool_choice, not removing tools: Changing the tools array invalidates the prompt cache. All accumulated tool responses would be re-sent to Anthropic as a cache write (1.25x cost) instead of a cache read (0.1x cost). Using
tool_choicekeeps tools in place, cache stays valid, one cheap final request.Claude Code comparison
Claude Code uses a hard cutoff: agent hits limit → immediately stops → parent gets NO response. All gathered information is lost. Our approach forces the agent to answer with what it has.
Configuration
Add
tool_calls_limitto specialist front matter:Implementation
tool_calls_limitfrom specialist YAML front mattertool_choiceparameter on next API requestRelated
References
Full exploration: thoughts/shared/notes/2026-04-03/qa-brainstorm-event-loop-mailbox.md (Brainstorm: Sub-agent Tool Call Limits — Soft Landing)