Skip to content

docs(dispatching-parallel-agents): rewrite for current Agent tool semantics#1611

Closed
dotcomjack wants to merge 1 commit into
obra:devfrom
dotcomjack:dispatching-parallel-agents-rewrite
Closed

docs(dispatching-parallel-agents): rewrite for current Agent tool semantics#1611
dotcomjack wants to merge 1 commit into
obra:devfrom
dotcomjack:dispatching-parallel-agents-rewrite

Conversation

@dotcomjack
Copy link
Copy Markdown

@dotcomjack dotcomjack commented May 22, 2026

What problem are you trying to solve?

The current skill teaches Claude Code agent dispatch using:

  1. Task(...) as the tool name — obsolete; the actual tool is Agent.
  2. No explicit statement of the one-message-multiple-tool_use-blocks rule (this is also what docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 addresses, additively).
  3. No subagent_type guidance — Claude Code now has specialized subagent types (Explore, general-purpose, Plan, feature-dev:code-architect, feature-dev:code-reviewer, feature-dev:code-explorer, posthog:error-analyzer, etc.), and picking the right one matters more than prompt wording in many cases.
  4. No coverage of modern Agent options: isolation: \"worktree\", run_in_background, model override, name + SendMessage continuation.
  5. No "never delegate understanding" principle, which is in the underlying Agent tool description but not surfaced as a teaching.
  6. Verification frames the summary as authoritative, when an agent's summary describes what it intended to do, not what it did.

Empirical baseline (RED)

I ran a baseline probe before writing the rewrite. Prompt to a fresh subagent (no skill loaded): "Imagine you are the main Claude Code assistant. The user says 'three test files are failing for independent reasons: src/abort.test.ts, src/batch.test.ts, src/race.test.ts. Dispatch parallel agents to fix them.' Sketch the exact tool calls you'd make."

The subagent:

  • Used the wrong tool name (TaskCreate).
  • Treated worktree isolation as a separate tool step ("I'd spin up EnterWorktree per agent first") rather than an Agent parameter.
  • Got foreground/background semantics backwards ("no — TaskCreate is already async/non-blocking from my POV; I await all three results in the next turn").

So even a fresh model with no skill loaded reaches for the obsolete pattern.

What does this PR change?

Rewrites skills/dispatching-parallel-agents/SKILL.md. 83 insertions, 141 deletions; word count drops from 950 to 897 while adding substantive new content (subagent_type table, full Agent argument schema, options section, "never delegate understanding", "read the diff not the summary", Red Flags section). Removes a dated 2025-10-03 narrative example because writing-skills explicitly flags narrative examples as an anti-pattern.

Is this change appropriate for the core library?

Yes. The skill being modified is already in the core library and applies to every Claude Code user dispatching parallel agents. The rewrite is generic to Claude Code's Agent tool, not specific to any project, team, or third-party integration. The subagent_type table mentions plugin-provided types (feature-dev:*, posthog:error-analyzer) as examples of what might be available; the table is framed around job → type, not specific plugins.

What alternatives did you consider?

  1. Targeted patch (the docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 approach). Add only the single-message-N-blocks subsection. Pro: minimal diff. Con: leaves five other concrete teaching gaps in place. Reasonable to land first, but the larger rewrite is still needed.
  2. New skill alongside the existing one (e.g., agent-tool-reference). Pro: zero risk to the current skill. Con: discovery suffers (two skills overlap on the same trigger); description CSO conflict.
  3. Comments on docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 to expand its scope. Considered, but docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 targets main (v5.1.0 release) and is intentionally additive/no-removal — the author was conservative. Expanding scope in-place would change the contract of that PR.

Landing #1470 first and then this rewrite is fine (small rebase, no substance loss). Landing this rewrite supersedes #1470's content (the one-message rule is even more prominent here).

Does this PR contain multiple unrelated changes?

No. Single file, single skill, single concern (modernizing the Agent dispatch guidance).

Existing PRs

Environment tested

Harness Harness version Model Model version/ID
Claude Code current Claude Opus 4.7 (1M context) claude-opus-4-7[1m]

New harness support (required if this PR adds a new harness)

Not applicable — no harness changes.

Evaluation

  • Initial prompt: A user asked their main agent to "analyze our superpowers:dispatching-parallel-agents and determine if it's outdated or lacking quality." The agent identified the gaps above through analysis of the skill text against the live Claude Code Agent tool definition.
  • RED baseline: 1 session. Dispatched a fresh general-purpose subagent (no skill loaded) with the parallel-test-fix scenario. Outcome documented in "Empirical baseline (RED)" above. The transcript explicitly used TaskCreate, invented EnterWorktree as a separate step, and got run_in_background semantics wrong.
  • GREEN test: 1 session. Dispatched a fresh general-purpose subagent with explicit instructions to read the rewritten SKILL.md and apply it to the same scenario. Outcome: subagent used Agent (called out Task/TaskCreate as wrong), batched in one assistant message, picked subagent_type=general-purpose with reasoning, applied isolation: \"worktree\", planned verification as "read each diff (not the summary), grep for overlapping edits, run the FULL suite, spot-check that no assertions were weakened." Caught all RED failures.
  • REFACTOR pass: GREEN-test subagent flagged two real gaps in the first draft: (a) worktree guidance was ambiguous on whether to default-on for parallel test fixes; (b) the full Agent argument schema was only shown by example. Both fixed before commit. Final word count 897 (down from 950).
  • Before/after: RED ran without the skill; even without the skill, model hit three concrete dispatch errors. GREEN ran with the new skill and avoided all three. The old skill would have reinforced one of them (Task(...) is in the old code blocks).

Acknowledged limit: only one GREEN session. The maintainer's bar may be higher; happy to run additional adversarial scenarios if requested.

Rigor

  • If this is a skills change: I used superpowers:writing-skills and completed adversarial pressure testing (RED/GREEN documented above)
  • This change was tested adversarially, not just on the happy path (RED baseline ran before writing the rewrite)
  • I did not modify carefully-tuned content (Red Flags table, rationalizations, "human partner" language) without extensive evals showing the change is an improvement. Note: the original skill did not have a Red Flags section; this rewrite adds one (per writing-skills recommendation). The narrative "Real Example from Session" with date 2025-10-03 was removed because writing-skills explicitly lists narrative examples as an anti-pattern.

Human review

  • A human has reviewed the COMPLETE proposed diff before submission

Human review confirmed; marking ready.

🤖 Generated with Claude Code

…antics

The skill was teaching obsolete tool names (Task) and missing the core
mechanic that makes parallel dispatch work (one message, multiple
tool_use blocks). A baseline probe with a fresh subagent reached for
"TaskCreate", invented a separate "EnterWorktree" step (worktree is an
Agent parameter, not a tool), and misunderstood foreground/background
semantics. This rewrite addresses all three.

Changes:
- Replace Task(...) with Agent throughout; call out Task/TaskCreate as
  an explicit anti-pattern.
- State the single-assistant-message rule as a hard rule with positive
  and negative examples (subsumes the additive change in obra#1470, which
  targets main).
- Add full top-level Agent argument schema (description, prompt,
  subagent_type, isolation, run_in_background, model, name).
- Add subagent_type selection table (Explore, general-purpose, Plan,
  feature-dev:code-architect, code-reviewer, code-explorer,
  posthog:error-analyzer).
- Document isolation="worktree" as default-on for any parallel dispatch
  that touches files.
- Document run_in_background semantics (foreground blocks; background
  notifies).
- Add "never delegate understanding" principle: synthesis stays with the
  dispatcher.
- Rebuild verification around "read the diff, not the summary" (agent
  summaries describe intent, not action) and "spot-check assertions"
  (agents sometimes weaken tests to pass).
- Remove the dated 2025-10-03 narrative example (writing-skills flags
  narrative examples as an anti-pattern).

Net: 83 additions, 141 deletions. Word count drops from 950 to 897 while
adding substantive new content.
@dotcomjack dotcomjack marked this pull request as ready for review May 22, 2026 20:27
@obra obra added enhancement New feature or request subagents Subagent-driven development and dispatch claude-code Claude Code (Anthropic CLI) issues documentation Improvements or additions to documentation labels May 23, 2026
@obra
Copy link
Copy Markdown
Owner

obra commented May 23, 2026

Closing — this is a wholesale rewrite (83 insertions / 141 deletions) of a carefully-tuned skill, and the new content is very Claude-Code-specific (the Agent tool name, subagent_type taxonomy, isolation: "worktree", SendMessage continuation, the table of plugin-provided subagent types). Superpowers tries to keep core skill prose harness-neutral; that's exactly what PR #1486 established as a maintained invariant, and #1603 is adding a lint to keep it from regressing.

Skills are not prose — they are code that shapes agent behavior. … Do not modify carefully-tuned content … without evidence the change is an improvement.

The empirical RED probe in the PR body is appreciated, but a RED probe isn't a GREEN — there's no adversarial pressure-test showing the rewritten skill actually produces better behavior across multiple sessions on multiple harnesses. If you want to pursue this, please:

  1. Keep the diff Claude-Code-specific content inside a Claude-Code-only section, leaving the harness-neutral guidance harness-neutral.
  2. Show before/after eval results across at least two harnesses, not just one fresh-subagent probe.
  3. Submit as a smaller targeted patch (closer to what docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 proposed) rather than a full rewrite.

@obra obra closed this May 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

claude-code Claude Code (Anthropic CLI) issues documentation Improvements or additions to documentation enhancement New feature or request subagents Subagent-driven development and dispatch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants