docs(dispatching-parallel-agents): rewrite for current Agent tool semantics by dotcomjack · Pull Request #1611 · obra/superpowers

dotcomjack · 2026-05-22T20:23:31Z

What problem are you trying to solve?

The current skill teaches Claude Code agent dispatch using:

Task(...) as the tool name — obsolete; the actual tool is Agent.
No explicit statement of the one-message-multiple-tool_use-blocks rule (this is also what docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 addresses, additively).
No subagent_type guidance — Claude Code now has specialized subagent types (Explore, general-purpose, Plan, feature-dev:code-architect, feature-dev:code-reviewer, feature-dev:code-explorer, posthog:error-analyzer, etc.), and picking the right one matters more than prompt wording in many cases.
No coverage of modern Agent options: isolation: \"worktree\", run_in_background, model override, name + SendMessage continuation.
No "never delegate understanding" principle, which is in the underlying Agent tool description but not surfaced as a teaching.
Verification frames the summary as authoritative, when an agent's summary describes what it intended to do, not what it did.

Empirical baseline (RED)

I ran a baseline probe before writing the rewrite. Prompt to a fresh subagent (no skill loaded): "Imagine you are the main Claude Code assistant. The user says 'three test files are failing for independent reasons: src/abort.test.ts, src/batch.test.ts, src/race.test.ts. Dispatch parallel agents to fix them.' Sketch the exact tool calls you'd make."

The subagent:

Used the wrong tool name (TaskCreate).
Treated worktree isolation as a separate tool step ("I'd spin up EnterWorktree per agent first") rather than an Agent parameter.
Got foreground/background semantics backwards ("no — TaskCreate is already async/non-blocking from my POV; I await all three results in the next turn").

So even a fresh model with no skill loaded reaches for the obsolete pattern.

What does this PR change?

Rewrites skills/dispatching-parallel-agents/SKILL.md. 83 insertions, 141 deletions; word count drops from 950 to 897 while adding substantive new content (subagent_type table, full Agent argument schema, options section, "never delegate understanding", "read the diff not the summary", Red Flags section). Removes a dated 2025-10-03 narrative example because writing-skills explicitly flags narrative examples as an anti-pattern.

Is this change appropriate for the core library?

Yes. The skill being modified is already in the core library and applies to every Claude Code user dispatching parallel agents. The rewrite is generic to Claude Code's Agent tool, not specific to any project, team, or third-party integration. The subagent_type table mentions plugin-provided types (feature-dev:*, posthog:error-analyzer) as examples of what might be available; the table is framed around job → type, not specific plugins.

What alternatives did you consider?

Targeted patch (the docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 approach). Add only the single-message-N-blocks subsection. Pro: minimal diff. Con: leaves five other concrete teaching gaps in place. Reasonable to land first, but the larger rewrite is still needed.
New skill alongside the existing one (e.g., agent-tool-reference). Pro: zero risk to the current skill. Con: discovery suffers (two skills overlap on the same trigger); description CSO conflict.
Comments on docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 to expand its scope. Considered, but docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 targets main (v5.1.0 release) and is intentionally additive/no-removal — the author was conservative. Expanding scope in-place would change the contract of that PR.

Landing #1470 first and then this rewrite is fine (small rebase, no substance loss). Landing this rewrite supersedes #1470's content (the one-message rule is even more prominent here).

Does this PR contain multiple unrelated changes?

No. Single file, single skill, single concern (modernizing the Agent dispatch guidance).

Existing PRs

I have reviewed all open AND closed PRs for duplicates or prior art
Related PRs:
- docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 (open) — Adds the single-message-N-blocks subsection additively, targeting main. Substantial overlap on that one point; this PR subsumes it.
- fix(dispatching-parallel-agents): align "Use when" threshold with description #948 (open) — Aligns the "2+ vs 3+" threshold. This PR already uses "2+" throughout, consistent with fix(dispatching-parallel-agents): align "Use when" threshold with description #948's direction.
- fix: remove redundant instructions in dispatching-parallel-agents #1102 (closed) — Targeted redundancy removal; closed without comment. This PR doesn't repeat that exact dedup; it does a fuller rewrite with empirical grounding and eval evidence the closed PR lacked.

Environment tested

Harness	Harness version	Model	Model version/ID
Claude Code	current	Claude Opus 4.7 (1M context)	`claude-opus-4-7[1m]`

New harness support (required if this PR adds a new harness)

Not applicable — no harness changes.

Evaluation

Initial prompt: A user asked their main agent to "analyze our superpowers:dispatching-parallel-agents and determine if it's outdated or lacking quality." The agent identified the gaps above through analysis of the skill text against the live Claude Code Agent tool definition.
RED baseline: 1 session. Dispatched a fresh general-purpose subagent (no skill loaded) with the parallel-test-fix scenario. Outcome documented in "Empirical baseline (RED)" above. The transcript explicitly used TaskCreate, invented EnterWorktree as a separate step, and got run_in_background semantics wrong.
GREEN test: 1 session. Dispatched a fresh general-purpose subagent with explicit instructions to read the rewritten SKILL.md and apply it to the same scenario. Outcome: subagent used Agent (called out Task/TaskCreate as wrong), batched in one assistant message, picked subagent_type=general-purpose with reasoning, applied isolation: \"worktree\", planned verification as "read each diff (not the summary), grep for overlapping edits, run the FULL suite, spot-check that no assertions were weakened." Caught all RED failures.
REFACTOR pass: GREEN-test subagent flagged two real gaps in the first draft: (a) worktree guidance was ambiguous on whether to default-on for parallel test fixes; (b) the full Agent argument schema was only shown by example. Both fixed before commit. Final word count 897 (down from 950).
Before/after: RED ran without the skill; even without the skill, model hit three concrete dispatch errors. GREEN ran with the new skill and avoided all three. The old skill would have reinforced one of them (Task(...) is in the old code blocks).

Acknowledged limit: only one GREEN session. The maintainer's bar may be higher; happy to run additional adversarial scenarios if requested.

Rigor

If this is a skills change: I used superpowers:writing-skills and completed adversarial pressure testing (RED/GREEN documented above)
This change was tested adversarially, not just on the happy path (RED baseline ran before writing the rewrite)
I did not modify carefully-tuned content (Red Flags table, rationalizations, "human partner" language) without extensive evals showing the change is an improvement. Note: the original skill did not have a Red Flags section; this rewrite adds one (per writing-skills recommendation). The narrative "Real Example from Session" with date 2025-10-03 was removed because writing-skills explicitly lists narrative examples as an anti-pattern.

Human review

A human has reviewed the COMPLETE proposed diff before submission

Human review confirmed; marking ready.

🤖 Generated with Claude Code

…antics The skill was teaching obsolete tool names (Task) and missing the core mechanic that makes parallel dispatch work (one message, multiple tool_use blocks). A baseline probe with a fresh subagent reached for "TaskCreate", invented a separate "EnterWorktree" step (worktree is an Agent parameter, not a tool), and misunderstood foreground/background semantics. This rewrite addresses all three. Changes: - Replace Task(...) with Agent throughout; call out Task/TaskCreate as an explicit anti-pattern. - State the single-assistant-message rule as a hard rule with positive and negative examples (subsumes the additive change in obra#1470, which targets main). - Add full top-level Agent argument schema (description, prompt, subagent_type, isolation, run_in_background, model, name). - Add subagent_type selection table (Explore, general-purpose, Plan, feature-dev:code-architect, code-reviewer, code-explorer, posthog:error-analyzer). - Document isolation="worktree" as default-on for any parallel dispatch that touches files. - Document run_in_background semantics (foreground blocks; background notifies). - Add "never delegate understanding" principle: synthesis stays with the dispatcher. - Rebuild verification around "read the diff, not the summary" (agent summaries describe intent, not action) and "spot-check assertions" (agents sometimes weaken tests to pass). - Remove the dated 2025-10-03 narrative example (writing-skills flags narrative examples as an anti-pattern). Net: 83 additions, 141 deletions. Word count drops from 950 to 897 while adding substantive new content.

obra · 2026-05-23T21:02:13Z

Closing — this is a wholesale rewrite (83 insertions / 141 deletions) of a carefully-tuned skill, and the new content is very Claude-Code-specific (the Agent tool name, subagent_type taxonomy, isolation: "worktree", SendMessage continuation, the table of plugin-provided subagent types). Superpowers tries to keep core skill prose harness-neutral; that's exactly what PR #1486 established as a maintained invariant, and #1603 is adding a lint to keep it from regressing.

Skills are not prose — they are code that shapes agent behavior. … Do not modify carefully-tuned content … without evidence the change is an improvement.

The empirical RED probe in the PR body is appreciated, but a RED probe isn't a GREEN — there's no adversarial pressure-test showing the rewritten skill actually produces better behavior across multiple sessions on multiple harnesses. If you want to pursue this, please:

Keep the diff Claude-Code-specific content inside a Claude-Code-only section, leaving the harness-neutral guidance harness-neutral.
Show before/after eval results across at least two harnesses, not just one fresh-subagent probe.
Submit as a smaller targeted patch (closer to what docs(dispatching-parallel-agents): document single-message-N-blocks mechanic #1470 proposed) rather than a full rewrite.

dotcomjack marked this pull request as ready for review May 22, 2026 20:27

obra added enhancement New feature or request subagents Subagent-driven development and dispatch claude-code Claude Code (Anthropic CLI) issues documentation Improvements or additions to documentation labels May 23, 2026

obra closed this May 23, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

docs(dispatching-parallel-agents): rewrite for current Agent tool semantics#1611

docs(dispatching-parallel-agents): rewrite for current Agent tool semantics#1611
dotcomjack wants to merge 1 commit into
obra:devfrom
dotcomjack:dispatching-parallel-agents-rewrite

dotcomjack commented May 22, 2026 •

edited

Loading

Uh oh!

obra commented May 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

dotcomjack commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What problem are you trying to solve?

Empirical baseline (RED)

What does this PR change?

Is this change appropriate for the core library?

What alternatives did you consider?

Does this PR contain multiple unrelated changes?

Existing PRs

Environment tested

New harness support (required if this PR adds a new harness)

Evaluation

Rigor

Human review

Uh oh!

obra commented May 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

dotcomjack commented May 22, 2026 •

edited

Loading