unic · orioltf · May 22, 2026 · May 22, 2026
diff --git a/apps/claude-code/pr-review/CONTEXT.md b/apps/claude-code/pr-review/CONTEXT.md
@@ -91,10 +91,14 @@ _Avoid_: initial review, fresh review
 A Review run against an ADO PR where a prior **Bot Signature** is found in the PR's threads. Focuses on commits since the last Review, performs Thread Classification, and replies to or resolves existing Review Threads rather than duplicating them.
 _Avoid_: incremental review, follow-up review, second pass
 
+**Dry-run mode**:
+A Review run against an ADO PR with the `--dry-run` flag. Identical to first-review / re-review for every read-side step — **ADO Fetcher**, **Doc Context Orchestrator**, all Review Aspect agents, and (when prior signature is found) the **Re-review Coordinator's** Thread Classification — but the **ADO Writer** is never invoked. Findings are rendered in the Claude interface using the same severity-grouped format as Pre-PR mode. Used to preview what a Review or Re-review would post before committing to write-back.
+_Avoid_: preview review, simulated review, no-post review
+
 ### Orchestration agents
 
 **ADO Fetcher**:
-A plugin agent that retrieves PR metadata, iterations, changed files, and the raw diff from Azure DevOps. Used by first-review and re-review modes; not invoked in pre-PR mode.
+A plugin agent that retrieves PR metadata, iterations, PR threads, changed files, and the raw diff from Azure DevOps — and determines the Review mode from the thread data. All ADO read operations for a PR review are owned here; the orchestrator makes no inline ADO read calls.
 _Avoid_: fetcher, data agent, ADO client
 
 **Re-review Coordinator**:
@@ -158,9 +162,9 @@ A single end-of-run line printed by the orchestrator to the Claude interface, re
 - A **Doc Context** is assembled via a three-tier pipeline: the **Doc Context Orchestrator** spawns **Work Item Summarizer** and **Confluence Fetcher** agents (Doc Context Sub-agents) in parallel, then delegates their outputs to the **Doc Context Synthesizer**, which produces the final `DOC_CONTEXT` narrative injected into every Review Aspect agent
 - A **Doc Context Sub-agent** operates on a single source (work item or Confluence page) and receives the changed files list and the local diff when available
 - The **Doc Context Orchestrator** returns the **Doc Context Synthesizer**'s output verbatim; it does not rewrite or reformat the narrative
-- The **ADO Fetcher** is invoked by first-review and re-review modes; **Pre-PR mode** skips it entirely and goes directly to Review Aspect agents
-- The **Re-review Coordinator** is invoked only when the mode is re-review; first-review and pre-PR modes never load it
-- The **ADO Writer** is invoked by first-review and re-review modes; **Pre-PR mode** does not write back to ADO
+- The **ADO Fetcher** is invoked by first-review, re-review, and dry-run modes; **Pre-PR mode** skips it entirely and goes directly to Review Aspect agents
+- The **Re-review Coordinator** is invoked when the mode is re-review or a dry-run that detected a prior **Bot Signature**; first-review, pre-PR, and dry-run-on-fresh-PR modes never load it. In dry-run mode the Coordinator performs Thread Classification but its results feed only the rendered output — no Replies are posted
+- The **ADO Writer** is invoked by first-review and re-review modes; **Pre-PR mode** and **Dry-run mode** do not write back to ADO
 - Every operation in an orchestration agent terminates in one of the four **Notice Tiers**. **DEGRADED** and **EMPTY-BY-DESIGN**-with-message operations emit a **Notice** that flows from the agent's structured result block, through the orchestrator's merge step, into the **Review Summary** (for ADO modes) or the printed pre-findings block (for **Pre-PR mode**). The end-of-run **Trailer** carries Notice counts so the invoker sees them without opening the PR.
 
 ## Example dialogue

diff --git a/apps/claude-code/pr-review/docs/adr/0015-canonical-http-tier-mapping.md b/apps/claude-code/pr-review/docs/adr/0015-canonical-http-tier-mapping.md
@@ -1,6 +1,6 @@
 # ADR 0015 — Canonical HTTP-Tier Mapping
 
-**Status:** Accepted
+**Status:** Accepted, amended by 0018
 **Date:** 2026-05-13
 **Deciders:** Oriol Torrent Florensa
 **Context:** ADR 0014 (failure-classification helper layer)
@@ -55,6 +55,9 @@ Retries are not implemented. Reasons:
 Re-evaluate if 5xx Notices prove painful in practice; retries can be added behind the same Notice
 surface without changing the doctrine.
 
+**Update (2026-05-14):** this no-retry policy is extended to agent-spawn failures during orchestrator
+fan-out by ADR 0018. The reasoning is identical (AFK latency, retry-storm risk, accurate Notice).
+
 ### Implementation
 
 The canonical mapping is implemented in `scripts/ado/classify-http-error.mjs`:

diff --git a/apps/claude-code/pr-review/docs/adr/0016-fold-thread-fetch-into-ado-fetcher.md b/apps/claude-code/pr-review/docs/adr/0016-fold-thread-fetch-into-ado-fetcher.md
@@ -0,0 +1,55 @@
+# 0016. Fold thread fetch into ADO Fetcher
+
+**Status:** Accepted (2026-05)
+**Context:** Amends ADR 0013 (orchestrator split for review-pr)
+**Date:** 2026-05-14
+
+## Context
+
+ADR 0013 split `review-pr.md` into a thin orchestrator and focused agents. As part of that split, ADR 0013 carved out **one allowed inline ADO call**: the mode-detection `az repos pr thread list` in the orchestrator's Step 4. The justification was that mode detection had to happen _before_ the ADO Fetcher could be launched, so the orchestrator needed to fetch and inspect threads itself.
+
+Two facts surfaced after shipping the split:
+
+1. **The carved-out command is invalid syntax.** `az repos pr thread list` does not exist in the `azure-devops` extension — valid `az repos pr` subcommands are `create / list / show / update / work-item / set-vote / reviewer / policy / checkout`. There is no `thread` group. The correct call is `az devops invoke --area git --resource pullRequestThreads --route-parameters …`. Every ADO PR review since the orchestrator-split shipped has been failing at Step 4. The first observed-in-the-wild failure is the 2026-05-14 dry-run captured in `docs/conversations/pr-review-dry-run-01.txt`.
+
+2. **The "fetch threads → derive mode" pair is naturally adjacent to the rest of the Fetcher's work.** The Fetcher already calls `az devops invoke` for `pullRequestIterations`, `pullRequestIterationChanges`, `pullRequestWorkItems`. Adding `pullRequestThreads` to that family keeps ADO read knowledge in one place — the agent that exists exactly to own it.
+
+The carve-out can go away if the Fetcher returns `MODE`, `IS_REREVIEW`, `PRIOR_ITERATION_ID`, `SUMMARY_THREAD_ID`, and `RAW_THREADS_JSON` in its structured result block. The orchestrator branches on those fields in Step 5 onwards — the same pattern it already uses for `WORK_ITEM_IDS`, `DIFF_RANGE`, etc.
+
+## Decision
+
+**Move thread fetching and mode detection from the orchestrator into the ADO Fetcher.** The Fetcher gains a new step (Step 2.5) between iterations fetch and changed-files fetch:
+
+1. Call `az devops invoke --area git --resource pullRequestThreads --route-parameters "project=$PROJECT" "repositoryId=$REPO_ID" "pullRequestId=$PR_ID"`.
+2. Apply ADR 0015's HTTP-tier mapping: 401/403 → ABORTED; 404 → OK (treat as empty threads; equivalent to first-review); 5xx / network → DEGRADED with `kind: thread-fetch`, treat as empty threads so mode detection still produces `first-review`.
+3. Run `detectMode` (from `scripts/mode-detection.mjs`, unchanged) on the response's `.value` array. The function returns `{ mode, isRereview, priorIterationId, summaryThreadId }`.
+4. Append `RAW_THREADS_JSON`, `MODE`, `IS_REREVIEW`, `PRIOR_ITERATION_ID`, `SUMMARY_THREAD_ID` to the Fetcher's result block.
+
+The orchestrator's Step 4 simplifies to a single `az repos pr show` call that captures PR metadata (`REPO_ID`, `PROJECT`, `SOURCE_BRANCH`, `TARGET_BRANCH`, `PR_TITLE`, `PR_DESCRIPTION`) and passes them into the Fetcher as inputs. The Fetcher no longer makes its own `az repos pr show` call — the data is already in its prompt.
+
+After this change, the orchestrator contains **no `az devops invoke` calls**. The only remaining inline `az` is `az --version`, `az extension list` (Step 3 preflight), and `az repos pr show` (Step 4 metadata).
+
+ADR 0013's carve-out (_"The one allowed inline ADO call is the mode-detection `az repos pr thread list` in the mode detection block"_) is **removed**. ADR 0013's core decision — thin orchestrator + focused agents — stands; only that one-sentence exception is superseded by this ADR.
+
+## Alternatives considered
+
+**Fix the inline command in Step 4 and leave it in the orchestrator.** Smallest change. Rejected because it preserves the architectural reason for the carve-out (mode detection must precede Fetcher launch) which doesn't actually exist — the orchestrator can equally well branch on Fetcher output. Keeping ADO knowledge in two places means future API changes require edits in two files.
+
+**Introduce a separate `mode-detector` agent that fetches threads.** Cleaner separation of concerns. Rejected because mode detection and the rest of the Fetcher's work share the same ADO call shape (`az devops invoke --area git --resource …`); a new agent would duplicate the auth/retry/error-classification setup the Fetcher already has.
+
+**Replace `az` with direct REST calls.** Long-discussed in ADR 0008's neighbourhood. Out of scope for this ADR; revisit if `az` proves a recurring failure source after spec 14's smoke test catches near-term drift.
+
+## Consequences
+
+- The plugin's mode-detection path actually works (Step 4 was previously broken on every run).
+- ADO read knowledge is centralised in `ado-fetcher.md`.
+- Adding new mode-affecting ADO data (e.g. a `mergeAttempt` field) requires editing one agent prompt.
+- The Fetcher's result block grows by five lines — downstream prompts (Coordinator, Writer) parse them via existing string-extraction patterns, no helper change.
+- ADR 0013's status line is updated to `Accepted (2026-05), amended by 0016`.
+
+## See also
+
+- ADR 0013 — Orchestrator split for review-pr (amended)
+- ADR 0014 — Notice Tier doctrine (governs the new `kind: thread-fetch` DEGRADED path)
+- ADR 0015 — Canonical HTTP-tier mapping (governs the 401/403/5xx behaviour for thread fetch)
+- Spec 12 — Implementation of this ADR
diff --git a/apps/claude-code/pr-review/docs/adr/0017-dry-run-as-fourth-peer-mode.md b/apps/claude-code/pr-review/docs/adr/0017-dry-run-as-fourth-peer-mode.md
@@ -0,0 +1,46 @@
+# 0017. Dry-run as a fourth peer operating mode
+
+**Status:** Accepted (2026-05)
+**Date:** 2026-05-14
+
+## Context
+
+The 2026-05-14 dry-run captured in `docs/conversations/pr-review-dry-run-01.txt` was invoked with the natural-language instruction _"Make a dry-run PR review. DO NOT POST ANY COMMENT TO THE PR! Only report inline"_. The plugin had no formal dry-run mechanism. The LLM-as-orchestrator improvised: inlined `az` data fetches (compounding the Step 4 bug), skipped the `pr-review:ado-fetcher` agent entirely, and never invoked the ADO Writer. The user got useful findings but only by accident — the documented spec was bypassed.
+
+Two design problems:
+
+1. **No deterministic dry-run path.** Asking the LLM to "not post comments" works most of the time but invites free-form deviation from the spec. A documented mode replaces persuasion with branching.
+2. **Dry-run is genuinely useful and not just a debug helper.** Previewing what a re-review would post — before letting it edit threads in a customer-facing PR — is a legitimate user need.
+
+Two shapes for the formalisation: a fourth peer mode, or an orthogonal boolean flag.
+
+## Decision
+
+**Dry-run is a fourth peer mode, alongside pre-PR, first-review, and re-review.** CLI surface: `/pr-review:review-pr <url> --dry-run`. The orchestrator's mode resolution applies the flag after URL parsing; `IS_REREVIEW` (whether a prior bot signature exists) is captured separately and drives Coordinator inclusion within dry-run mode.
+
+Dry-run runs every read-side step identically to first-review / re-review (preflight, metadata fetch, ADO Fetcher, Doc Context Orchestrator, Review Aspects fan-out, and Re-review Coordinator's Thread Classification when `IS_REREVIEW=true`). The **ADO Writer is never invoked**. Findings render to the Claude interface using pre-PR Step E's severity-grouped format. Notices from Fetcher and Coordinator render via `formatNoticesAsPrePrPreamble`. The Trailer reads `🔍 Dry-run complete: <N> findings (<criticals> critical, <importants> important) · <warnings> warning notices · would have posted to <PR URL>`.
+
+Coordinator runs in dry-run mode when `IS_REREVIEW=true`, but its reply-posting branch is short-circuited. Its `freshFindings` output still feeds `FINDINGS_JSON` so the user previews exactly the incremental finding set the real re-review would have posted.
+
+## Alternatives considered
+
+**Orthogonal `DRY_RUN` boolean flag.** Any mode (first-review / re-review) could be dry-run. Conceptually cleaner — dry-run is "do the work, don't post" — but produces awkward mode names internally (`first-review-dry-run`, `re-review-dry-run`) and forces every branch in the orchestrator to read two state variables instead of one. Rejected: peer-mode keeps the orchestrator's mode-switch readable as a single discriminator.
+
+**Document a "DO NOT POST" convention and trust the LLM.** Zero implementation cost. Rejected because it is exactly what failed on 2026-05-14: the LLM honoured the instruction by skipping not just Writer but also the entire structured fetch path, masking the Step 4 bug and producing inconsistent behaviour across runs.
+
+**Make dry-run a flag that disables only the Writer agent invocation.** Halfway between the other two. Rejected because the orchestrator's other branches (Step 8 trailer text, Notice rendering) still need to know whether to use the pre-PR rendering path or the ADO Summary path. A single mode value carries that decision uniformly.
+
+## Consequences
+
+- The orchestrator branches on `MODE ∈ {pre-pr, dry-run, first-review, re-review}` — four cases, no nested flags.
+- Re-review-eligible PRs can be previewed safely: classification runs and reports its plan, but no replies are posted and no thread statuses are PATCHed.
+- The Pre-PR rendering helpers (`formatNoticesAsPrePrPreamble`, severity-grouped finding print) are reused by dry-run — confirming that "render to interface" is a coherent capability, not a pre-PR-only quirk.
+- The Trailer line gets a fourth shape (`🔍 Dry-run complete: ...`). `formatTrailer` in `scripts/ado/notices.mjs` grows one branch + test cases.
+- `CONTEXT.md` gains a `Dry-run mode` term and updates three relationship lines documenting which agents run in which modes.
+
+## See also
+
+- Spec 13 — Implementation of this ADR
+- Spec 12 — Step 4 fix (the upstream bug that exposed the no-formal-dry-run gap)
+- ADR 0014 — Notice Tier doctrine (`formatTrailer`, `formatNoticesAsPrePrPreamble` are governed here)
+- `CONTEXT.md` — `Dry-run mode` definition and the agent-invocation matrix
diff --git a/apps/claude-code/pr-review/docs/adr/0018-fan-out-resilience-trust-harness.md b/apps/claude-code/pr-review/docs/adr/0018-fan-out-resilience-trust-harness.md
@@ -0,0 +1,60 @@
+# 0018. Fan-out resilience — trust the harness, surface partial failure as a Notice
+
+**Status:** Accepted (2026-05)
+**Context:** Amends ADR 0015 (canonical HTTP-tier mapping — no-retry doctrine)
+**Date:** 2026-05-14
+
+## Context
+
+The 2026-05-14 dry-run captured in `docs/conversations/pr-review-dry-run-01.txt` produced a transient `API Error: The socket connection was closed unexpectedly` from the Claude Code HTTP client during Step 6's parallel agent fan-out. Step 6 is the plugin's largest concurrent spawn: Doc Context Orchestrator plus 2–5 review-aspect agents in a single message.
+
+The error printed in the output but all spawned agents completed normally. The Claude Code harness either auto-retried the launch transparently or the server-side launch had already been accepted before the client socket dropped. The user-visible outcome was a successful run that _looked_ broken.
+
+Two unaddressed risks:
+
+1. **Cosmetic confusion.** A scary API error appeared with no framing. AFK users skimming the output can't tell whether a partial failure occurred.
+2. **Real partial-failure invisibility.** If a future fan-out _does_ lose one agent (e.g. the harness fails to recover), the orchestrator currently has no post-condition that detects the missing result block. The Review posts with N-1 aspects' findings and the user has no signal that coverage was reduced.
+
+ADR 0015 already forbids retries for HTTP failures in v1 with three reasons: (1) AFK latency, (2) retry-storm failure mode, (3) DEGRADED Notices carrying accurate information. All three reasons apply identically to agent-spawn failures during fan-out. This ADR extends ADR 0015's doctrine to the orchestrator's fan-out layer.
+
+## Decision
+
+**Trust the harness. Don't retry. Surface partial fan-out as a DEGRADED Notice.**
+
+After Step 6's fan-out completes, the orchestrator runs a post-condition check: every launched Review Aspect agent must have returned a parseable JSON findings array. Agents that return nothing (no result block, no JSON, empty stdout) are recorded as missing.
+
+If any aspects are missing, the orchestrator emits a single Notice:
+
+```js
+{ severity: 'warning', kind: 'agent-spawn',
+  message: 'Review Aspects did not return findings: <name>, <name>. Coverage is reduced for this run.' }
+```
+
+The Notice flows through the same pipeline as Fetcher and Coordinator notices: merged via `mergeNotices`, rendered in the Review Summary (or pre-PR / dry-run preamble), counted in the Trailer's `warning notices` field.
+
+The Review still posts with whatever findings did arrive. No retry. No double-spawn risk.
+
+The `agent-spawn` kind is added to the `NoticeKind` enum in `scripts/ado/notices.mjs`. Per ADR 0014's invariant _"each `kind` has exactly one source agent"_, the source agent is **the orchestrator itself** — it is the only place that knows which agents were launched.
+
+## Alternatives considered
+
+**Retry missing agents sequentially after fan-out.** The orchestrator detects the missing ones and re-spawns them in a second pass. Rejected: the harness may already have launched them server-side (the dry-run example demonstrates the harness recovers transparently). Re-spawning risks double inferences, double findings, and confusing the user with duplicate output. No idempotency guarantee.
+
+**Catch the socket error itself and retry the whole fan-out.** Guaranteed double-spawn of agents that did succeed — explicitly worse than the previous option.
+
+**Add a feature-flagged single retry behind a configuration option.** Postponed. ADR 0015's clause _"Re-evaluate if 5xx Notices prove painful in practice"_ applies equivalently here. If `agent-spawn` Notices become a frequent reality, this ADR can be revisited with the same opt-in approach.
+
+**Treat partial fan-out as ABORTED.** Rejected: one missing aspect (e.g. `comment-analyzer`) is not run-corrupting. The Notice tells the user; they can rerun if needed. ABORTED is reserved for failures that would corrupt cross-run state.
+
+## Consequences
+
+- The orchestrator gains a single post-condition check after Step 6 — a few lines of bash + a `createNotice` call. No new helper, no new test infrastructure beyond extending the existing `tests/notices.test.mjs` cases.
+- The user always sees a documented status: either findings (success), findings + `agent-spawn` Notice (partial), or aborted Trailer (full failure upstream).
+- Future failure modes affecting fan-out (e.g. a `pr-review-toolkit` upgrade introducing a non-JSON response) become observable via the same Notice, without a new failure-classification path.
+- ADR 0015's no-retry doctrine is documented as load-bearing across two layers (HTTP and agent-spawn). Its status line gains `, amended by 0018` and its "No retries in v1" section closes with a cross-reference to this ADR.
+
+## See also
+
+- ADR 0014 — Notice Tier doctrine (`agent-spawn` is a new `kind` in the enum)
+- ADR 0015 — Canonical HTTP-tier mapping (amended)
+- Spec 15 — Implementation of this ADR