fix: address 24 open issues — security, concurrency, semantics, docs by nficano · Pull Request #66 · agentruntimecontrolprotocol/swift-sdk

nficano · 2026-05-25T01:50:02Z

Summary

Batch fix for 24 open issues spanning security, concurrency, protocol semantics, and documentation. Builds clean, all 137 tests (112 existing + 25 new) pass locally on macOS / Swift 6.3.

Security

Bind the challenge nonce in JWTAuthValidator so signed JWTs cannot be replayed #58 JWT challenge binding — JWTAuthValidator now requires a nonce claim matching the runtime-issued challenge so captured tokens can't be replayed across challenges.
Reject model.use child leases that are not true wildcard subsets #53 ModelUse.subsetViolation reworked with a real glob-inclusion check (parent wildcards with required suffix segments can no longer be escaped, e.g. openai:gpt-*o does NOT cover openai:gpt-5).

Concurrency / lifecycle

Register response waiters before starting timeout races #42 PendingRegistry and ARCPClient.ping use a slot state machine so the timeout vs. waiter race can't leave dangling continuations.
Fail pending invocations when the client transport closes #41 ARCPClient.finishUnhandled() fails every pending invocation / finishes progress + result streams when the transport closes. invoke cleans up state if transport.send throws.
Fix race where ARCPRuntime.register can skip a session in mid-handshake #61 ARCPRuntime.register race closed by inserting the jobManagers entry before iterating registeredHandlers.
Prevent subscription events from recursively routing themselves #40 SubscriptionManager.route ignores subscribe.event envelopes to prevent recursive cascade on empty filters.
Remove session-owned subscriptions when a session ends #54 Subscriptions track owner SessionId; runtime cleans them up on session close.
StreamManager.subscribeInbound silently overwrites an existing subscriber for the same stream id #62 StreamManager.subscribeInbound rejects duplicate subscriptions on the same stream id with failedPrecondition.

Client dispatch

Forward unmatched job progress envelopes to ARCPClient.unhandled #55 / Stop double-delivering job.result_chunk envelopes to both the typed stream and unhandled #59 / Forward unmatched tool.error envelopes to ARCPClient.unhandled instead of dropping them #60 Unmatched job.progress, job.result_chunk, and tool.error envelopes fall through to client.unhandled instead of being silently dropped or double-delivered.

Runtime semantics

Implement idempotency key semantics for tool invocation retries #43 Idempotency keys persist the terminal job.completed/failed/cancelled payload (scoped by principal + key) and replay it on duplicate invokes without re-running the handler.
Propagate trace context into envelopes #45 Envelope.init reads Tracing.current for default trace fields; JobManager wraps invocation handling under the inbound trace context.
Start retention sweeps or stop documenting them as automatic #47 Artifact retention sweep starts in ARCPRuntime.init; lease expiry sweep starts in JobManager.init.
Avoid silently discarding extra credentials returned during CredentialManager.rotate #63 CredentialManager.rotate keeps every credential returned by the provisioner instead of silently discarding extras.
Preserve ULID monotonicity when the system clock moves backward #56 Ulid.next preserves strict monotonicity across equal-and-backward clocks and tail overflow.
Replace Array.removeFirst in Mailbox with an O(1) queue #49 Mailbox switched to a head-index queue with periodic compaction (O(1) amortized drain).
Validate lease durations before grant and refresh #48 Lease grant/refresh reject non-positive durations with invalidArgument; handleLeaseRefresh surfaces errors as nack instead of swallowing them.

Docs

Make resume work across reconnected sessions or narrow the documented contract #44 Resume guide / CONFORMANCE.md narrowed to same-session replay only (no cross-session, no checkpoint, no open-stream re-emission).
Do not advertise interrupt support until handlers can observe interrupts #46 Interrupt support documented as wire-acked only — no handler observation callback. Capabilities.interrupt should stay false.
Implement heartbeat acknowledgement tracking or narrow the advertised support #57 Heartbeat recovery documented as telemetry-only (no inbound ack tracking).
Bring public API documentation up to DocC completeness #50 DocC comments added across the most user-facing public surface (Envelope, Capabilities, ARCPClient, JobContext, ToolHandler, ResultChunkStream, AuthScheme, JWTAuth, BearerAuth, etc).
Refresh stale README, conformance, and guide claims #51 Stale claims pruned (resume guide, conformance tables, jobs guide).

Tests / coverage

Raise line coverage above the documented 80 percent threshold #52 New unit + integration suites for the fixes and previously untested files: Mailbox, LeaseManager, PendingRegistry, JWTAuth (was 0% covered), client dispatch fallback, subscription cleanup, idempotency replay, ULID burst monotonicity.
137 tests across 32 suites, all green locally.

Closes

#40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63

Test plan

swift build — clean
swift test — 137 passed locally
CI green on PR

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- JWT challenge nonce binding validation for enhanced security.
- Idempotency support for tool invocations; duplicate calls replay cached results.
- Session-scoped subscription cleanup on disconnect.
- Automatic trace context propagation from ambient tracing.
Bug Fixes
- Enforced strict ULID monotonicity within millisecond bursts.
- Improved wildcard pattern matching for model capability verification.
- Added lease duration validation (positive values required).
Documentation
- Clarified job interrupt behavior (acknowledgment only, no handler callback).
- Specified subscription resume as same-session only.
- Updated conformance and capability documentation.

Fixes a batch of correctness and security gaps in the runtime/client and narrows several over-advertised features in the docs: Security / soundness - JWT challenge binding: JWTAuthValidator now requires a matching `nonce` claim when the runtime issues a challenge (#58). - ModelUse subset: rework `subsetViolation(of:)` to use a real glob-inclusion check so wildcard parents with suffix segments can't be escaped (#53). - Lease durations validated at grant/refresh; `handleLeaseRefresh` surfaces errors instead of swallowing them (#48). Concurrency - PendingRegistry and `ARCPClient.ping` use a slot state machine so the waiter is registered before any timeout race (#42). - Pending invocations are failed and progress/result streams are finished when the transport closes; invoke's send-error path cleans up state (#41). - `ARCPRuntime.register` race window closed by inserting `jobManagers` entry before iterating `registeredHandlers` (#61). - Subscriptions track owner session id; cleanup runs when the session ends (#54). `subscribe.event` envelopes are skipped during route(), preventing recursive cascade on empty filters (#40). - `StreamManager.subscribeInbound` refuses a second subscriber for the same stream id with `failedPrecondition` (#62). Client dispatch - Unmatched `job.progress`, `job.result_chunk`, and `tool.error` envelopes fall through to `unhandled` instead of being silently dropped / double-delivered (#55, #59, #60). Runtime semantics - Idempotency keys persist the terminal `job.completed/failed/cancelled` payload (scoped by principal + key) and replay it on duplicate invokes without re-running the handler (#43). - Trace fields default to `Tracing.current`; `JobManager` propagates the inbound trace context across job lifecycle and handler emissions (#45). - Artifact retention sweep starts in `ARCPRuntime.init`; lease expiry sweep starts in `JobManager.init` (#47). - `CredentialManager.rotate` keeps every credential returned by the provisioner instead of silently discarding extras (#63). - ULID generator preserves strict monotonicity across equal/backward clocks and overflow (#56). Mailbox - O(1) drain via head index with periodic compaction (#49). Docs - Resume guide / CONFORMANCE narrowed to same-session replay only (#44). - Interrupt support documented as wire-acked only — no handler observation callback (#46). - Heartbeat recovery documented as telemetry-only (#57). - DocC comments added across the most user-facing public surface (#50). Tests - New unit / integration suites cover the fixes and previously untested files: Mailbox, LeaseManager, PendingRegistry, JWTAuth, client dispatch fallback, subscription cleanup, idempotency replay, ULID burst (#52). Closes #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-25T01:50:14Z

Walkthrough

This PR advances the ARCP v1.1 Swift SDK with multi-layered improvements: JWT challenge nonce binding for replay prevention, ULID/queue/registry infrastructure refactors, stream single-subscriber enforcement, runtime initialization with idempotency caching and tracing propagation, client ping/pong state machine, and enhanced unhandled envelope routing. Includes comprehensive test coverage and specification clarifications.

Changes

ARCP v1.1 Runtime Enhancements

Layer / File(s)	Summary
Authentication and Specification `Sources/ARCP/Auth/JWTAuth.swift`, `Tests/ARCPTests/JWTAuthTests.swift`, `Sources/ARCP/Auth/AuthScheme.swift`, `Sources/ARCP/Auth/BearerAuth.swift`, `Sources/ARCP/Messages/Common.swift`, `CONFORMANCE.md`, `docs/guides/jobs.md`, `docs/guides/resume.md`, `docs/conformance.md`	`JWTAuthValidator.validate()` now enforces matching challenge nonce in JWT payload when a runtime nonce is provided, throwing `ARCPError.unauthenticated` on mismatch or absence. `ARCPClaims` adds optional `nonce` and `jti` fields. Comprehensive specification updates clarify interrupt (wire-ack only, no handler callback), heartbeat (telemetry-only, no ack tracking), and resume (same-session only) behavior.
Infrastructure Refactors `Sources/ARCP/Ids/Ulid.swift`, `Sources/ARCP/Runtime/Mailbox.swift`, `Sources/ARCP/Runtime/PendingRegistry.swift`, `Tests/ARCPTests/IdsTests.swift`, `Tests/ARCPTests/MailboxTests.swift`, `Tests/ARCPTests/PendingRegistryTests.swift`	ULID `MonotonicGenerator.next()` now enforces strict monotonicity across clock backward and overflow scenarios by tracking tail increments and advancing timestamp on full-tail overflow. `Mailbox` replaces `removeFirst()` with head-index dequeue plus periodic compaction for O(1)-amortized performance. `PendingRegistry` refactored to slot-state model (`pending`/`waiting`/`readyValue`/`readyError`) for race-safe suspend/resolve with timeout.
Envelope and Tracing `Sources/ARCP/Envelope/Envelope.swift`, `Sources/ARCP/Client/ResultChunkStream.swift`	`Envelope.init` now inherits `traceId`, `spanId`, `parentSpanId` from ambient `Tracing.current` when available, propagating distributed traces without explicit parameters. Added comprehensive field documentation. `ResultChunkStream` documentation expanded.
Streams and Subscriptions `Sources/ARCP/Runtime/StreamManager.swift`, `Sources/ARCP/Runtime/SubscriptionManager.swift`	`StreamManager.subscribeInbound()` now throws `ARCPError.failedPrecondition` on duplicate subscribe, enforcing single-subscriber semantics. `SubscriptionManager` adds `ownerSessionId` tracking and new `removeAllOwned(by:reason:)` API for cleanup; `route()` ignores `.subscribeEvent` payloads to prevent cascades.
Runtime Initialization `Sources/ARCP/Runtime/ARCPRuntime.swift`, `Sources/ARCP/Runtime/JobManager.swift`, `Sources/ARCP/Runtime/LeaseManager.swift`, `Sources/ARCP/Runtime/CredentialManager.swift`, `Sources/ARCP/Runtime/JobContext.swift`, `Sources/ARCP/Runtime/ToolHandler.swift`, `Tests/ARCPTests/LeaseManagerTests.swift`	`ARCPRuntime` now unconditionally starts artifact retention sweep on init and exposes `stop()` for shutdown; inserts `JobManager` into `jobManagers` before registering handlers; cleans up session-owned subscriptions on session shutdown. `JobManager` accepts `eventLog` and `principalSubject` for idempotency tracking. `LeaseManager.grant()` and `refresh()` validate positive seconds, throwing `ARCPError.invalidArgument` for nonpositive values. `CredentialManager.rotate()` now treats provisioner output as full replacement set, revoking old credential and persisting all newly issued credentials.
Job Execution with Idempotency `Sources/ARCP/Runtime/JobManager.swift`, `Tests/ARCPTests/Integration/IdempotencyTests.swift`	`JobManager.handleToolInvoke()` extracts `TraceContext` from envelope and runs handler within `Tracing.$current`. Terminal responses cached by `(principal, idempotencyKey)` are replayed for duplicate invocations (handler executes once). `runJob()` refactored to compute terminal state, persist idempotency before emission, send envelope, then transition state. Handler cancellation maps to `.jobCancelled`; `ARCPError` maps to `.jobFailed`.
Client Dispatch and Unhandled Routing `Sources/ARCP/Client/ARCPClient.swift`, `Tests/ARCPTests/Integration/ClientDispatchTests.swift`	`ARCPClient` ping/pong refactored from continuation-per-MessageId into `PongSlot` state machine (`pending`/`waiting`/`readyValue`/`readyError`) for race-safe timeout and completion. Dispatch now routes unmatched `jobProgress`, `jobResultChunk`, `toolError` to `client.unhandled` instead of silently dropping. `finishUnhandled()` propagates transport-closed error to all pending pong slots, invocations, and result streams. `invoke()` now handles send errors by cleaning up pending invocation state.
Model Pattern Matching `Sources/ARCP/Messages/ModelUse.swift`, `Tests/ARCPTests/ModelUseTests.swift`	`ModelUse.subsetViolation()` replaces simple prefix check with conservative `patternCovers(parent:child:)` algorithm that decomposes wildcard patterns into segments and enforces prefix/suffix/middle-segment constraints for soundness.
Subscription Cleanup Integration `Tests/ARCPTests/Integration/SubscriptionCleanupTests.swift`	New integration tests verify no cascading `.subscribeEvent` payloads on empty-filter subscription and validate `removeAllOwned()` correctly removes only session-owned subscriptions while preserving foreign ones.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

Implement idempotency key semantics for tool invocation retries #43 — Idempotency replay caching by principal+idempotency key in JobManager directly addresses runtime idempotency behavior.
Bind the challenge nonce in JWTAuthValidator so signed JWTs cannot be replayed #58 — JWT nonce claim enforcement in JWTAuthValidator.validate(auth:challenge:) directly implements challenge nonce binding.
Validate lease durations before grant and refresh #48 — LeaseManager validation for positive durations directly addresses lease duration enforcement requirement.
Replace Array.removeFirst in Mailbox with an O(1) queue #49 — Mailbox head-index O(1)-amortized dequeue directly implements the requested queue optimization.
StreamManager.subscribeInbound silently overwrites an existing subscriber for the same stream id #62 — StreamManager.subscribeInbound() single-subscriber enforcement directly fixes the continuation-overwrite bug.
Avoid silently discarding extra credentials returned during CredentialManager.rotate #63 — CredentialManager.rotate() multi-credential handling directly addresses dropped extra credentials during rotation.
Preserve ULID monotonicity when the system clock moves backward #56 — ULID MonotonicGenerator clock-backward and overflow handling directly address strict monotonicity requirements.
Stop double-delivering job.result_chunk envelopes to both the typed stream and unhandled #59 — ClientDispatchTests suite for job.result_chunk unhandled routing directly tests unmatched envelope fallback.
Reject model.use child leases that are not true wildcard subsets #53 — ModelUse.subsetViolation() wildcard suffix algorithm directly fixes the wildcard subset soundness bug.
Forward unmatched tool.error envelopes to ARCPClient.unhandled instead of dropping them #60 — Client dispatch routing of unmatched tool.error to unhandled directly addresses tool.error fallback.
Forward unmatched job progress envelopes to ARCPClient.unhandled #55 — Client dispatch routing of unmatched job.progress to unhandled directly implements job.progress fallback.
Remove session-owned subscriptions when a session ends #54 — SubscriptionManager ownership tracking and ARCPRuntime session-owned cleanup directly address missing subscription removal on session shutdown.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 43.30% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix: address 24 open issues — security, concurrency, semantics, docs' accurately summarizes the main objective of the PR, which is to fix multiple issues across several categories.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/open-issues-batch-2026-05-24

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

docs/guides/resume.md (1)

82-83: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update ResumePayload table to match current SDK behavior.

These rows conflict with Line 22-25. The table currently implies support that the scope section explicitly says is not implemented/ignored.

Suggested doc fix

-| `checkpointId` | `String?` | Reserved for checkpoint-based resume (deferred to v0.2) |
-| `includeOpenStreams` | `Bool` | If true, re-emit `stream.open` for streams still alive |
+| `checkpointId` | `String?` | Not implemented in this SDK release (`ARCPError.unimplemented`) |
+| `includeOpenStreams` | `Bool` | Currently ignored; open streams are not re-emitted |

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/resume.md` around lines 82 - 83, The ResumePayload table is
inconsistent with the SDK scope (which says these fields are not
implemented/ignored); update the ResumePayload table to reflect current behavior
by removing or marking the `checkpointId` and `includeOpenStreams` rows as "Not
implemented / Ignored (reserved for future release)" (or explicitly state
deferred to v0.2 for `checkpointId`), ensuring the table entries for
`checkpointId` and `includeOpenStreams` match the text in the scope section and
no longer imply active support.

🧹 Nitpick comments (1)

docs/guides/resume.md (1)
68-68: 💤 Low value

Use includeOpenStreams: false in sample to avoid implying current support.

Since this release documents the field as ignored, setting it to false keeps the example aligned with actual behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/resume.md` at line 68, Update the example ResumePayload call to
reflect current behavior by setting includeOpenStreams to false: change the
ResumePayload(...) sample that uses ResumePayload(afterMessageId: lastSeen,
includeOpenStreams: true) so it passes includeOpenStreams: false instead,
ensuring the example matches the documented ignored behavior of the field.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/guides/resume.md`:
- Around line 12-14: The two sentences conflict: reconcile and make a single
clear statement that resume only succeeds when the original session identifier
survives reconnect (session_id must be the same) and that replay is performed
from that surviving session's event log starting after after_message_id; replace
the phrasing that says "new session" with explicit language describing the two
possible behaviors (either reconnect to same session_id and replay from that
session's log after after_message_id, or if a new session_id is allocated then
resume is not possible) and update all occurrences referencing session_id and
after_message_id to use this unified wording.

In `@Sources/ARCP/Client/ARCPClient.swift`:
- Around line 325-333: Pending ping slots are being dropped or left pending
after transport closure so pings can wait for timeout; update the logic to fail
them immediately: in finishUnhandled()/the loop that iterates pendingPongs,
treat the .pending case the same as .waiting by resuming the awaiting
continuation with closedError (use the existing closedError symbol), and in
awaitPong() change the branch that currently treats .none like .pending to
instead fail immediately with closedError (so .none/.pending no longer wait for
timeout). Ensure you update handling for the enum cases .waiting, .pending,
.readyValue, .readyError and keep using pendingPongs and closedError symbols.

In `@Sources/ARCP/Messages/Common.swift`:
- Around line 38-40: The documented comment says heartbeat ack tracking isn't
implemented but Capabilities.init currently defaults heartbeatRecovery to .fail
and thus advertises recovery support; modify the initialization in
Capabilities.init to use a non-recovery default (e.g., .none or an explicit
"unsupported" variant) or add logic to prevent advertising heartbeat recovery
until ack tracking exists by changing the default value of the heartbeatRecovery
property or gating where Capabilities is serialized/advertised (refer to the
HeartbeatRecovery enum and Capabilities.init to locate the code).

In `@Sources/ARCP/Messages/ModelUse.swift`:
- Around line 34-40: The function patternCovers currently treats an expanded
child pattern as covered by a literal parent (e.g., parent "abc" vs child
"abc*abc"); to fix, add a guard that when parent contains no '*' but child does,
patternCovers returns false (i.e., reject wildcard children for literal-only
parents) before falling back to matches(pattern:value:). Update patternCovers to
check parent.contains("*") and child.contains("*") (or explicitly: if
!parent.contains("*") && child.contains("*") { return false }) so
subsetViolation(of:) cannot miss capability expansions.

In `@Sources/ARCP/Runtime/CredentialManager.swift`:
- Around line 67-103: In rotate(jobId:credentialId:) do not append the
provisioner response to the prior tracked set — treat the
provisioner.issue(lease:jobId:sessionId:) result as the authoritative full set:
compute newSet = next, determine removedIds = existing.ids minus newSet.ids,
optionally call await revokeWithRetry(id) for each removedId if you want to
revoke dropped credentials, then replace credentialsByJob[jobId] = newSet and
call try await retention.persistOutstanding(newSet.map(\.id), jobId: jobId);
this ensures omitted credentials are not kept as stale tracked state.

In `@Sources/ARCP/Runtime/PendingRegistry.swift`:
- Around line 24-26: The method awaitResponse currently unconditionally does
slots[id] = .pending which can overwrite an existing .waiting and orphan its
continuation; update awaitResponse to first check slots[id] for an existing
value and if a .waiting entry (or any non-nil) exists, prevent registration by
returning/throwing a descriptive error (e.g. DuplicateAwaitError) instead of
overwriting, otherwise insert the new .pending entry and proceed to create the
timeout Task; reference the awaitResponse function, the slots dictionary,
MessageId key, and the .pending/.waiting slot cases when making the check and
error handling so the original continuation is never lost.

In `@Tests/ARCPTests/Integration/IdempotencyTests.swift`:
- Line 10: Reset the shared static counter on CountingTool before each test
invocation to avoid state leakage between in-process runs: locate the test cases
that create IntegrationFixture(handler: CountingTool()) in
IdempotencyTests.swift (instances referencing CountingTool and
IntegrationFixture) and set CountingTool.counter = 0 (or the appropriate initial
value) immediately before the invocation sequence/fixture setup in each test
(including the other spots noted around lines 46–47 and 56–63) so each test
starts with a deterministic counter state.

---

Outside diff comments:
In `@docs/guides/resume.md`:
- Around line 82-83: The ResumePayload table is inconsistent with the SDK scope
(which says these fields are not implemented/ignored); update the ResumePayload
table to reflect current behavior by removing or marking the `checkpointId` and
`includeOpenStreams` rows as "Not implemented / Ignored (reserved for future
release)" (or explicitly state deferred to v0.2 for `checkpointId`), ensuring
the table entries for `checkpointId` and `includeOpenStreams` match the text in
the scope section and no longer imply active support.

---

Nitpick comments:
In `@docs/guides/resume.md`:
- Line 68: Update the example ResumePayload call to reflect current behavior by
setting includeOpenStreams to false: change the ResumePayload(...) sample that
uses ResumePayload(afterMessageId: lastSeen, includeOpenStreams: true) so it
passes includeOpenStreams: false instead, ensuring the example matches the
documented ignored behavior of the field.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 1cecc6d7-8cad-4f41-82d7-9b7f920b0122

📥 Commits

Reviewing files that changed from the base of the PR and between 743377f and ef54c28.

📒 Files selected for processing (32)

CONFORMANCE.md
Sources/ARCP/Auth/AuthScheme.swift
Sources/ARCP/Auth/BearerAuth.swift
Sources/ARCP/Auth/JWTAuth.swift
Sources/ARCP/Client/ARCPClient.swift
Sources/ARCP/Client/ResultChunkStream.swift
Sources/ARCP/Envelope/Envelope.swift
Sources/ARCP/Ids/Ulid.swift
Sources/ARCP/Messages/Common.swift
Sources/ARCP/Messages/ModelUse.swift
Sources/ARCP/Runtime/ARCPRuntime.swift
Sources/ARCP/Runtime/CredentialManager.swift
Sources/ARCP/Runtime/JobContext.swift
Sources/ARCP/Runtime/JobManager.swift
Sources/ARCP/Runtime/LeaseManager.swift
Sources/ARCP/Runtime/Mailbox.swift
Sources/ARCP/Runtime/PendingRegistry.swift
Sources/ARCP/Runtime/StreamManager.swift
Sources/ARCP/Runtime/SubscriptionManager.swift
Sources/ARCP/Runtime/ToolHandler.swift
Tests/ARCPTests/IdsTests.swift
Tests/ARCPTests/Integration/ClientDispatchTests.swift
Tests/ARCPTests/Integration/IdempotencyTests.swift
Tests/ARCPTests/Integration/SubscriptionCleanupTests.swift
Tests/ARCPTests/JWTAuthTests.swift
Tests/ARCPTests/LeaseManagerTests.swift
Tests/ARCPTests/MailboxTests.swift
Tests/ARCPTests/ModelUseTests.swift
Tests/ARCPTests/PendingRegistryTests.swift
docs/conformance.md
docs/guides/jobs.md
docs/guides/resume.md

coderabbitai · 2026-05-25T02:00:01Z

+  The runtime accepts the resume request on the *new* session and
+  replays from the new session's event log starting after
+  `after_message_id`.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify contradictory session semantics in resume scope.

Line 12-14 implies replay on a new session log, while Line 19-21 says resume only works if the same session_id survives reconnect. These conflict and will confuse implementers.

Suggested doc fix

-- After a transport drop, you must reconnect over a fresh transport. - The runtime accepts the resume request on the *new* session and - replays from the new session's event log starting after - `after_message_id`. +- After a transport drop, you must reconnect over a fresh transport. + Resume replay is scoped to the active `session_id` on reconnect and + starts after `after_message_id` in that same session's event log.

Also applies to: 19-21

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/resume.md` around lines 12 - 14, The two sentences conflict: reconcile and make a single clear statement that resume only succeeds when the original session identifier survives reconnect (session_id must be the same) and that replay is performed from that surviving session's event log starting after after_message_id; replace the phrasing that says "new session" with explicit language describing the two possible behaviors (either reconnect to same session_id and replay from that session's log after after_message_id, or if a new session_id is allocated then resume is not possible) and update all occurrences referencing session_id and after_message_id to use this unified wording.

coderabbitai · 2026-05-25T02:00:01Z

+        let pongSlots = pendingPongs
        pendingPongs.removeAll()
+        for (_, slot) in pongSlots {
+            switch slot {
+            case .waiting(let cont):
+                cont.resume(throwing: closedError)
+            case .pending, .readyValue, .readyError:
+                break
+            }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail pending ping immediately after transport close.

finishUnhandled() drops .pending pong slots, and awaitPong() currently treats .none like .pending (Line 389), so a ping in that window waits for timeout instead of getting the transport-closed error immediately.

💡 Suggested fix

public actor ARCPClient { + private var isTransportClosed = false @@ private func finishUnhandled() { + isTransportClosed = true unhandledContinuation?.finish() unhandledContinuation = nil @@ private func awaitPong(id: MessageId) async throws -> PongPayload { try await withCheckedThrowingContinuation { (cont: CheckedContinuation<PongPayload, Error>) in switch pendingPongs[id] { case .readyValue(let payload): pendingPongs.removeValue(forKey: id) cont.resume(returning: payload) case .readyError(let err): pendingPongs.removeValue(forKey: id) cont.resume(throwing: err) - case .pending, .none: + case .pending: pendingPongs[id] = .waiting(cont) + case .none: + let err = ARCPError.unavailable(reason: "transport closed", retryAfter: nil) + cont.resume(throwing: err) case .waiting: cont.resume(throwing: ARCPError.internal(detail: "double-attach pong \(id)", cause: nil)) } } } }

Also applies to: 389-391

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/ARCP/Client/ARCPClient.swift` around lines 325 - 333, Pending ping slots are being dropped or left pending after transport closure so pings can wait for timeout; update the logic to fail them immediately: in finishUnhandled()/the loop that iterates pendingPongs, treat the .pending case the same as .waiting by resuming the awaiting continuation with closedError (use the existing closedError symbol), and in awaitPong() change the branch that currently treats .none like .pending to instead fail immediately with closedError (so .none/.pending no longer wait for timeout). Ensure you update handling for the enum cases .waiting, .pending, .readyValue, .readyError and keep using pendingPongs and closedError symbols.

coderabbitai · 2026-05-25T02:00:01Z

+    /// Heartbeat recovery policy (RFC §10.3). **Heartbeat ack tracking is
+    /// not implemented in this release** — heartbeats are telemetry only.
    public var heartbeatRecovery: HeartbeatRecovery


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align heartbeatRecovery default with documented behavior.

These docs correctly state heartbeat recovery is not implemented, but Capabilities.init still defaults heartbeatRecovery to .fail (Line 70). That can over-advertise behavior during negotiation. Consider defaulting to a non-recovery value (or gating advertisement) until ack tracking exists.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/ARCP/Messages/Common.swift` around lines 38 - 40, The documented comment says heartbeat ack tracking isn't implemented but Capabilities.init currently defaults heartbeatRecovery to .fail and thus advertises recovery support; modify the initialization in Capabilities.init to use a non-recovery default (e.g., .none or an explicit "unsupported" variant) or add logic to prevent advertising heartbeat recovery until ack tracking exists by changing the default value of the heartbeatRecovery property or gating where Capabilities is serialized/advertised (refer to the HeartbeatRecovery enum and Capabilities.init to locate the code).

coderabbitai · 2026-05-25T02:00:01Z

+    static func patternCovers(parent: String, child: String) -> Bool {
+        if parent == "*" { return true }
+        if parent == child { return true }
+        // If child has no wildcards, defer to literal matching.
+        if !child.contains("*") {
+            return matches(pattern: parent, value: child)
+        }


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Reject wildcard children when the parent pattern is literal-only.

patternCovers can currently return true for expanded child patterns when parent has no * (e.g., parent = "abc", child = "abc*abc"). That is a false positive and can let subsetViolation(of:) miss a capability expansion.

🔧 Proposed fix

static func patternCovers(parent: String, child: String) -> Bool { if parent == "*" { return true } if parent == child { return true } + if !parent.contains("*") && child.contains("*") { return false } // If child has no wildcards, defer to literal matching. if !child.contains("*") { return matches(pattern: parent, value: child) }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/ARCP/Messages/ModelUse.swift` around lines 34 - 40, The function patternCovers currently treats an expanded child pattern as covered by a literal parent (e.g., parent "abc" vs child "abc*abc"); to fix, add a guard that when parent contains no '*' but child does, patternCovers returns false (i.e., reject wildcard children for literal-only parents) before falling back to matches(pattern:value:). Update patternCovers to check parent.contains("*") and child.contains("*") (or explicitly: if !parent.contains("*") && child.contains("*") { return false }) so subsetViolation(of:) cannot miss capability expansions.

coderabbitai · 2026-05-25T02:00:01Z

+    /// The `CredentialProvisioner` contract is treated as returning the full
+    /// credential set for the lease (a provider may mint multiple paired
+    /// credentials per issuance — e.g. access + refresh token, or several
+    /// vendor credentials in a single transactional issue). The rotation
+    /// flow therefore:
+    ///
+    /// 1. Revokes the credential that is being rotated.
+    /// 2. Removes that credential from the job's tracked set.
+    /// 3. Appends every credential returned by the provisioner to the job's
+    ///    tracked set so the runtime can revoke them later — none are
+    ///    silently discarded.
+    /// 4. Persists the updated outstanding-id list through retention.
+    ///
+    /// Returns the credential identified as the replacement: the entry whose
+    /// `id` matches `credentialId` if the provisioner is performing an
+    /// in-place rotation, otherwise the first newly minted entry.
    public func rotate(jobId: JobId, credentialId: String) async throws -> ProvisionedCredential {
        guard let lease = leaseByJob[jobId] else {
            throw ARCPError.notFound(kind: "job credentials", id: jobId.rawValue)
        }
        let next = try await provisioner.issue(lease: lease, jobId: jobId, sessionId: sessionId)
-        guard let replacement = next.first else {
-            throw ARCPError.unavailable(reason: "credential rotation returned no credential", retryAfter: nil)
+        guard !next.isEmpty else {
+            throw ARCPError.unavailable(
+                reason: "credential rotation returned no credentials",
+                retryAfter: nil
+            )
        }
        var existing = credentialsByJob[jobId] ?? []
-        if let old = existing.first(where: { $0.id == credentialId }) {
-            await revokeWithRetry(old.id)
+        if existing.contains(where: { $0.id == credentialId }) {
+            await revokeWithRetry(credentialId)
            existing.removeAll { $0.id == credentialId }
        }
-        existing.append(replacement)
+        existing.append(contentsOf: next)
        credentialsByJob[jobId] = existing
        try await retention.persistOutstanding(existing.map(\.id), jobId: jobId)
+        let replacement = next.first(where: { $0.id == credentialId }) ?? next[0]
        return replacement


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid retaining stale credentials when rotation returns a full replacement set.

Line 67 says rotation treats provisioner output as the full lease credential set, but Lines 99-101 append next onto prior existing. Credentials omitted from next remain tracked/persisted, which can stale retention state and later revocation behavior. Replace tracked state with the returned set (optionally revoking dropped IDs).

Proposed fix

- var existing = credentialsByJob[jobId] ?? [] - if existing.contains(where: { $0.id == credentialId }) { + let existing = credentialsByJob[jobId] ?? [] + if existing.contains(where: { $0.id == credentialId }) { await revokeWithRetry(credentialId) - existing.removeAll { $0.id == credentialId } } - existing.append(contentsOf: next) - credentialsByJob[jobId] = existing - try await retention.persistOutstanding(existing.map(\.id), jobId: jobId) + // Provisioner response is the authoritative replacement set. + let nextIds = Set(next.map(\.id)) + for stale in existing where !nextIds.contains(stale.id) && stale.id != credentialId { + await revokeWithRetry(stale.id) + } + credentialsByJob[jobId] = next + try await retention.persistOutstanding(next.map(\.id), jobId: jobId) let replacement = next.first(where: { $0.id == credentialId }) ?? next[0]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

/// The `CredentialProvisioner` contract is treated as returning the full

/// credential set for the lease (a provider may mint multiple paired

/// credentials per issuance — e.g. access + refresh token, or several

/// vendor credentials in a single transactional issue). The rotation

/// flow therefore:

///

/// 1. Revokes the credential that is being rotated.

/// 2. Removes that credential from the job's tracked set.

/// 3. Appends every credential returned by the provisioner to the job's

/// tracked set so the runtime can revoke them later — none are

/// silently discarded.

/// 4. Persists the updated outstanding-id list through retention.

///

/// Returns the credential identified as the replacement: the entry whose

/// `id` matches `credentialId` if the provisioner is performing an

/// in-place rotation, otherwise the first newly minted entry.

public func rotate(jobId: JobId, credentialId: String) async throws -> ProvisionedCredential {

guard let lease = leaseByJob[jobId] else {

throw ARCPError.notFound(kind: "job credentials", id: jobId.rawValue)

}

let next = try await provisioner.issue(lease: lease, jobId: jobId, sessionId: sessionId)

guard let replacement = next.first else {

throw ARCPError.unavailable(reason: "credential rotation returned no credential", retryAfter: nil)

guard !next.isEmpty else {

throw ARCPError.unavailable(

reason: "credential rotation returned no credentials",

retryAfter: nil

)

}

var existing = credentialsByJob[jobId] ?? []

if let old = existing.first(where: { $0.id == credentialId }) {

await revokeWithRetry(old.id)

if existing.contains(where: { $0.id == credentialId }) {

await revokeWithRetry(credentialId)

existing.removeAll { $0.id == credentialId }

}

existing.append(replacement)

existing.append(contentsOf: next)

credentialsByJob[jobId] = existing

try await retention.persistOutstanding(existing.map(\.id), jobId: jobId)

let replacement = next.first(where: { $0.id == credentialId }) ?? next[0]

return replacement

/// The `CredentialProvisioner` contract is treated as returning the full

/// credential set for the lease (a provider may mint multiple paired

/// credentials per issuance — e.g. access + refresh token, or several

/// vendor credentials in a single transactional issue). The rotation

/// flow therefore:

///

/// 1. Revokes the credential that is being rotated.

/// 2. Removes that credential from the job's tracked set.

/// 3. Appends every credential returned by the provisioner to the job's

/// tracked set so the runtime can revoke them later — none are

/// silently discarded.

/// 4. Persists the updated outstanding-id list through retention.

///

/// Returns the credential identified as the replacement: the entry whose

/// `id` matches `credentialId` if the provisioner is performing an

/// in-place rotation, otherwise the first newly minted entry.

public func rotate(jobId: JobId, credentialId: String) async throws -> ProvisionedCredential {

guard let lease = leaseByJob[jobId] else {

throw ARCPError.notFound(kind: "job credentials", id: jobId.rawValue)

}

let next = try await provisioner.issue(lease: lease, jobId: jobId, sessionId: sessionId)

guard !next.isEmpty else {

throw ARCPError.unavailable(

reason: "credential rotation returned no credentials",

retryAfter: nil

)

}

let existing = credentialsByJob[jobId] ?? []

if existing.contains(where: { $0.id == credentialId }) {

await revokeWithRetry(credentialId)

}

// Provisioner response is the authoritative replacement set.

let nextIds = Set(next.map(\.id))

for stale in existing where !nextIds.contains(stale.id) && stale.id != credentialId {

await revokeWithRetry(stale.id)

}

credentialsByJob[jobId] = next

try await retention.persistOutstanding(next.map(\.id), jobId: jobId)

let replacement = next.first(where: { $0.id == credentialId }) ?? next[0]

return replacement

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/ARCP/Runtime/CredentialManager.swift` around lines 67 - 103, In rotate(jobId:credentialId:) do not append the provisioner response to the prior tracked set — treat the provisioner.issue(lease:jobId:sessionId:) result as the authoritative full set: compute newSet = next, determine removedIds = existing.ids minus newSet.ids, optionally call await revokeWithRetry(id) for each removedId if you want to revoke dropped credentials, then replace credentialsByJob[jobId] = newSet and call try await retention.persistOutstanding(newSet.map(\.id), jobId: jobId); this ensures omitted credentials are not kept as stale tracked state.

coderabbitai · 2026-05-25T02:00:01Z

    public func awaitResponse(id: MessageId, deadline: Duration) async throws -> Response {
-        try await withThrowingTaskGroup(of: Response.self) { group in
-            group.addTask {
-                try await self.suspend(id: id)
-            }
-            group.addTask { [weak self] in
-                try await Task.sleep(for: deadline)
-                // Drop the waiter without resolving it — that lets the
-                // deadline error win the race deterministically. We don't
-                // want a cancel-resume here because that would race the
-                // resume against this task's throw.
-                await self?.dropForTimeout(id: id)
-                throw ARCPError.deadlineExceeded(operation: "pending response \(id)")
-            }
-            defer { group.cancelAll() }
-            guard let result = try await group.next() else {
-                throw ARCPError.internal(detail: "pending registry empty", cause: nil)
-            }
-            return result
+        slots[id] = .pending
+        let timeoutTask = Task { [weak self] in


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard against duplicate awaitResponse registration for the same id.

At Line 25, unconditional slots[id] = .pending can overwrite an existing .waiting slot. A second awaitResponse for the same MessageId can orphan the first continuation (never resumed).

Suggested fix

public func awaitResponse(id: MessageId, deadline: Duration) async throws -> Response { + if slots[id] != nil { + throw ARCPError.internal(detail: "duplicate pending \(id)", cause: nil) + } slots[id] = .pending let timeoutTask = Task { [weak self] in try? await Task.sleep(for: deadline) if Task.isCancelled { return } await self?.dropForTimeout(id: id) } defer { timeoutTask.cancel() } return try await suspend(id: id) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

public func awaitResponse(id: MessageId, deadline: Duration) async throws -> Response {

try await withThrowingTaskGroup(of: Response.self) { group in

group.addTask {

try await self.suspend(id: id)

}

group.addTask { [weak self] in

try await Task.sleep(for: deadline)

// Drop the waiter without resolving it — that lets the

// deadline error win the race deterministically. We don't

// want a cancel-resume here because that would race the

// resume against this task's throw.

await self?.dropForTimeout(id: id)

throw ARCPError.deadlineExceeded(operation: "pending response \(id)")

}

defer { group.cancelAll() }

guard let result = try await group.next() else {

throw ARCPError.internal(detail: "pending registry empty", cause: nil)

}

return result

slots[id] = .pending

let timeoutTask = Task { [weak self] in

public func awaitResponse(id: MessageId, deadline: Duration) async throws -> Response {

if slots[id] != nil {

throw ARCPError.internal(detail: "duplicate pending \(id)", cause: nil)

}

slots[id] = .pending

let timeoutTask = Task { [weak self] in

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/ARCP/Runtime/PendingRegistry.swift` around lines 24 - 26, The method awaitResponse currently unconditionally does slots[id] = .pending which can overwrite an existing .waiting and orphan its continuation; update awaitResponse to first check slots[id] for an existing value and if a .waiting entry (or any non-nil) exists, prevent registration by returning/throwing a descriptive error (e.g. DuplicateAwaitError) instead of overwriting, otherwise insert the new .pending entry and proceed to create the timeout Task; reference the awaitResponse function, the slots dictionary, MessageId key, and the .pending/.waiting slot cases when making the check and error handling so the original continuation is never lost.

coderabbitai · 2026-05-25T02:00:01Z

+struct IdempotencyTests {
+    @Test("Second invoke with same idempotency key returns the cached terminal")
+    func duplicateInvocationReplays() async throws {
+        let fixture = IntegrationFixture(handler: CountingTool())


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reset shared counter to keep this test deterministic.

CountingTool.counter is static, so state can leak between in-process executions. Reset it before the invocation sequence.

💡 Suggested fix

`@Suite`("Idempotency semantics (issue `#43`)") struct IdempotencyTests { `@Test`("Second invoke with same idempotency key returns the cached terminal") func duplicateInvocationReplays() async throws { + await CountingTool.counter.reset() let fixture = IntegrationFixture(handler: CountingTool()) let open = try await fixture.open() defer { open.close() } @@ private actor AtomicCounter { private(set) var value: Int = 0 + func reset() { value = 0 } `@discardableResult` func increment() -> Int { value += 1 return value } }

Also applies to: 46-47, 56-63

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Tests/ARCPTests/Integration/IdempotencyTests.swift` at line 10, Reset the shared static counter on CountingTool before each test invocation to avoid state leakage between in-process runs: locate the test cases that create IntegrationFixture(handler: CountingTool()) in IdempotencyTests.swift (instances referencing CountingTool and IntegrationFixture) and set CountingTool.counter = 0 (or the appropriate initial value) immediately before the invocation sequence/fixture setup in each test (including the other spots noted around lines 46–47 and 56–63) so each test starts with a deterministic counter state.

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: address 24 open issues — security, concurrency, semantics, docs#66

fix: address 24 open issues — security, concurrency, semantics, docs#66
nficano wants to merge 1 commit into
mainfrom
fix/open-issues-batch-2026-05-24

nficano commented May 25, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 25, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 25, 2026

Uh oh!

coderabbitai Bot May 25, 2026

Uh oh!

coderabbitai Bot May 25, 2026

Uh oh!

coderabbitai Bot May 25, 2026

Uh oh!

coderabbitai Bot May 25, 2026

Uh oh!

coderabbitai Bot May 25, 2026

Uh oh!

coderabbitai Bot May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nficano commented May 25, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Security

Concurrency / lifecycle

Client dispatch

Runtime semantics

Docs

Tests / coverage

Closes

Test plan

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

nficano commented May 25, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 25, 2026 •

edited

Loading