Skip to content

Commit ac6af28

Browse files
nficanoclaude
andcommitted
fix: address open audit issues #36-#58
Spans client correlation/cancellation, runtime handshake/resume/heartbeat supervision, idempotency replay, result stream sizing/units, eventlog streaming/serialization, stdio transport, CLI/conformance/guides docs alignment, and test determinism + branch coverage. Closes #36 Closes #37 Closes #38 Closes #39 Closes #40 Closes #41 Closes #42 Closes #43 Closes #44 Closes #45 Closes #46 Closes #47 Closes #48 Closes #49 Closes #50 Closes #51 Closes #52 Closes #53 Closes #54 Closes #55 Closes #56 Closes #57 Closes #58 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 98897be commit ac6af28

33 files changed

Lines changed: 1186 additions & 328 deletions

docs/cli.md

Lines changed: 22 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -31,22 +31,23 @@ arcp serve --token demo-token --principal cli-user
3131

3232
### `arcp submit`
3333

34-
Submit a job to a running runtime and print the result.
34+
Submit a single job and print the terminal `job.result` JSON.
3535

3636
```bash
37-
arcp submit --url ws://localhost:7777/arcp --agent echo \
38-
--input '{"url":"https://example.com"}' --token my-bearer-token
37+
arcp submit --url ws://localhost:7777/arcp --token my-bearer-token \
38+
--agent echo --input '{"url":"https://example.com"}' \
39+
--lease '{"net.fetch":["https://*"]}'
3940
```
4041

4142
**Arguments:**
4243

43-
| Argument | Description |
44-
|---|---|
45-
| `--url` | WebSocket URL of the runtime |
46-
| `--agent` | Agent name |
47-
| `--input` | JSON-encoded input |
48-
| `--token` | Bearer token |
49-
| `--lease` | Lease JSON object |
44+
| Argument | Default | Description |
45+
|---|---|---|
46+
| `--url` | required | WebSocket URL of the runtime (e.g. `ws://host:7777/arcp`) |
47+
| `--token` | required | Bearer token |
48+
| `--agent` | required | Agent name (optionally `name@version`) |
49+
| `--input` | `null` | JSON-encoded input passed to the agent |
50+
| `--lease` | `{}` | Lease as JSON object (e.g. `{"net.fetch":["https://*"]}`) |
5051

5152
### `arcp tail`
5253

@@ -60,10 +61,18 @@ Prints each `JobEvent` as a JSON line. Press `Ctrl-C` to stop.
6061

6162
### `arcp replay`
6263

63-
Replay a recorded event stream from a JSONL file.
64+
Replay a recorded event stream from a SQLite event log.
6465

6566
```bash
66-
arcp replay --db events.sqlite --session SESSION_ID
67+
arcp replay --db events.sqlite --session SESSION_ID --after-seq 0
6768
```
6869

69-
Useful for debugging: record a live stream from `tail` and replay it from the SQLite event log.
70+
**Arguments:**
71+
72+
| Argument | Default | Description |
73+
|---|---|---|
74+
| `--db` | required | SQLite event log path |
75+
| `--session` | required | Session id to replay |
76+
| `--after-seq` | `0` | Skip envelopes whose `event_seq` is &lt;= this value |
77+
78+
Each line is one envelope as JSON. Useful for debugging an event stream recorded with `arcp serve --db events.sqlite` after the fact.

docs/conformance.md

Lines changed: 60 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -1,57 +1,78 @@
11
# Conformance
22

3-
This page records which sections of the [ARCP v1.1 specification](https://arcp.dev/spec/v1.1) are implemented by the Python SDK, and points to the source that implements each requirement.
3+
This page records which sections of the [ARCP v1.1 specification](https://arcp.dev/spec/v1.1) are implemented by the Python SDK, and points to the source that implements each requirement. Every source path listed below exists in this repository under `src/arcp/`.
44

55
## Conformance matrix
66

77
| Spec section | Title | Status | Source |
88
|---|---|---|---|
9-
| §4 | Versioning | ✅ Full | `src/arcp/_version.py` |
10-
| §5 | Transport framing | ✅ Full | `src/arcp/_transport/` |
11-
| §6 | Sessions | ✅ Full | `src/arcp/_runtime/session.py` |
12-
| §6.1 | Authentication — Bearer | ✅ Full | `src/arcp/_auth/bearer.py` |
13-
| §6.1 | Authentication — custom verifier | ✅ Full | `src/arcp/_auth/jwt.py` |
14-
| §6.2 | Agent versions | ✅ Full | `src/arcp/_runtime/server.py` |
15-
| §6.3 | Stream resume | ✅ Full | `src/arcp/_runtime/session.py` |
16-
| §7 | Jobs | ✅ Full | `src/arcp/_runtime/_handlers.py` |
17-
| §7.1 | Idempotency keys | ✅ Full | `src/arcp/_store/idempotency.py` |
18-
| §7.2 | Job cancellation | ✅ Full | `src/arcp/_runtime/_handlers.py` |
19-
| §8 | Job events | ✅ Full | `src/arcp/_envelope.py` |
20-
| §8.1 | `job.queued` | ✅ Full | `src/arcp/_envelope.py` |
21-
| §8.2 | `job.started` | ✅ Full | `src/arcp/_envelope.py` |
22-
| §8.3 | `job.log` | ✅ Full | `src/arcp/_runtime/_handlers.py` |
23-
| §8.4 | `job.progress` | ✅ Full | `src/arcp/_runtime/_handlers.py` |
24-
| §8.5 | `job.result_chunk` | ✅ Full | `src/arcp/_runtime/_handlers.py` |
25-
| §8.6 | `job.completed` | ✅ Full | `src/arcp/_runtime/_handlers.py` |
26-
| §8.7 | `job.failed` | ✅ Full | `src/arcp/_runtime/_handlers.py` |
27-
| §8.8 | `job.cancelled` | ✅ Full | `src/arcp/_runtime/_handlers.py` |
28-
| §8.9 | `job.heartbeat` | ✅ Full | `src/arcp/_runtime/_handlers.py` |
29-
| §9 | Leases | ✅ Full | `src/arcp/_runtime/lease.py` |
30-
| §9.1 | Cost budgets | ✅ Full | `src/arcp/_runtime/lease.py` |
31-
| §9.2 | Time budgets (`expires_in_s`) | ✅ Full | `src/arcp/_runtime/lease.py` |
32-
| §9.3 | `expires_at` (absolute timestamp) | ✅ Full | `src/arcp/_runtime/lease.py` |
33-
| §10 | Delegation | ✅ Full | `src/arcp/_runtime/_handlers.py` |
34-
| §11 | Observability | ✅ Full | `src/arcp/middleware/otel.py` |
35-
| §12 | Errors | ✅ Full | `src/arcp/_errors.py` |
36-
| §13 | Capability negotiation | ✅ Full | `src/arcp/_runtime/server.py` |
37-
| §14 | List jobs | ✅ Full | `src/arcp/_runtime/_handler_list_jobs.py` |
38-
| §15 | Vendor extensions | ✅ Full | `src/arcp/_extensions.py` |
9+
| §4 | Versioning / protocol constant | Full | `src/arcp/_version.py`, `src/arcp/_envelope.py` |
10+
| §5.1 | Envelope shape (8 fields) | Full | `src/arcp/_envelope.py` |
11+
| §5.2 | Wire framing (WebSocket / stdio / memory) | Full | `src/arcp/_transport/` |
12+
| §6.1 | `session.hello` / `session.welcome` | Full | `src/arcp/_runtime/_handshake.py` |
13+
| §6.1 | Authentication — bearer (static + custom) | Full | `src/arcp/_auth/bearer.py` |
14+
| §6.1 | Authentication — JWT / JWKS | Full | `src/arcp/_auth/jwt.py` |
15+
| §6.2 | Capability / feature negotiation | Full | `src/arcp/_runtime/_handshake.py`, `src/arcp/_version.py` |
16+
| §6.3 | Session resume (`hello.resume`) | Full | `src/arcp/_runtime/_handshake.py`, `src/arcp/_store/eventlog.py` |
17+
| §6.4 | Heartbeat / liveness | Full | `src/arcp/_runtime/session.py` (`heartbeat_loop`) |
18+
| §6.5 | Ack backpressure | Full | `src/arcp/_runtime/_handlers.py` (`handle_ack`) |
19+
| §6.6 | `session.bye` / orderly close | Full | `src/arcp/_runtime/_handlers.py` (`handle_bye`) |
20+
| §7.1 | `job.submit` / `job.accepted` | Full | `src/arcp/_runtime/_handlers.py` (`handle_submit`) |
21+
| §7.2 | Idempotency keys (accept + terminal replay) | Full | `src/arcp/_store/idempotency.py`, `src/arcp/_runtime/_handlers.py` |
22+
| §7.3 | Agent versions (`agent@version` selection) | Full | `src/arcp/_runtime/server.py` (`_resolve_agent`) |
23+
| §7.4 | `job.cancel` (submitter-only) | Full | `src/arcp/_runtime/_handlers.py` (`handle_cancel`) |
24+
| §7.5 | `session.list_jobs` (filter + cursor) | Full | `src/arcp/_runtime/_handler_list_jobs.py` |
25+
| §7.6 | `job.subscribe` / `job.unsubscribe` | Full | `src/arcp/_runtime/_handlers.py` |
26+
| §8.1 | `log`, `thought`, `status` events | Full | `src/arcp/_runtime/job.py` (`JobContext`) |
27+
| §8.2 | `metric`, `tool_call`, `tool_result` | Full | `src/arcp/_runtime/job.py`, `src/arcp/_messages/event_bodies.py` |
28+
| §8.3 | `progress` events | Full | `src/arcp/_runtime/job.py` (`JobContext.progress`) |
29+
| §8.4 | `result_chunk` + streamed `job.result` | Full | `src/arcp/_runtime/result_stream.py` |
30+
| §8.5 | Terminal `job.result` / `job.error` | Full | `src/arcp/_runtime/job.py` (`Job.emit_result`, `Job.emit_error`) |
31+
| §9.1 | Lease shape + glob validation | Full | `src/arcp/_runtime/lease.py` (`validate_lease_shape`) |
32+
| §9.2 | Cost budgets (Decimal arithmetic) | Full | `src/arcp/_runtime/lease.py`, `src/arcp/_runtime/job.py` |
33+
| §9.3 | Lease constraints (`expires_at`, `model.use`) | Full | `src/arcp/_runtime/lease.py` (`validate_lease_constraints`) |
34+
| §9.4 | Lease watchdog (auto-expiry) | Full | `src/arcp/_runtime/_job_runner.py` (`_lease_watchdog`) |
35+
| §9.5 | Sublease shape rules for delegation | Full | `src/arcp/_runtime/lease.py` (`assert_lease_subset`) |
36+
| §10 | Provisioned credentials + revocation | Full | `src/arcp/_runtime/credentials.py`, `src/arcp/_runtime/_handlers.py` |
37+
| §11 | Observability — OTel transport wrapper | Full | `src/arcp/middleware/otel.py` |
38+
| §12 | Typed errors + `session.error` payload | Full | `src/arcp/_errors.py` |
39+
| §13 | Vendor extensions (`x-*` passthrough) | Full | `src/arcp/_extensions.py` |
40+
| §14 | Authorization defaults + policy seam | Full | `src/arcp/_runtime/server.py` (`AuthorizationContext`) |
3941

4042
## Notes
4143

4244
### §6.3 Stream resume
4345

44-
Resume remains session-scoped in the current implementation. Treat the older
45-
`resume_token` submit flow as deferred until the runtime exposes it publicly.
46+
`SessionResume` carries `(session_id, resume_token, last_event_seq)` in a
47+
`session.hello`. The runtime keeps a per-session resume record for
48+
`resume_window_sec` after disconnect, validates the resume token + principal
49+
on rejoin, reuses the original `session_id`, rotates the resume token, and
50+
replays buffered envelopes from the event log past `last_event_seq`. Jobs do
51+
not run across the disconnect boundary; only history is replayed.
4652

47-
### §9.3 `expires_at`
53+
### §6.5 Ack backpressure
4854

49-
`expires_at` accepts an ISO 8601 datetime string (UTC). The runtime converts it to `expires_in_s` for internal tracking.
55+
`session.ack` releases acked prefixes from the in-memory or SQLite event log
56+
synchronously inside the dispatch loop (per-session serialized).
57+
`AutoAckOptions` on the client coalesces acks by event count and interval.
5058

51-
### §10 Delegation
59+
### §7.2 Idempotency
5260

53-
Delegation tokens are signed JWTs. The SDK provides `runtime.create_delegation_token(principal, scopes)` and verifies incoming tokens automatically.
61+
A duplicate `job.submit` with the same `(principal, idempotency_key)`
62+
replays the original `job.accepted` AND, if the original job has reached a
63+
terminal state, also replays the stored `job.result` or `job.error`. This
64+
prevents the duplicate `JobHandle` from hanging on `await handle.done`.
5465

55-
### §15 Vendor extensions
66+
### §10 Provisioned credentials
5667

57-
Any `x-*` key in a submit payload or event payload is passed through without modification. Use `arcp._extensions.get_extension(event, "x-my-field")` to read them safely.
68+
Credentials are issued via a pluggable `CredentialProvisioner`, recorded in
69+
a durable `RevocationLog`, attached to `job.accepted`, and revoked when the
70+
job ends. Credential rotation is exposed via `JobContext.rotate_credential`.
71+
72+
### §13 Vendor extensions
73+
74+
Any `x-vendor.*` key in an envelope payload or event body is passed through
75+
without modification — every pydantic payload model in `arcp._messages.*`
76+
uses `model_config = ConfigDict(extra="allow")`. Validate keys with
77+
`arcp._extensions.validate_extension_key("x-vendor.my-field")` before sending
78+
to ensure forward compatibility.

docs/guides/jobs.md

Lines changed: 34 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -50,35 +50,46 @@ async def my_agent(input, ctx):
5050
await ctx.log("info", "starting")
5151
await ctx.progress(0, total=100)
5252

53-
for i in range(10):
54-
chunk = await do_work(i)
55-
await ctx.result_chunk(chunk)
56-
await ctx.progress((i + 1) * 10, total=100)
57-
await ctx.metric({"name": "cost.inference", "value": 0.001, "unit": "USD"})
53+
async with ctx.stream_result() as stream:
54+
for i in range(10):
55+
await stream.write(await do_work(i))
56+
await ctx.progress((i + 1) * 10, total=100)
57+
await ctx.metric({"name": "cost.inference", "value": 0.001, "unit": "USD"})
5858

5959
return {"done": True}
6060
```
6161

62-
| Method | Description |
62+
| Member | Description |
6363
|---|---|
64-
| `ctx.log(level, message)` | Emit a `job.log` event |
65-
| `ctx.progress(current, total=..., units=..., message=...)` | Emit a `job.progress` event |
66-
| `ctx.result_chunk(chunk)` | Emit a `job.result_chunk` event |
67-
| `ctx.metric(body)` | Emit a `metric` event |
68-
| `ctx.principal` | The authenticated client identity |
69-
| `ctx.job_id` | The current job ID |
70-
| `ctx.session_id` | The current session ID |
64+
| `ctx.log(level, message, attributes=...)` | Emit a `log` event (`level``debug`/`info`/`warn`/`error`) |
65+
| `ctx.progress(current, *, total=..., units=..., message=...)` | Emit a `progress` event (`total` is keyword-only) |
66+
| `ctx.result_chunk(body)` | Emit one `result_chunk` event |
67+
| `ctx.stream_result()` | Open a `ResultStream` writer for chunked results |
68+
| `ctx.metric(body)` | Emit a `metric` event (cost.* automatically decrements budget) |
69+
| `ctx.status(phase, message=...)` | Emit a `status` event |
70+
| `ctx.tool_call(body)` / `ctx.tool_result(body)` | Emit MCP-style tool-call events |
71+
| `ctx.authorize(capability, target)` | Validate a lease op for `capability:target` |
72+
| `ctx.rotate_credential(id, new_value)` | Publish a credential rotation |
73+
| `ctx.budget` | Read-only snapshot of remaining `cost.budget` |
74+
| `ctx.lease` / `ctx.lease_constraints` | The lease attached to this job |
75+
| `ctx.job_id` / `ctx.session_id` / `ctx.trace_id` | Identity helpers |
76+
| `ctx.agent` / `ctx.agent_version` / `ctx.agent_ref` | Agent name + selected version |
7177

7278
## Streaming results
7379

80+
The agent returns chunks through a `ResultStream`; the client consumes them via `JobHandle.chunks()` and joins on `JobHandle.done` for the terminal `job.result`.
81+
7482
```python
7583
handle = await client.submit(agent="stream", input={"n": 5})
76-
async for event in handle.events():
77-
if event.kind == "job.result_chunk":
78-
print(event.chunk) # each chunk as it arrives
79-
elif event.kind == "job.completed":
80-
print("Final:", event.result)
81-
break
84+
85+
# Collect chunks as they arrive (each chunk is a dict from the wire body).
86+
async for chunk in handle.chunks():
87+
print("chunk:", chunk)
88+
89+
# All other event kinds (log, progress, metric, status, …) are surfaced on
90+
# `events()` and the terminal payload is awaited via `done`.
91+
result = await handle.done
92+
print("final:", result.result_size, "bytes")
8293
```
8394

8495
## Cancellation
@@ -87,10 +98,12 @@ async for event in handle.events():
8798
handle = await client.submit(agent="slow", input={})
8899
await asyncio.sleep(1.0)
89100
await client.cancel_job(handle.job_id)
90-
# handle.done raises JobCancelledError
101+
# handle.done resolves to a JobResultPayload with final_status="cancelled".
102+
result = await handle.done
103+
assert result.final_status == "cancelled"
91104
```
92105

93-
Cancellation is cooperative: the runtime sends a cancellation signal and waits for the agent to finish its current operation. Agents can check `ctx.cancelled` to exit early.
106+
Cancellation is cooperative: the runtime cancels the running task, which surfaces in the agent coroutine as `asyncio.CancelledError`. Agents should let it propagate (or catch, clean up, and re-raise).
94107

95108
## Idempotency
96109

docs/recipes/mcp-skill.md

Lines changed: 43 additions & 45 deletions
Original file line numberDiff line numberDiff line change
@@ -24,44 +24,37 @@ uv add arcp mcp
2424

2525
```python
2626
import asyncio
27-
import json
2827
from typing import Any
2928

3029
from mcp import ClientSession, StdioServerParameters
3130
from mcp.client.stdio import stdio_client
3231

33-
from arcp import ARCPRuntime, JobContext
34-
from arcp.auth import StaticBearerVerifier
35-
from arcp.transport import pair_memory_transports
32+
from arcp import (
33+
ARCPClient,
34+
ClientInfo,
35+
RuntimeInfo,
36+
pair_memory_transports,
37+
)
38+
from arcp.runtime import ARCPRuntime, StaticBearerVerifier
39+
from arcp._runtime.job import JobContext
3640

3741

3842
def make_mcp_skill(tool_name: str, server_params: StdioServerParameters):
3943
"""Return an ARCP agent function that proxies *tool_name* on the MCP server."""
4044

41-
async def agent(ctx: JobContext) -> None:
42-
# Collect arguments from the single-item input stream.
43-
args: dict[str, Any] = {}
44-
async for item in ctx.input_stream():
45-
args.update(item)
46-
47-
# Connect to the MCP server and call the tool.
45+
async def agent(arguments: dict[str, Any], ctx: JobContext) -> dict[str, Any]:
4846
async with stdio_client(server_params) as (read, write):
4947
async with ClientSession(read, write) as session:
5048
await session.initialize()
51-
result = await session.call_tool(tool_name, args)
52-
53-
# Emit MCP result content as ARCP result chunks.
54-
for content_block in result.content:
55-
if content_block.type == "text":
56-
await ctx.emit_event(
57-
"result.chunk",
58-
{"text": content_block.text},
59-
)
60-
else:
61-
await ctx.emit_event(
62-
"result.chunk",
63-
{"data": json.loads(content_block.model_dump_json())},
64-
)
49+
result = await session.call_tool(tool_name, arguments)
50+
51+
async with ctx.stream_result() as stream:
52+
for content_block in result.content:
53+
if content_block.type == "text":
54+
await stream.write(content_block.text)
55+
else:
56+
await stream.write(content_block.model_dump_json())
57+
return {"tool": tool_name}
6558

6659
agent.__name__ = f"mcp_{tool_name}"
6760
return agent
@@ -79,33 +72,38 @@ fs_server = StdioServerParameters(
7972
server_transport, client_transport = pair_memory_transports()
8073

8174
runtime = ARCPRuntime(
82-
transport=server_transport,
83-
auth=StaticBearerVerifier("secret"),
75+
runtime=RuntimeInfo(name="mcp-bridge", version="1.0.0"),
76+
bearer=StaticBearerVerifier({"secret": "principal-1"}),
8477
)
8578
runtime.register_agent("read_file", make_mcp_skill("read_file", fs_server))
8679
```
8780

8881
## Client
8982

9083
```python
91-
from arcp import ARCPClient
84+
import asyncio
85+
from arcp import ARCPClient, ClientInfo
9286

9387

9488
async def main() -> None:
95-
async with ARCPClient(client_transport, token="secret") as client:
96-
handle = await client.submit(
97-
agent="read_file",
98-
input=[{"path": "/tmp/hello.txt"}],
99-
)
89+
asyncio.create_task(runtime.accept(server_transport))
90+
client = ARCPClient(
91+
client=ClientInfo(name="mcp-caller", version="1.0.0"),
92+
token="secret",
93+
)
94+
await client.connect(client_transport)
95+
handle = await client.submit(
96+
agent="read_file",
97+
input={"path": "/tmp/hello.txt"},
98+
)
99+
async for chunk in handle.chunks():
100+
# `chunk` is the result_chunk wire body; `data` is the decoded text
101+
# or base64 bytes (per `encoding`).
102+
print(chunk.get("data"))
103+
await handle.done
104+
await client.close()
100105

101-
async for event in handle.events():
102-
if event.kind == "result.chunk":
103-
print(event.data.get("text", event.data))
104106

105-
await handle.done
106-
107-
108-
import asyncio
109107
asyncio.run(main())
110108
```
111109

@@ -121,18 +119,18 @@ for tool in TOOLS:
121119
## Error propagation
122120

123121
If the MCP tool raises, the exception propagates naturally and ARCP converts it
124-
to a `job.failed` event with an appropriate error code (spec
122+
to a `job.error` envelope with an appropriate error code (spec
125123
[§12](https://arcp.dev/spec/v1.1#section-12)). You can also catch MCP errors
126124
explicitly and re-raise as typed ARCP exceptions:
127125

128126
```python
129-
from mcp.exceptions import McpError
130-
from arcp.errors import AgentError
127+
from mcp.shared.exceptions import McpError
128+
from arcp import InvalidRequestError
131129

132130
try:
133-
result = await session.call_tool(tool_name, args)
131+
result = await session.call_tool(tool_name, arguments)
134132
except McpError as exc:
135-
raise AgentError(str(exc)) from exc
133+
raise InvalidRequestError(str(exc)) from exc
136134
```
137135

138136
## Related

0 commit comments

Comments
 (0)