tunnel: tunnel-history MCP tool with per-client event ring #57
Open
joelgwebber wants to merge 2 commits into main
Adds a per-TunnelClient lifecycle event ring (capacity 64) and a new tunnel-history MCP tool that surfaces it. Captures connect-start, ws-open, hello-sent, ready, ping-sent, ws-close, ws-error, stale-fired, reconnect-scheduled, unexpected-response, handshake-error, transport-error, and need-live-tunnel, each with a wall-clock timestamp and small structured detail.

Also keeps the last 4 dead-tunnel histories in main.ts so callers can still inspect a tunnel after need_live_tunnel fires and the live entry is removed from the map (the most useful case to debug).

Motivation: when tunnels die during a Remy run, neither the agent driving the browser nor the human watching has any way to inspect what happened, because stderr from the npx subprocess goes to the parent agent, not to the MCP tool surface. tunnel-history fills that gap. The presence/absence of ping-sent events between connect and stale-fired is a particularly high-signal probe for "is the keepalive timer firing?" vs. "is the inbound side silently half-closed?".

Wires opts.onPingSent through TransportOptions → YamuxSession.#sendPing so the ping event is recorded only on successful enqueue (absence then reliably means "timer fired but send threw" rather than "timer didn't fire").

Bumps to 0.1.15.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
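For readers unfamiliar with the pattern, the bounded event ring and the "last 4 dead tunnels" retention described in this commit could look roughly like the sketch below. This is not the PR's code: `RingEvent`, `EventRing`, `retainDead`, and all field names are hypothetical and chosen only for illustration; only the capacities (64 and 4) and the event names come from the commit message.

```typescript
// Hypothetical shapes; names are illustrative, not taken from the PR.
type RingEvent = {
  at: number; // wall-clock timestamp, ms since epoch
  kind: string; // e.g. "connect-start", "ping-sent", "stale-fired"
  detail?: Record<string, unknown>; // optional small structured detail
};

class EventRing {
  private events: RingEvent[] = [];

  constructor(readonly capacity = 64) {}

  record(kind: string, detail?: Record<string, unknown>, at = Date.now()): void {
    const ev: RingEvent = { at, kind };
    if (detail !== undefined) ev.detail = detail;
    this.events.push(ev);
    // Once the ring is full, drop the oldest entry.
    if (this.events.length > this.capacity) this.events.shift();
  }

  // Oldest-first copy, so callers cannot mutate the ring itself.
  snapshot(): RingEvent[] {
    return this.events.slice();
  }
}

// Append one dead tunnel's history and keep only the most recent `max`.
function retainDead(dead: RingEvent[][], history: RingEvent[], max = 4): RingEvent[][] {
  return dead.concat([history]).slice(-max);
}
```

An array with `shift()` is enough at this size; a true circular buffer with a head index would only matter for much larger capacities.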
A small Node script that mints a relayUrl via the live-tunnel MCP tool, then drives a TunnelClient directly (no MCP server wrapper, no Remy). Periodic snapshots dump the full event ring so the operator can watch the WS lifecycle play out in one terminal. Use this to reproduce intermittent disconnect / reconnect failures against staging without rebuilding remy-agent or going through the MCP stdio child process.

Skips the MCP initialize handshake intentionally: initialize creates server-side session state that's only valid on the pod that handled it, and the affinity router doesn't bind subsequent requests to that pod. A stateless one-shot tools/call is enough for live-tunnel and avoids that whole class of routing race.

Default --mcp-url is staging. The API-key env var is auto-picked based on the URL (SUBTEXT_STAGING_API_KEY for *.staging.fullstory.com, etc.); override with --api-key-env if needed.

Usage:

    npm run build
    node scripts/probe.mjs                                # 30-min idle test
    node scripts/probe.mjs --ping-ms 60000                # exercise stale timer
    node scripts/probe.mjs --allow http://localhost:3000  # different allowlist
    node scripts/probe.mjs --mcp-url https://api.onfire.fyi/mcp/subtext

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
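As an illustration of the "stateless one-shot tools/call" mentioned above, the request can be sketched as a single JSON-RPC 2.0 POST with no preceding initialize. The sketch assumes the server speaks MCP over HTTP in the usual way; `buildToolsCall` and `buildPost` are hypothetical helper names, the tool arguments are left to the caller because the commit message does not list them, and the auth header is passed in rather than guessed.

```typescript
// JSON-RPC 2.0 envelope for an MCP tools/call request.
type ToolsCallRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
};

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolsCall(
  name: string,
  args: Record<string, unknown>,
  id = 1,
): ToolsCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: request options for one POST, with no session state.
// `authHeaders` is whatever the deployment expects (not specified in the PR).
function buildPost(req: ToolsCallRequest, authHeaders: Record<string, string>) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // MCP's HTTP transport asks clients to accept both response forms.
      Accept: "application/json, text/event-stream",
      ...authHeaders,
    },
    body: JSON.stringify(req),
  };
}
```

The returned object has the shape `fetch(mcpUrl, init)` expects, so a probe script could pass it straight through and read the relayUrl from the tool result.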
Summary
Adds a per-`TunnelClient` lifecycle event ring (capacity 64) and a new `tunnel-history` MCP tool that surfaces it. Captures the full WS lifecycle:

- `connect-start`, `disconnect-requested`, `reconnect-scheduled`
- `ws-open`, `hello-sent`, `ws-close`, `ws-error`, `unexpected-response`
- `ready`, `handshake-error`
- `ping-sent`, `stale-fired`
- `transport-error`, `need-live-tunnel`

Each event has a wall-clock timestamp (ms since epoch, which aligns with lidar logs) and an optional small structured `detail`.

Also keeps the last 4 dead-tunnel histories in `main.ts` so callers can still inspect a tunnel after `need_live_tunnel` fires and the live `clients` entry has been removed (the most common case worth debugging).

Motivation

When tunnels die during an agent-driven session (Remy especially), neither the agent driving the browser nor the human watching has any way to inspect what happened: stderr from the `npx` subprocess goes to the parent agent process, not to the MCP tool surface. `tunnel-history` closes that gap.

The presence/absence of `ping-sent` events during a long quiet window is a particularly high-signal probe:

- No `ping-sent` between connect and `stale-fired` → keepalive timer wasn't firing → likely event-loop starvation in the npx child (e.g. stdio backpressure when the parent isn't draining).
- `ping-sent` × N then `stale-fired` → pings going out but no inbound → silent half-close upstream (linkerd, intermediate LB).
- `ws-close` arrives before `stale-fired` → server tore the WS down (look at lidar logs for matching `tunnel: %s disconnected`).

Wiring details

`opts.onPingSent` threads through `TransportOptions` → `YamuxSession.#sendPing`. The hook fires only on successful `ws.send()` enqueue, so the absence of a `ping-sent` event reliably means "the timer fired but the send threw" rather than "the timer didn't fire", preserving the diagnostic signal.

The `tunnel-history` tool description tells callers how to read the ring as a chronological story so agents can self-diagnose. Example call shapes:

Test plan

history ring records connect, ws-open, hello-sent, ready, ws-close) covers basic ordering and detail content.

Bumps to 0.1.15. Companion to #54 (Fix A.client); together these address SUBTEXT-338 and give us the diagnostic surface for the next class of tunnel issues.
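To make the three `ping-sent` patterns from the Motivation section concrete, here is a minimal sketch of how a caller might classify a history returned by `tunnel-history`. It is an illustration only: `HistoryEvent`, `Diagnosis`, and `diagnose` are hypothetical names, the event shape is assumed, and the labels simply mirror the bullets rather than anything the tool itself returns.

```typescript
// Assumed event shape; only `kind` values come from the PR description.
type HistoryEvent = { at: number; kind: string };

type Diagnosis =
  | "timer-not-firing" // no ping-sent before stale-fired
  | "silent-half-close" // pings went out, nothing came back
  | "server-closed" // ws-close arrived before any stale-fired
  | "inconclusive";

function diagnose(events: HistoryEvent[]): Diagnosis {
  // Only consider the most recent connection attempt.
  let start = 0;
  events.forEach((e, i) => {
    if (e.kind === "connect-start") start = i;
  });
  const kinds = events.slice(start).map((e) => e.kind);

  const stale = kinds.indexOf("stale-fired");
  const close = kinds.indexOf("ws-close");

  // A ws-close with no earlier stale-fired is read as a server-side teardown.
  // (A client-requested disconnect would also look like this; not handled here.)
  if (close !== -1 && (stale === -1 || close < stale)) return "server-closed";
  if (stale === -1) return "inconclusive";

  const pings = kinds.slice(0, stale).filter((k) => k === "ping-sent").length;
  return pings === 0 ? "timer-not-firing" : "silent-half-close";
}
```

For example, a history of `connect-start`, `ready`, `ping-sent`, `ping-sent`, `stale-fired` classifies as `"silent-half-close"`, which points at the upstream path rather than the npx child.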
🤖 Generated with Claude Code