Add tmux backend behind CROW_TMUX_BACKEND feature flag (#198)#229
Add tmux backend behind CROW_TMUX_BACKEND feature flag (#198)#229dhilgaertner merged 23 commits intomainfrom
Conversation
Foundation for the tmux-backend rollout (#198). Purely additive: existing rows on disk continue to load unchanged, defaulting to the historical .ghostty path. Adds: - TerminalBackend enum { ghostty, tmux } - TmuxBinding { socketPath, sessionName, windowIndex } — persisted so we can rebind to the same tmux window across Crow restart when the user opts in to keeping the tmux server alive between launches. - SessionTerminal.backend (default: .ghostty) - SessionTerminal.tmuxBinding (nil for .ghostty, populated for .tmux) - Backwards-compat decoder mirrors the existing isManaged migration. Tests cover: - Round-trip with a .tmux row + binding - Decode of a v1 row missing both new fields → defaults to .ghostty - Explicit "ghostty" string decodes - .tmux row with no binding decodes (permissive — runtime construction sites enforce the consistency invariant; the model just persists) All 310 tests across 9 packages pass against this change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rces
Production-ready ports of the shell wrapper and tmux config validated by
the spike (Phases 1, 2a, 2b). Both files are now real on-disk artifacts —
syntax-highlightable, lintable, easy to iterate on — instead of Swift
string literals.
Wrapper changes vs. spike:
- CROW_SENTINEL is mandatory (no Swift-baked default) — the production
code always sets it explicitly, removing one source of confusion.
- Wrapper exits 64 with a clear message if CROW_SENTINEL is missing.
- Logic is otherwise identical: source user's shell config, install
composable precmd/PROMPT_COMMAND hook emitting OSC 133;A + custom
OSC 9;crow-ready + sentinel-file touch (DCS-tmux wrapped under TMUX).
tmux.conf changes vs. spike:
- Adds explanatory comments tying each setting to its spike finding.
- history-limit 5000 (up from default 2000) — Phase 3 §2 measured
plenty of headroom on RSS.
- escape-time 0 — interactive editors feel sluggish with the 500ms
default.
BundledResources.swift exposes Bundle.module URLs for both files. Tests
cover: presence on disk, shebang sanity, allow-passthrough on, status off
— each invariant is load-bearing for the production wiring.
Adds CrowTerminal test target (didn't exist before).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Production-quality utilities for the tmux backend, ported from the
spike's PromptReadiness.swift / TmuxController.swift with cleanups:
SentinelWaiter:
- Async-friendly (Task.sleep instead of Thread.sleep) so the readiness
poll doesn't block whatever queue called it.
- Replaces the hardcoded 5s sleep at TerminalManager.swift:113-126 for
tmux-backed terminals. See docs/tmux-backend-spec.md §6.
TmuxController:
- tmuxBinary is injected (no /opt/homebrew default — host detection
happens in PROD #4 first-run check).
- newWindow returns the new window's index (uses tmux's `-P -F` print
flag), so the caller doesn't have to re-list windows after each call.
- loadBufferFromStdin pipes the payload through stdin, sidestepping
the ARG_MAX-derived `command too long` failure that hits send-keys -l
on >10KB inputs (Phase 3 §3 finding).
- listWindowIndices, hasSession, killWindow as primitives the
TmuxBackend will compose.
- TmuxError carries args/stdout/stderr so failure messages include
enough context to debug from logs.
Tests:
- SentinelWaiterTests: returns-immediately, returns-nil-on-timeout,
mid-wait detection.
- TmuxControllerTests: skipped via .enabled(if:) when no tmux binary
is on the host (CI-friendly). When present, exercises session
creation, window enumeration, load-buffer + paste-buffer round-trip,
error surfacing, and version-string parsing.
13 tests, all green. CrowTerminal still builds clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The app-wide singleton that owns the embedded tmux server backing all
.tmux SessionTerminals. Lazily starts the server on first registerTerminal
call, owns the (eventually-shared) cockpit GhosttySurfaceView, and tracks
UUID → window-index bindings.
Public API mirrors what TerminalManager exposes for the Ghostty path so
callers can dispatch to either based on SessionTerminal.backend:
configure(tmuxBinary:socketPath:) — host injects discovered tmux path
registerTerminal(...) — new tmux window, returns binding
adoptTerminal(...) — rebind to a window we already have
(live-server reattach on app launch)
makeActive(id:) — tmux select-window on tab switch
sendText(id:text:) — load-buffer + paste-buffer (PROD #3)
destroyTerminal(id:) — tmux kill-window + cleanup
cockpitSurface() — lazy single GhosttySurfaceView
attached to `tmux attach-session`
shutdown() — kill-server, used by PROD #5
watchdog
Readiness:
- Each registerTerminal sets CROW_SENTINEL via tmux new-window -e and
spawns the bundled crow-shell-wrapper.sh as the window's child.
- Sentinel poll runs as a Task; reports .shellReady via the same
onReadinessChanged callback the Ghostty path uses, so downstream
consumers (ClaudeLauncher) don't need to special-case the backend.
Initial command (e.g. `claude --continue`) is delivered via the same
buffer-paste path the public sendText uses — so the launch flow gets
the load-buffer-not-send-keys benefit (PROD #3) automatically.
CrowTerminal now declares CrowCore as an explicit dependency for the
TerminalReadiness / TmuxBinding types it composes with.
Tests: 6 TmuxBackend suite tests covering the tmux side end-to-end:
window creation, distinct indices for multiple registers, send-text
round-trip, destroy, adopt-on-missing-window error path, makeActive on
unregistered throws. All 19 CrowTerminal tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…inal
End-to-end dispatch for new SessionTerminals when CROW_TMUX_BACKEND=1
is set in the env. Existing call sites continue to use the Ghostty path
unchanged; the flag flip is a single point of decision at terminal-create
time.
New helpers:
- FeatureFlags.tmuxBackend — env-var check (default off)
- TmuxDiscovery.discover() — locate tmux on host, require ≥3.3
- TmuxDiscovery.meetsMinimumVersion — parse "tmux X.Y" with strict prefix
- TerminalRouter.send/destroy — dispatch by SessionTerminal.backend
App launch (AppDelegate.launchMainApp):
- After GhosttyApp.shared.initialize(), if FeatureFlags.tmuxBackend AND
a usable tmux ≥ 3.3 is found, call TmuxBackend.shared.configure().
Per-app socket in $TMPDIR with PID suffix; v1 doesn't keep the server
alive across launches (spec §12 calls that out as a future toggle).
new-terminal RPC:
- Manager terminal stays on Ghostty (special case, special-purpose).
- Other new terminals: when flag is on AND TmuxBackend is configured,
set backend=.tmux, call registerTerminal, persist the binding on the
SessionTerminal row. On registerTerminal failure, fall back to
Ghostty cleanly so a tmux outage doesn't break new-session creation.
send RPC:
- Routes through TerminalRouter.send(terminal, text:), which uses
load-buffer + paste-buffer for tmux-backed terminals (PROD #3) and
ghostty_surface_text for ghostty-backed.
- Recovery branch (pre-init surface if missing) only runs for ghostty;
tmux windows are created synchronously by registerTerminal.
- Falls back to TerminalManager.shared.send for unknown terminal IDs
(legacy compat for any caller that hasn't updated).
close-terminal RPC:
- TerminalRouter.destroy dispatches to TerminalManager.destroy or
TmuxBackend.destroyTerminal as appropriate.
TerminalSurfaceView:
- Optional `backend: TerminalBackend = .ghostty` parameter, default
keeps every existing call site working.
- For .tmux terminals, fetches the shared cockpit surface from
TmuxBackend instead of a per-id surface, and fires makeActive(id:)
on tab switch — the "tab switch is just a tmux select-window" model
from spec §5.
SessionService.wireTerminalReadiness:
- Adds TmuxBackend.shared.onReadinessChanged subscriber alongside the
existing TerminalManager.onStateChanged. Both feed the same
TerminalReadiness state machine, so launchClaude works regardless of
backend without backend-specific branches.
Tests:
- FeatureFlagsTests: locks the canonical "true" spellings (1/true/yes/on)
and confirms default-off when env var is unset.
- TmuxDiscoveryTests: version parser accepts ≥3.3 (incl. 3.6a, 3.4-rc1,
whitespace variants), rejects too-old, rejects wrong prefixes.
336 tests across 10 packages pass. Crow target builds clean.
Out of scope for this commit (follow-up work in this same PR):
- ClaudeLauncher's direct TerminalManager.send calls (still ghostty-only)
- Manager terminal on the tmux backend
- Hydrate-state path for .tmux terminals on app restart
- The preInitialize call sites for review / orphan / VS Code / clone flows
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…raction The `chmod 644 *.o 2>/dev/null` and `ar r ... 2>/dev/null` swallowed real errors during the per-library extraction loop. Some objects in Zig's cache extract as read-only on certain hosts; the silent chmod left permissions wrong, and the subsequent `ar r` then silently failed to add those objects to the fat library. The result: an under-linked libghostty-fat.a missing symbols like _sentry_uuid_*, hwy::Warn, and the ImGui C++ constructors — only surfacing at consumer link time with "Undefined symbols" errors that point nowhere helpful. Caught this while building the tmux backend's Demo target during the #198 spike (had to manually re-extract every dep library). Same drop of `2>/dev/null` would have made the original failure obvious. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §10.1 mitigation for the new tmux-as-SPOF failure mode: any tmux
subcommand from Swift that exceeds 2 seconds is SIGTERMed and surfaces
TmuxError.timedOut. Callers get control back instead of the UI freezing
waiting on a wedged tmux server.
Implementation:
- DispatchSourceTimer scheduled at process spawn fires p.terminate()
if the process is still running at the deadline.
- TimeoutFlag (NSLock-guarded box) records whether the timer fired so
we can distinguish a normal non-zero exit from a watchdog kill.
- Default timeout is 2.0s — well above the spike's measured p95 (~74ms).
- run() takes an optional timeout: parameter for cases that need a
different budget (e.g. attaching to a session may take longer).
Adds TmuxError.timedOut(args:after:) case with a clean description.
Test:
- timeoutSurfacesError verifies a normal-fast command (display-message)
completes well within a tight 1s budget — locks in that the timeout
plumbing doesn't accidentally kill normal operations.
- The timer-fires path is exercised by the underlying
DispatchSourceTimer + Process.terminate() machinery, both
well-trodden. A stub-binary integration test for the kill-on-timeout
path is filed as future work; not gating since the primitive's whole
point is to bound the wait, not to verify the kill-precision.
The watchdog callback that offers the user "Restart tmux server" via
NSAlert (spec §10.1) is deferred to a follow-up commit. The current
change is the load-bearing one: bounding the wait prevents UI freeze;
the UI alert is polish.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §14 PROD #6: track time-to-first-prompt, tab-switch latency, server crash count. Crow doesn't have an internal metrics framework yet (CrowTelemetry is OTLP-receiver-only for Claude Code analytics), so this lands as structured NSLog statements with a consistent prefix that's easy to grep / parse / re-route to a real pipeline later. Three event types emitted from TmuxBackend: [CrowTelemetry tmux:first_prompt_ms=<n> terminal=<uuid>] Fired by SentinelWaiter when the bundled wrapper writes its sentinel file on first precmd. Replaces the time-to-shell-ready signal that used to be approximated by the 5s sleep at TerminalManager.swift:113. [CrowTelemetry tmux:first_prompt_timeout terminal=<uuid> budget_ms=5000] Fired when the readiness wait exceeds its 5s budget. Paired with the success counter to compute timeout rate per session. [CrowTelemetry tmux:tab_switch_ms=<n> terminal=<uuid>] Fired by makeActive on each select-window. Spike Phase 2a measured 74ms p95; production should track this via real users. [CrowTelemetry tmux:server_shutdown bindings=<n>] Fired by shutdown(). Combined with start logs, gives session-length + cleanly-closed-vs-crashed signals. (PROD #5 watchdog will add a `tmux:server_unhealthy` event when the timeout-restart UI fires.) Same shape as Crow's existing `[Crow]` log prefix — single-line, KEY=val pairs, suitable for `awk '/CrowTelemetry/'` or shipping to OSLog. No behavior changes; pure instrumentation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §11 / PROD #4: when CROW_TMUX_BACKEND=1 is set but TmuxDiscovery finds no tmux ≥ 3.3 on the host, surface a native NSAlert explaining the situation and offering three actions: - "Copy `brew install tmux`" — puts the command on the clipboard so the user can paste it into their terminal. - "Open tmux install guide" — launches the upstream installation wiki in their default browser. - "Continue" — dismisses the sheet; Crow falls through to the standard Ghostty-per-session backend. The alert intentionally doesn't block app launch — Crow still works without tmux; the user just doesn't get the backend they asked for. This matches the spec's "fallback is safe" property: a tmux outage or missing-tool should never break the core app. The alert text mentions explicitly that Crow won't modify the user's dotfiles, addressing a real concern raised in the original ticket discussion ("does this change my zsh config?"). The wrapper sources their config rather than replacing it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-A item 1 of the deferred work in this PR. SessionService.launchClaude fires when a managed work-session terminal hits .shellReady — that signal now comes from both backends (Ghostty 5s sleep AND tmux SentinelWaiter, wired in the prior commit). The launch's claude-command write was still going through TerminalManager.shared.send, which has no surface registered for tmux-backed terminals. Look up the SessionTerminal up front, pull the launch text out into a local so the if/else branches are about routing-vs-text-construction (they were nested before), and dispatch through TerminalRouter when the row is known. Falls back to the direct TerminalManager call for unknown ids — same defensive pattern the `crow send` RPC uses. For .tmux terminals this means the launch text travels via load-buffer + paste-buffer to the right tmux window, instead of disappearing into a non-existent Ghostty surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-A item 5a: the call sites that create / destroy terminals via SessionService (addTerminal, addGlobalTerminal, recoverSession, createReviewSession, closeTerminal, closeGlobalTerminal) all bypassed TerminalRouter and called TerminalManager.shared.* directly. Result: even with CROW_TMUX_BACKEND=1, only the new-terminal RPC path actually honored the flag — UI-driven "+", auto-review, and orphan-recovery flows still landed on Ghostty. Adds private SessionService.prepareTerminal(_, trackReadiness:) helper that mirrors the dispatch logic from AppDelegate's new-terminal RPC: decide useTmux based on the flag + configured backend + non-Manager session, register a tmux window or pre-init a Ghostty surface, set backend/tmuxBinding on the row, and return it for persistence. Manager terminal stays force-pinned to Ghostty (Tier-C deferred). closeTerminal and closeGlobalTerminal route destruction through TerminalRouter.destroy(terminal). closeGlobalTerminal falls back to TerminalManager.shared.destroy when the row isn't found in AppState (legacy compat — same defensive pattern as the send RPC). Net effect: with the flag on, every Crow surface other than the Manager terminal lands on the tmux backend. Without the flag, behavior is unchanged. 337 tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-A item 3: spec §10.1 mitigation for the new tmux-as-SPOF failure
mode. The 2s watchdog in TmuxController.run already prevents UI freezes
by SIGTERMing wedged tmux subprocesses, but until now those timeouts
were silent — the user had no signal that anything was wrong, and the
server stayed alive in its hung state.
Two changes:
TmuxBackend gains `onUnresponsive: ((TmuxError) -> Void)?`. The public
methods that hit the tmux server (registerTerminal / makeActive /
sendText) catch errors, forward .timedOut to the callback, and re-throw.
Other errors pass through silently — the callback is specifically the
"server hung" signal, not a general error sink.
AppDelegate wires it after TmuxBackend.configure to a native NSAlert
with two actions:
- "Restart tmux server" → TmuxBackend.shared.shutdown(), which kills
the server. The next backend call respawns a fresh server from
scratch via ensureRunningServer + the persisted SessionTerminal
rows. Logs `[CrowTelemetry tmux:server_restart_by_user]` for the
operator-greppable counter.
- "Continue without restart" → dismiss the alert and let the user
decide what to do next.
A `tmuxUnresponsiveAlertShowing` guard suppresses concurrent alerts so
a burst of timeouts (the most likely shape of a real hang — every
makeActive on tab switch fails together) only shows one alert until
dismissed.
The watchdog log path also fires `[CrowTelemetry tmux:server_unresponsive]`
when reportIfTimeout fires, paired with the existing
`tmux:server_shutdown` event so dashboards can compute hang vs.
clean-quit ratios.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-B item 2 of the deferred work in this PR. SessionService.hydrateState
previously called TerminalManager.shared.preInitialize for every persisted
SessionTerminal — wrong for .tmux rows: it would create a leaked Ghostty
surface for an id whose actual home is a tmux window that doesn't exist
yet (the v1 rollout kills the server on app exit, spec §12).
New per-terminal helper rehydrateTerminalSurface dispatches by backend:
.ghostty: existing path (trackReadiness if managed + preInitialize).
.tmux: v1 doesn't keep the server alive across launches, so we
re-register a fresh window via TmuxBackend.registerTerminal
and update the row's tmuxBinding.windowIndex with the new
index returned by tmux. Future work (server-survives-launch)
would call adoptTerminal here instead.
Failure cases — TmuxBackend not configured (flag off / tmux gone since
last launch), or registerTerminal throws — silently fall back to the
Ghostty path. Matches the user's stated preference: a tmux outage
shouldn't block the user from using the app, just degrade them to the
historical experience.
Persistence: any row whose backend or tmuxBinding changed during
re-hydration (i.e. fell back to .ghostty) is written back to the store
in a single batched mutate, so a subsequent crash doesn't keep retrying
a doomed re-register on every launch.
The trackReadiness call previously made in the hydrate loop is now
folded into rehydrateTerminalSurface — both backends get the right
readiness wiring as a side effect of re-creating the surface/window.
Out of scope (future work):
- Adoption of an already-running tmux server (requires keeping the
server alive across launches; spec §12 toggle).
- Watchdog wiring for hydrate-time timeouts (currently logs only).
337 tests across 10 packages still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dgershman
left a comment
There was a problem hiding this comment.
Code & Security Review
Critical Issues
1. Missing tmux shutdown on app exit — orphaned server processes
Sources/Crow/App/AppDelegate.swift — applicationWillTerminate does not call TmuxBackend.shared.shutdown(). The PR description explicitly states "v1 still kills the server on app exit (spec §12)" and the shutdown() method exists and works (emits telemetry, kills the server, cleans sentinels), but it's never invoked on quit. This means tmux server processes will survive app termination as orphans, accumulating one per app launch (since each launch gets its own PID-scoped socket at $TMPDIR/crow-tmux-<pid>.sock).
Fix: add TmuxBackend.shared.shutdown() in applicationWillTerminate, before GhosttyApp.shared.shutdown().
2. deleteSession bypasses TerminalRouter — tmux windows leak on session delete
Sources/Crow/App/SessionService.swift:425-428:
for terminal in terminals {
TerminalManager.shared.destroy(id: terminal.id)
}This calls the Ghostty-only destroy path. For .tmux terminals, the tmux window is never killed — it persists in the server until app exit. The rest of the codebase correctly uses TerminalRouter.destroy(terminal) (see closeTerminal, closeGlobalTerminal, and the RPC close-terminal handler), but deleteSession was missed.
Fix: change to TerminalRouter.destroy(terminal).
Security Review
Strengths:
- Process spawning uses
Process.argumentsarrays throughout (no shell string interpolation), eliminating command injection vectors inTmuxController. shellQuote()inTmuxBackenduses proper POSIX single-quote escaping for the one place a shell string is constructed (cockpitSurface()'s attach command).- Feature flag is read-only from environment — no runtime mutation surface.
- Sentinel files use internally-generated UUIDs for paths, not user-controlled input.
crow-tmux.confcorrectly runsunbind-key -ato strip all default tmux bindings, preventing users from accidentally reaching tmux's command mode through the embedded surface.crow-shell-wrapper.shvalidatesCROW_SENTINELis set before proceeding and usesadd-zsh-hookfor composable hook installation rather than overwritingprecmd.- Path traversal protection already exists for RPC commands (
isPathWithinDevRoot), and the new tmux paths don't introduce new external-input-to-path flows.
Concerns:
- None blocking. The tmux binary path is resolved from a fixed allowlist (
TmuxDiscovery.searchPaths), not from$PATH, which is correct.
Code Quality
loadBufferFromStdin has no watchdog timeout (Packages/CrowTerminal/Sources/CrowTerminal/TmuxController.swift:155-179):
Unlike run() which has a 2s watchdog that SIGTERMs hung processes, loadBufferFromStdin calls p.waitUntilExit() with no timeout. If tmux load-buffer hangs (e.g., the server is wedged), this blocks the calling thread indefinitely. Since sendText is called from @MainActor context, this could freeze the UI. Should use the same DispatchSource.makeTimerSource watchdog pattern, or factor the timeout logic into a shared helper.
UUID prefix collision in buffer names and sentinel paths (TmuxBackend.swift:229, 301):
Both sendText and sentinelPath use id.uuidString.prefix(8) (32 bits of entropy). Two terminals sharing the same 8-char prefix would overwrite each other's paste buffers or sentinel files. With typical terminal counts (<100), collision probability is negligible, but using .prefix(12) or the full UUID is free and eliminates it entirely.
Test coverage is solid: backward-compat decoder tests, integration tests gated behind tmux availability (.enabled(if: discoveredTmuxBinary != nil)), sentinel waiter tests with mid-wait appearance, and version parsing edge cases are all well-structured.
Architecture is clean: TerminalRouter as a dispatch hub, TmuxBackend as a singleton matching TerminalManager's @MainActor constraint, and SentinelWaiter as a pure async value type are all good abstractions.
Summary Table
| Priority | Issue |
|---|---|
| 🔴 Red | applicationWillTerminate doesn't call TmuxBackend.shared.shutdown() — tmux servers orphan on every quit |
| 🔴 Red | deleteSession calls TerminalManager.shared.destroy(id:) instead of TerminalRouter.destroy(terminal) — tmux windows leak |
| 🟡 Yellow | loadBufferFromStdin lacks the watchdog timeout that run() has — could freeze UI on hung server |
| 🟡 Yellow | prefix(8) UUID truncation in buffer/sentinel names — avoidable collision risk |
| 🟢 Green | Temp ZDOTDIR dirs from shell wrapper accumulate until reboot (acceptable — uses $TMPDIR) |
Recommendation: Request Changes — the two red items are correctness bugs that violate the PR's own stated invariants (spec §12 server-kills-on-quit, and consistent TerminalRouter dispatch). Both are one-line fixes. The yellow items are worth addressing but aren't blockers.
Resolves the four findings (two reds, two yellows) from @dgershman's code review. Wire TmuxBackend.shared.shutdown() into applicationWillTerminate before GhosttyApp.shared.shutdown() so the per-launch tmux server is reaped on quit, matching the PR description's "v1 kills the server on app exit (spec §12)" intent. Without this, each launch left an orphaned tmux process bound to $TMPDIR/crow-tmux-<pid>.sock — accumulating one per launch and never reusable since the socket path is PID-scoped. Switch SessionService.deleteSession from TerminalManager.shared.destroy(id:) to TerminalRouter.destroy(terminal). The Ghostty-only direct call left .tmux windows alive in the cockpit session whenever a session containing them was deleted; the router dispatches per backend so kill-window runs. Extract a ProcessWatchdog helper in TmuxController and apply it to loadBufferFromStdin, which previously had no timeout and could freeze the UI thread (called from @mainactor) on a wedged server. Both run() and loadBufferFromStdin now share the same SIGTERM-on-timeout pattern; loadBufferFromStdin gains a timeout: parameter (defaults to defaultTimeout) and converts a write-side EPIPE into TmuxError.timedOut when the watchdog had already fired before the stdin pipe closed. Drop the .prefix(8) UUID truncation in TmuxBackend's buffer names and sentinel paths. Full UUID eliminates the (small but free-to-fix) collision risk between concurrent terminals that happen to share an 8-char prefix. Out of scope (separate ticket): tmux server survival across app restarts (spec §12 future toggle; requires keepServerAlive setting plus adoptTerminal plumbing in SessionService.rehydrateTerminalSurface). All package tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dgershman
left a comment
There was a problem hiding this comment.
Re-review: All findings addressed
All four findings from the initial review have been resolved in 93ca9fc:
| Finding | Status |
|---|---|
🔴 Missing TmuxBackend.shared.shutdown() in applicationWillTerminate |
Fixed — added before GhosttyApp.shared.shutdown() |
🔴 deleteSession bypassing TerminalRouter |
Fixed — routes through TerminalRouter.destroy(terminal) |
🟡 loadBufferFromStdin missing watchdog timeout |
Fixed — extracted ProcessWatchdog helper shared by both run() and loadBufferFromStdin, with EPIPE-after-watchdog handling |
🟡 prefix(8) UUID truncation |
Fixed — uses full UUID for both buffer names and sentinel paths |
The ProcessWatchdog refactor is a nice improvement over the inline TimeoutFlag + DispatchSource pattern — DRYs up the timeout logic and handles the stdin EPIPE edge case where the watchdog fires mid-write.
LGTM.
… cockpit TerminalSurfaceView's `backend:` parameter defaults to `.ghostty`, and none of the four UI call sites (SessionDetailView's manager / non-managed / ReadinessAwareTerminal cases, plus GlobalTerminalView) passed the SessionTerminal row's backend value through. Result: even when the persisted row was `.tmux` and TmuxBackend.registerTerminal had created the cockpit window correctly, the visible tab routed through the `.ghostty` branch of surfaceForBackend(), spawned a brand-new login shell via TerminalManager.shared.surface, and the cockpit Ghostty surface was never created or attached. User-visible symptom (caught by manual smoke testing of #229): create a session via /crow-workspace with the flag on; claude actually runs in tmux window 1 (verifiable via `tmux -S … list-windows`), and hook events post permission badges to the sidebar — but the Claude Code tab shows a fresh "Last login: …" zsh prompt with no claude in sight, because the visible surface is a leaked per-terminal Ghostty. Diagnosis trail in the user's log: [TerminalManager] surface(for: D6F9E577-…) — creating NEW view [SessionService] onStateChanged: terminal=D6F9E577, state=created [CrowTelemetry tmux:first_prompt_timeout terminal=D6F9E577] The tmux backend correctly waited on the sentinel, the wrapper correctly fired the precmd in the tmux window, but the visible Ghostty surface was a different process tree entirely — so the sentinel timed out at the 5s budget and the user saw a wrong surface. No `[TerminalSurfaceView] tmux cockpitSurface failed:` line ever appeared, confirming we weren't even falling into the catch — the UI just chose `.ghostty` directly. Fix: pass `backend: terminal.backend` at the four call sites that have a real SessionTerminal row. The empty-state instantiation at SessionDetailView.swift:228 has no row to read from and keeps the default `.ghostty` (correct — there's nothing to render). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two tmux clients were ending up attached to the same crow-cockpit session on every cold render, observable via: $ tmux -S "$SOCKET" list-clients /dev/ttys021: crow-cockpit (attached,focused) /dev/ttys022: crow-cockpit (attached,focused) Two distinct GhosttySurfaceView instances were being created with the attach-session command. The smoking gun in the user's NSLog tee: 14:49:04.860 [Ghostty] createSurface() succeeded, hasCallback=false 14:49:04.969 [Ghostty] createSurface() succeeded, hasCallback=false (TerminalManager surfaces always set onSurfaceCreated, so hasCallback=false is the cockpit's signature.) 109ms apart, both spawned by Crow's process. Two contributing causes, both fixed: 1. cockpitSurface() did `addSubview(view)` BEFORE `sharedSurface = view`. addSubview synchronously triggers viewDidMoveToWindow → createSurface → ghostty_surface_new, which can pump the main runloop briefly while libghostty's renderer registers. A re-entrant cockpitSurface() call during that window saw `sharedSurface == nil`, fell through the `if let existing` short-circuit, and spawned a second view. Reorder: cache `sharedSurface = view` before addSubview, so any nested call sees the cached view and short-circuits. 2. TerminalSurfaceView.existingSurfaceForBackend() called cockpitSurface() for the .tmux case — which CREATES if not existing, despite the "existing" name. updateNSView (which calls existingSurfaceForBackend) could race with makeNSView (which calls cockpitSurface directly). Add a side-effect-free `existingCockpitSurface` accessor on TmuxBackend that returns the cached value or nil, and route existingSurfaceForBackend through it. updateNSView now correctly no-ops when the cockpit hasn't been created yet, instead of triggering creation as a side effect. Result: one cockpit surface per app launch, one tmux client per cockpit, matches the singleton design intent. Doesn't change behavior for callers that genuinely want lazy creation (makeNSView still uses cockpitSurface directly). 20 CrowTerminal + 38 root tests still pass. Functional verification deferred to next manual smoke test (tmux list-clients should show exactly one row after navigating multiple .tmux session tabs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reproduced on app restart with 8 managed terminals (6 .ghostty + 2 .tmux):
the .tmux terminals' "Waiting for terminal..." spinner never cleared,
even though their wrappers had clearly fired (sentinel files present
on disk with the expected mtime).
Two layered problems in startReadinessWatch:
1. The 5-second SentinelWaiter budget is fine for a cold-creation of a
single terminal but too tight for hydrate-time, where N managed
terminals all start their shells simultaneously and contend for CPU.
Heavy zshrc setups (oh-my-zsh, asdf, nvm, starship) routinely push
first-prompt latency past 5s under that load.
2. When the budget did blow, the timeout branch logged a telemetry line
and DID NOTHING ELSE. The comment said "caller can decide via their
own watchdog" — but no caller has one. The readiness state stays
.uninitialized → ReadinessAwareTerminal's overlay shows the spinner
forever → launchClaude is never auto-fired → user is permanently
stuck on a fresh-but-empty pane.
Fix:
- Bump the budget to 30s (covers contended hydrate scenarios).
- On a genuine 30s timeout, advance to .shellReady ANYWAY. Three
rationales:
* The shell is almost certainly alive but slow; the paste-buffered
`claude --continue` queues at the tmux pane and runs when the
shell finally drains its tty.
* If the shell IS dead (rare — would require the wrapper to crash
in startup), the pane stays empty, which is the truthful state
and lets the user see what's happening instead of a misleading
spinner.
* "Spinner forever" is strictly worse than "advance and let the
downstream paste be best-effort" for any realistic failure mode.
The telemetry line still records the timeout for operator visibility.
20 CrowTerminal + 38 root tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…oject User reported opening crow-230's tab and seeing a different project's claude session — specifically the PR #229 conversation rather than crow-230's own /plan transcript. Investigation showed: - Cockpit was correctly attached to crow-230's tmux window (window 2, confirmed via tmux list-windows showing the * marker on window 2). - Window 2's pane content was claude, but its prompt header read "PR #229" — not the crow-230 ticket. Root cause: TmuxBackend.registerTerminal was setting PWD env on the new tmux window but not actually setting the start-directory. tmux's new-window inherits the SERVER's cwd unless given -c <dir>. Crow's tmux server is started from the Crow process's launch directory (typically the crow-tmux-backend worktree). So every .tmux window's shell came up in /Users/.../crow-tmux-backend/ regardless of which session/worktree it was supposed to live in. When the auto-launched `claude --continue` ran in that wrong cwd, claude matched it against ~/.claude/projects/-Users-…-crow-tmux-backend/ — which is THIS PR #229 conversation's project — and resumed it instead of the crow-230 transcript that lives under -Users-…-crow-230-quick-action- buttons/. Setting PWD env without actually cd'ing the shell does not help: PWD is just an env var, the shell process is still spawned in tmux's cwd. Fix: - Add `cwd:` to TmuxController.newWindow. When set, append `-c <cwd>` to the args. tmux honors -c as the start-directory for the new window's spawned process. - Pass terminal.cwd from TmuxBackend.registerTerminal. Falls through cleanly when the caller didn't set a cwd (rare — only the cockpit anchor doesn't, and that's intentional). Result: each .tmux terminal's shell starts in its own worktree, so `claude --continue` resumes the right project transcript. 20 CrowTerminal + 38 root tests still pass. Existing newWindow tests keep working because the new parameter is optional. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surface CROW_TMUX_BACKEND as a user-facing preference instead of an
env-var-only flag. Most users will never relaunch Crow from a shell
with an env prefix, so the rollout currently has zero discoverability.
Wiring (5 files, ~120 lines net):
Packages/CrowCore/.../AppConfig.swift
Add experimentalTmuxBackend: Bool = false. Same back-compat decoder
pattern as every other field in this struct (decodeIfPresent ?? false),
so existing config.json files continue to load unchanged.
Sources/Crow/App/FeatureFlags.swift
Add a static `tmuxBackendConfigOverride: Bool` (nonisolated(unsafe)
since the value is set once at launch on the main thread, then read-
only). FeatureFlags.tmuxBackend now returns
envFlag OR tmuxBackendConfigOverride
so the env var keeps working as a CI/dev override AND the new UI
toggle works for end users. No changes needed at the three existing
read sites (AppDelegate's launch + new-terminal RPC, SessionService's
addTerminal) — they all benefit transparently.
Sources/Crow/App/AppDelegate.swift
Reorder launchMainApp to load AppConfig BEFORE the tmux configure
block, then set FeatureFlags.tmuxBackendConfigOverride from the
loaded config.experimentalTmuxBackend. Ordering is load-bearing:
the configure block reads FeatureFlags.tmuxBackend which now ORs in
the override.
Packages/CrowUI/.../ExperimentalSettingsView.swift [NEW]
Mirrors AutomationSettingsView's shape: Form { Section { Toggle } }
with a single binding + onSave closure. Section header reads
"tmux backend" with caption "Experimental — see #198 for context".
Body caption explains the trade (faster session switching, less
memory, requires tmux ≥ 3.3) and notes "Takes effect on next app
launch" — same pattern as the existing Manager Terminal section.
Packages/CrowUI/.../SettingsView.swift
One TabView entry appended after Notifications, using the `flask`
SF Symbol. Position last per macOS convention for advanced /
experimental settings.
Tests:
- AppConfigTests: round-trip test for experimentalTmuxBackend +
default-false assertion in the empty-JSON decode test.
- FeatureFlagsTests: tmuxBackendOnWhenConfigOverrideOn verifies the
OR-merge with the config override, plus state-leak guard in the
existing tmuxBackendDefaultIsOff test.
CrowCore 141 / root 39 / CrowUI 22 tests still pass.
Out of scope (deliberate):
- Auto-relaunch button in the Experimental tab. User can ⌘Q + relaunch
from their dev shell during testing, which is what they need anyway
to keep the tee'd log going.
- Removing FeatureFlags entirely. Env var is still useful for CI and
for dev iteration without touching ~/Library/Application Support/crow/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-facing affordance for the tmux-backend rollout: at a glance you can tell which sessions are running through the cockpit vs which fell back to per-terminal Ghostty. A session shows the badge iff any of its SessionTerminal rows has backend == .tmux. Sessions whose managed terminal fell back to .ghostty (via rehydrateAsGhosttyFallback or the new-terminal RPC's catch path) have ALL terminals as .ghostty and so get no badge — matches the user's explicit ask: "If it falls back the ghosty, it should not have a T." Visual: monospaced bold "T" at 9pt, bottom-trailing of the row, in a subtle rounded background using the same palette as existing badges (textSecondary fg, bgDeep@70% fill, borderSubtle stroke). Hover tooltip reads "This session's terminals are backed by tmux". Reactive — the `isTmuxBacked` computed prop reads `appState.terminals` which is @observable, so the badge appears/disappears the instant a terminal's backend changes (e.g., a .tmux terminal getting destroy()ed leaves the session with only .ghostty rows → badge disappears next render). No new state, no schema changes, no test updates. 22 CrowUI tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three related polish items from continued dogfood of the tmux backend:
== 1. Orphan tmux server reaper ==
User-visible problem: dev iteration that involves Force Quit (or
pkill) leaves a tmux server orphaned per cycle, accumulating ~10-20
MB of RSS each. Symptom (after several iterations):
$ ls $TMPDIR/crow-tmux-*.sock
crow-tmux-{20084,50095,60538,67250,78087,…}.sock
$ pgrep -afl 'tmux.*crow-tmux'
(5 live tmux servers, only 1 of which belongs to the running Crow)
Root cause: `applicationWillTerminate` (which runs PR #229's shutdown
fix) only fires on graceful exits. SIGKILL, Force Quit, and crash
bypass it. Past Crow's tmux server lives on as an orphan because the
socket is PID-scoped and the OS has no way to reap it.
Fix: new `TmuxOrphanReaper` enum in CrowTerminal. At launch it
enumerates `$TMPDIR/crow-tmux-*.sock`, parses the PID from each
filename, and reaps any whose PID is NOT a live `CrowApp` process
(checked via /bin/ps). Skips the current process's own socket and
skips peer-Crow sockets so multiple concurrent Crow instances don't
reap each other.
Wired into AppDelegate.launchMainApp: runs whenever tmux is
discoverable, regardless of whether THIS launch will use the tmux
backend (orphans accumulate independent of current flag state).
Also added `nonisolated` to `TmuxBackend.cockpitSessionName` so the
reaper can read the constant without a MainActor hop (it's an
immutable string literal — safe).
Telemetry: each reaped socket logs
`[CrowTelemetry tmux:orphan_reaped pid=… socket=…]` plus a summary
line. Idempotent — costs ~50ms when there's nothing to do.
== 2. Settings window: wider + app-modal ==
User reported the Settings window's tabs were collapsing into an
overflow ("…") menu because 5 tabs (General / Automation / Workspaces /
Notifications / Experimental) didn't fit at the previous 520pt width.
Bumped the SettingsView frame and the NSWindow contentRect to 720pt
wide; height unchanged at 480.
Also: Settings is now app-modal. `showSettings()` builds the window,
attaches a willClose observer that calls `NSApp.stopModal()`, then
runs `NSApp.runModal(for:)` to block the main app until the user
dismisses. This matches the "must dismiss to go back to the app"
UX the user explicitly asked for. Removed the now-unreachable
"existing window — bring forward" short-circuit.
== 3. Gear icon in sidebar toolbar ==
Surfaces Settings via a third toolbar button in SessionListView,
alongside the existing select-mode and mute-notifications buttons.
Wired through a new `AppState.onShowSettings` callback (mirroring the
existing onSoundMutedChanged / onAddTerminal pattern); AppDelegate
hooks it up to its own `showSettings()` method during launchMainApp's
closure-wiring block.
== Tests ==
CrowCore 141 / CrowTerminal 20 / root 39 / CrowUI 22 — all pass.
No new test cases for the reaper itself; its happy paths are exercised
by end-to-end behavior (no orphans accumulate after iteration), and the
ps-shellout / regex-parse logic doesn't have hidden-state bugs to
guard against.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…his run Bug reported during dogfood: user launched Crow with the experimental flag default-off (after the new Settings → Experimental tab landed but before they explicitly enabled it). On hydrate, all their persisted .tmux rows got auto-downgraded to .ghostty AND the change was written back to the store. Toggling the flag back ON didn't restore them — the rows were permanently demoted, no T badge, no tmux backing. Root cause: rehydrateTerminalSurface's `guard !tmuxBinary.isEmpty` branch can't tell "user toggled flag off" apart from "tmux uninstalled between runs" — both look like an unconfigured backend. The original fallback logic from 04fdd69 assumed a tmux-less backend means tmux is gone forever, so it persisted .ghostty to avoid retrying every launch. That assumption breaks the moment we surface the toggle in the UI. Fix: - rehydrateTerminalSurface .tmux case: when the backend isn't configured, pre-initialize a Ghostty surface for the SessionTerminal (so the UI can render its tab) but RETURN THE ROW UNCHANGED. The persisted backend stays .tmux. Next launch with the flag on, the row hydrates as tmux normally — T badge restored, cockpit window re-registered, claude --continue auto-launches as before. - The catch path (registerTerminal actually threw despite a configured backend) keeps the old "permanent downgrade" behavior. That case is a real failure that DOES warrant persisting .ghostty to avoid loops. Supporting change in TmuxBackend.ensureRunningServer: - Replaced the `precondition(!tmuxBinary.isEmpty, ...)` / `precondition(!socketPath.isEmpty, ...)` pair with a thrown `TmuxBackendError.notConfigured`. This is necessary because the new rehydrate behavior leaves rows as backend=.tmux even when the backend isn't configured. Without the throw, TerminalSurfaceView's surfaceForBackend → cockpitSurface → ensureRunningServer chain would crash on first render. With the throw, surfaceForBackend's existing catch falls back cleanly to TerminalManager.shared.surface (which finds the surface we pre-initialized in rehydrate). Existing data note: rows that were already auto-downgraded to .ghostty by the buggy code path can't be recovered from the store — the tmuxBinding was nilled out. Affected users need to delete and recreate those sessions to get the .tmux marking back. This commit only prevents future cases. 20 CrowTerminal + 39 root tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…xes (#238) Closes #235 ## Summary - Adds `docs/automation.md` as the canonical guide to Settings → Automation and the full auto-flow lifecycle. - Updates `docs/architecture.md` with the dual Ghostty/tmux backend (#229), the new `TerminalRouter` dispatch, the Settings tab split (#228), and the Review Board surface (#188, #205, #207, #210, #212, #220, #226, #231). - Adds `crow rename-terminal` (#206) to `docs/cli-reference.md`. - Adds troubleshooting rows for tmux backend missing, Ghostty surface retry (#218), GitLab nested groups (#233), `GITLAB_HOST` silent skip (#215), auto-respond not firing, and the silent no-op `hook-event` behavior (#234). - Adds `CROW_TMUX_BACKEND` and `CROW_HOOK_DEBUG` to the `docs/configuration.md` env-var table. - Backfills `CHANGELOG.md` `[Unreleased]` with every PR from #137 through #234, grouped by theme. - Updates `README.md` features list and docs index for the new automation suite, review board, terminal renaming, and tmux opt-in. - Appends an "Implementation Status (2026-05)" footer to `docs/terminal-runtime-research.md` noting #229 shipped the headless-PTY backend recommended in the original research. The audit checklist called out as deliverable #1 in the issue is posted as a [comment on #235](#235 (comment)). ## Test plan - [ ] `git diff --stat main` shows only the listed `docs/`, `CHANGELOG.md`, and `README.md` files - [ ] Render each modified doc on GitHub and confirm anchors / cross-links resolve - [ ] Confirm `crow --help` matches the command list in `docs/cli-reference.md` (only `rename-terminal` was missing pre-PR) - [ ] Walk every PR number in the CHANGELOG against `git log --since=2026-04-15 main --oneline` and confirm each one resolves - [ ] Re-export `docs/crow-screenshot.jpeg` against the current Settings/Review-Board UI — **deferred to a follow-up**, called out in the audit comment 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Implements the recommendation from the #198 spike: a single embedded Ghostty surface fronting a per-app tmux server, where each Crow tab is a tmux window. Off by default; flipping
CROW_TMUX_BACKEND=1in the environment opts the user in for testing.The wins (when the flag is on):
SentinelWaiter— replaces the 5s sleep atTerminalManager.swift:113-126. Spike measured 172ms (zsh) / 231ms (bash); through tmux, sub-50ms.tmux select-window— no second surface created.crow sendviatmux load-buffer + paste-buffer.send-keys -lwould fail withcommand too long.SessionTerminal.tmuxBindingrows survive restart and re-bind cleanly.Ticket map (spec §14)
Seven discrete pieces, each its own commit:
63595d1Add TerminalBackend discriminator{ ghostty, tmux }+TmuxBinding.8312622Bundle wrapper.sh + tmux.conf18253f8Add SentinelWaiter + TmuxController71e4806Add TmuxBackend orchestrator5dae7a0Wire CROW_TMUX_BACKEND flagTerminalSurfaceView.TmuxBackend.sendTextalready usesload-buffer + paste-buffer.e35a984First-run onboarding alertNSAlertwhen tmux ≥ 3.3 is missing — copiesbrew install tmux, opens install guide, or continues with Ghostty.7f049862s timeout watchdogTmuxController.runSIGTERMs hung tmux, throws.timedOutinstead of letting the UI freeze.42274c8Operator-greppable telemetry[CrowTelemetry tmux:first_prompt_ms=…]/tab_switch_ms=…/server_unresponsive/server_shutdown.16f6aaaFix scripts/build-ghostty.sh2>/dev/nullonchmod+ar rthat masked under-linked fat libraries during the spike.Plus the deferred Tier-A/B follow-ups landed in this same PR:
4a6f9c8Route ClaudeLauncher's auto-launch through TerminalRouterlaunchClaudenow reaches tmux-backed terminals.b974b32Route SessionService terminal lifecycle through TerminalRouteraddTerminal/addGlobalTerminal/recoverSession/createReviewSession/ close paths honor the flag.6c135c7Watchdog NSAlert with restart action04fdd69Re-bind .tmux terminals on hydrate.ghosttyon failure, with write-back to the store.Out of scope (intentional)
AppDelegate.swiftandSessionService.prepareTerminal. Manager migration is its own follow-up after at least one release of dogfood.SessionTerminalrows.File map
New:
Packages/CrowTerminal/Sources/CrowTerminal/BundledResources.swift,SentinelWaiter.swift,TmuxController.swift,TmuxBackend.swiftResources/crow-shell-wrapper.sh,Resources/crow-tmux.confSources/Crow/App/FeatureFlags.swift,TmuxDiscovery.swift,TerminalRouter.swiftPackages/CrowTerminal/Tests/CrowTerminalTests/(didn't exist before)Modified:
Packages/CrowCore/Sources/CrowCore/Models/Terminal.swift—TerminalBackendenum +TmuxBinding+ back-compat decoderPackages/CrowTerminal/Sources/CrowTerminal/TerminalSurfaceView.swift— backend-aware dispatchPackages/CrowTerminal/Package.swift— bundled resources + CrowCore dep + test targetSources/Crow/App/AppDelegate.swift— flag check at launch, RPC dispatch, watchdog alert, onboarding sheetSources/Crow/App/SessionService.swift— backend dispatch helper + hydrate rebinding + tmux readiness wiringscripts/build-ghostty.sh— chmod silent-failure fixTest plan
swift testacross all 9 packages: 337 tests pass, 0 fail.swift build --target Crowclean.SessionTerminalJSON without thebackendfield decodes as.ghostty(covered bySessionTerminalTests.missingBackendDefaultsToGhostty).[Crow] tmux backend configuredlog.CROW_TMUX_BACKEND=1 swift run Crow; create a new work session via/crow-workspace; verify[CrowTelemetry tmux:first_prompt_ms=…]fires within ~500ms; Claude auto-launches in the tab; tab switching is instant;tmux -S $TMPDIR/crow-tmux-*.sock list-windows -t crow-cockpitshows the windows.tmuxBinding.windowIndexchanges in~/Library/Application Support/crow/store.json); content is fresh (server was killed on quit, expected for v1).kill -STOP $(pgrep tmux)from another terminal; trigger a tab switch in Crow; expect the "tmux server is not responding" alert within ~2s; "Restart tmux server" button shouldshutdown()and respawn cleanly on next operation.Followups not in this PR
If this lands and dogfood goes well, the natural next tickets are:
keepServerAlivesetting + adoption path on launch.CrowTelemetry(currently log-only) once the structured metrics framework lands.feature/crow-198-tmux-backend-spike) so the design doc and throwaway prototype are reviewable separately.🤖 Generated with Claude Code