Skip to content

Add tmux backend behind CROW_TMUX_BACKEND feature flag (#198)#229

Merged
dhilgaertner merged 23 commits intomainfrom
feature/crow-tmux-backend
May 1, 2026
Merged

Add tmux backend behind CROW_TMUX_BACKEND feature flag (#198)#229
dhilgaertner merged 23 commits intomainfrom
feature/crow-tmux-backend

Conversation

@dhilgaertner
Copy link
Copy Markdown
Contributor

Summary

Implements the recommendation from the #198 spike: a single embedded Ghostty surface fronting a per-app tmux server, where each Crow tab is a tmux window. Off by default; flipping CROW_TMUX_BACKEND=1 in the environment opts the user in for testing.

The wins (when the flag is on):

  • Reliable shell-readiness via a sentinel-file polled by SentinelWaiter — replaces the 5s sleep at TerminalManager.swift:113-126. Spike measured 172ms (zsh) / 231ms (bash); through tmux, sub-50ms.
  • Single libghostty surface for the whole app — N sessions = 1 Metal context, not N.
  • Switching tabs is just tmux select-window — no second surface created.
  • Bounded 50KB+ input routing for crow send via tmux load-buffer + paste-buffer. send-keys -l would fail with command too long.
  • Free session persistence design in place — v1 still kills the server on app exit (spec §12), but SessionTerminal.tmuxBinding rows survive restart and re-bind cleanly.

Ticket map (spec §14)

Seven discrete pieces, each its own commit:

# Commit Result
1 63595d1 Add TerminalBackend discriminator Backwards-compat persisted-row migration to { ghostty, tmux } + TmuxBinding.
2 8312622 Bundle wrapper.sh + tmux.conf Real on-disk artifacts, syntax-highlightable.
18253f8 Add SentinelWaiter + TmuxController Async-friendly readiness; ARG_MAX-safe paste.
71e4806 Add TmuxBackend orchestrator Owns server + cockpit surface + UUID→window map.
5dae7a0 Wire CROW_TMUX_BACKEND flag App-launch detect, RPC dispatch, surface dispatch in TerminalSurfaceView.
3 (folded into #2) TmuxBackend.sendText already uses load-buffer + paste-buffer.
4 e35a984 First-run onboarding alert Native NSAlert when tmux ≥ 3.3 is missing — copies brew install tmux, opens install guide, or continues with Ghostty.
5 7f04986 2s timeout watchdog TmuxController.run SIGTERMs hung tmux, throws .timedOut instead of letting the UI freeze.
6 42274c8 Operator-greppable telemetry [CrowTelemetry tmux:first_prompt_ms=…] / tab_switch_ms=… / server_unresponsive / server_shutdown.
7 16f6aaa Fix scripts/build-ghostty.sh Drops the 2>/dev/null on chmod + ar r that masked under-linked fat libraries during the spike.

Plus the deferred Tier-A/B follow-ups landed in this same PR:

Item Commit Result
A.1 4a6f9c8 Route ClaudeLauncher's auto-launch through TerminalRouter launchClaude now reaches tmux-backed terminals.
A.5a b974b32 Route SessionService terminal lifecycle through TerminalRouter addTerminal / addGlobalTerminal / recoverSession / createReviewSession / close paths honor the flag.
A.3 6c135c7 Watchdog NSAlert with restart action Surfaces hangs to the user; debounces concurrent re-fires.
B.2 04fdd69 Re-bind .tmux terminals on hydrate App restart re-creates tmux windows from persisted rows; silent fallback to .ghostty on failure, with write-back to the store.

Out of scope (intentional)

  • Manager terminal stays on Ghostty. Tier C — too risky to migrate concurrently with a feature-flag rollout. The carve-out is in AppDelegate.swift and SessionService.prepareTerminal. Manager migration is its own follow-up after at least one release of dogfood.
  • Server-survives-app-quit. Spec §12 calls this out as a future toggle; v1 kills the server on quit, sessions are recreated from SessionTerminal rows.
  • Adoption of an already-running tmux server on launch. Tied to the above; we re-register fresh windows for now.

File map

New:

  • Packages/CrowTerminal/Sources/CrowTerminal/
    • BundledResources.swift, SentinelWaiter.swift, TmuxController.swift, TmuxBackend.swift
    • Resources/crow-shell-wrapper.sh, Resources/crow-tmux.conf
  • Sources/Crow/App/
    • FeatureFlags.swift, TmuxDiscovery.swift, TerminalRouter.swift
  • Tests covering the above + Packages/CrowTerminal/Tests/CrowTerminalTests/ (didn't exist before)

Modified:

  • Packages/CrowCore/Sources/CrowCore/Models/Terminal.swiftTerminalBackend enum + TmuxBinding + back-compat decoder
  • Packages/CrowTerminal/Sources/CrowTerminal/TerminalSurfaceView.swift — backend-aware dispatch
  • Packages/CrowTerminal/Package.swift — bundled resources + CrowCore dep + test target
  • Sources/Crow/App/AppDelegate.swift — flag check at launch, RPC dispatch, watchdog alert, onboarding sheet
  • Sources/Crow/App/SessionService.swift — backend dispatch helper + hydrate rebinding + tmux readiness wiring
  • scripts/build-ghostty.sh — chmod silent-failure fix

Test plan

  • swift test across all 9 packages: 337 tests pass, 0 fail.
  • swift build --target Crow clean.
  • Backwards-compat: persisted SessionTerminal JSON without the backend field decodes as .ghostty (covered by SessionTerminalTests.missingBackendDefaultsToGhostty).
  • Manual smoke (flag off): launch Crow without the env var; create / close / switch tabs; verify behavior matches main. No [Crow] tmux backend configured log.
  • Manual smoke (flag on, tmux installed): CROW_TMUX_BACKEND=1 swift run Crow; create a new work session via /crow-workspace; verify [CrowTelemetry tmux:first_prompt_ms=…] fires within ~500ms; Claude auto-launches in the tab; tab switching is instant; tmux -S $TMPDIR/crow-tmux-*.sock list-windows -t crow-cockpit shows the windows.
  • Manual smoke (flag on, tmux missing): temporarily move tmux out of PATH, launch with flag on; expect onboarding alert with three buttons; "Continue" should fall back to Ghostty cleanly.
  • Manual smoke (app restart with .tmux rows): with flag on, create a session; quit Crow; relaunch with flag still on; the same session's terminal re-binds to a new tmux window (the tmuxBinding.windowIndex changes in ~/Library/Application Support/crow/store.json); content is fresh (server was killed on quit, expected for v1).
  • Manual smoke (watchdog): with flag on, force a hang via kill -STOP $(pgrep tmux) from another terminal; trigger a tab switch in Crow; expect the "tmux server is not responding" alert within ~2s; "Restart tmux server" button should shutdown() and respawn cleanly on next operation.

Followups not in this PR

If this lands and dogfood goes well, the natural next tickets are:

  1. Migrate the Manager terminal to the tmux backend.
  2. keepServerAlive setting + adoption path on launch.
  3. Wire telemetry through CrowTelemetry (currently log-only) once the structured metrics framework lands.
  4. Open a PR for the spike branch itself (feature/crow-198-tmux-backend-spike) so the design doc and throwaway prototype are reviewable separately.

🤖 Generated with Claude Code

Dustin Hilgaertner and others added 13 commits April 28, 2026 23:12
Foundation for the tmux-backend rollout (#198). Purely additive:
existing rows on disk continue to load unchanged, defaulting to the
historical .ghostty path.

Adds:
  - TerminalBackend enum { ghostty, tmux }
  - TmuxBinding { socketPath, sessionName, windowIndex } — persisted so
    we can rebind to the same tmux window across Crow restart when
    the user opts in to keeping the tmux server alive between launches.
  - SessionTerminal.backend (default: .ghostty)
  - SessionTerminal.tmuxBinding (nil for .ghostty, populated for .tmux)
  - Backwards-compat decoder mirrors the existing isManaged migration.

Tests cover:
  - Round-trip with a .tmux row + binding
  - Decode of a v1 row missing both new fields → defaults to .ghostty
  - Explicit "ghostty" string decodes
  - .tmux row with no binding decodes (permissive — runtime construction
    sites enforce the consistency invariant; the model just persists)

All 310 tests across 9 packages pass against this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rces

Production-ready ports of the shell wrapper and tmux config validated by
the spike (Phases 1, 2a, 2b). Both files are now real on-disk artifacts —
syntax-highlightable, lintable, easy to iterate on — instead of Swift
string literals.

Wrapper changes vs. spike:
  - CROW_SENTINEL is mandatory (no Swift-baked default) — the production
    code always sets it explicitly, removing one source of confusion.
  - Wrapper exits 64 with a clear message if CROW_SENTINEL is missing.
  - Logic is otherwise identical: source user's shell config, install
    composable precmd/PROMPT_COMMAND hook emitting OSC 133;A + custom
    OSC 9;crow-ready + sentinel-file touch (DCS-tmux wrapped under TMUX).

tmux.conf changes vs. spike:
  - Adds explanatory comments tying each setting to its spike finding.
  - history-limit 5000 (up from default 2000) — Phase 3 §2 measured
    plenty of headroom on RSS.
  - escape-time 0 — interactive editors feel sluggish with the 500ms
    default.

BundledResources.swift exposes Bundle.module URLs for both files. Tests
cover: presence on disk, shebang sanity, allow-passthrough on, status off
— each invariant is load-bearing for the production wiring.

Adds CrowTerminal test target (didn't exist before).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Production-quality utilities for the tmux backend, ported from the
spike's PromptReadiness.swift / TmuxController.swift with cleanups:

SentinelWaiter:
  - Async-friendly (Task.sleep instead of Thread.sleep) so the readiness
    poll doesn't block whatever queue called it.
  - Replaces the hardcoded 5s sleep at TerminalManager.swift:113-126 for
    tmux-backed terminals. See docs/tmux-backend-spec.md §6.

TmuxController:
  - tmuxBinary is injected (no /opt/homebrew default — host detection
    happens in PROD #4 first-run check).
  - newWindow returns the new window's index (uses tmux's `-P -F` print
    flag), so the caller doesn't have to re-list windows after each call.
  - loadBufferFromStdin pipes the payload through stdin, sidestepping
    the ARG_MAX-derived `command too long` failure that hits send-keys -l
    on >10KB inputs (Phase 3 §3 finding).
  - listWindowIndices, hasSession, killWindow as primitives the
    TmuxBackend will compose.
  - TmuxError carries args/stdout/stderr so failure messages include
    enough context to debug from logs.

Tests:
  - SentinelWaiterTests: returns-immediately, returns-nil-on-timeout,
    mid-wait detection.
  - TmuxControllerTests: skipped via .enabled(if:) when no tmux binary
    is on the host (CI-friendly). When present, exercises session
    creation, window enumeration, load-buffer + paste-buffer round-trip,
    error surfacing, and version-string parsing.

13 tests, all green. CrowTerminal still builds clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The app-wide singleton that owns the embedded tmux server backing all
.tmux SessionTerminals. Lazily starts the server on first registerTerminal
call, owns the (eventually-shared) cockpit GhosttySurfaceView, and tracks
UUID → window-index bindings.

Public API mirrors what TerminalManager exposes for the Ghostty path so
callers can dispatch to either based on SessionTerminal.backend:

  configure(tmuxBinary:socketPath:)  — host injects discovered tmux path
  registerTerminal(...)              — new tmux window, returns binding
  adoptTerminal(...)                 — rebind to a window we already have
                                       (live-server reattach on app launch)
  makeActive(id:)                    — tmux select-window on tab switch
  sendText(id:text:)                 — load-buffer + paste-buffer (PROD #3)
  destroyTerminal(id:)               — tmux kill-window + cleanup
  cockpitSurface()                   — lazy single GhosttySurfaceView
                                       attached to `tmux attach-session`
  shutdown()                         — kill-server, used by PROD #5
                                       watchdog

Readiness:
  - Each registerTerminal sets CROW_SENTINEL via tmux new-window -e and
    spawns the bundled crow-shell-wrapper.sh as the window's child.
  - Sentinel poll runs as a Task; reports .shellReady via the same
    onReadinessChanged callback the Ghostty path uses, so downstream
    consumers (ClaudeLauncher) don't need to special-case the backend.

Initial command (e.g. `claude --continue`) is delivered via the same
buffer-paste path the public sendText uses — so the launch flow gets
the load-buffer-not-send-keys benefit (PROD #3) automatically.

CrowTerminal now declares CrowCore as an explicit dependency for the
TerminalReadiness / TmuxBinding types it composes with.

Tests: 6 TmuxBackend suite tests covering the tmux side end-to-end:
window creation, distinct indices for multiple registers, send-text
round-trip, destroy, adopt-on-missing-window error path, makeActive on
unregistered throws. All 19 CrowTerminal tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…inal

End-to-end dispatch for new SessionTerminals when CROW_TMUX_BACKEND=1
is set in the env. Existing call sites continue to use the Ghostty path
unchanged; the flag flip is a single point of decision at terminal-create
time.

New helpers:
  - FeatureFlags.tmuxBackend           — env-var check (default off)
  - TmuxDiscovery.discover()           — locate tmux on host, require ≥3.3
  - TmuxDiscovery.meetsMinimumVersion  — parse "tmux X.Y" with strict prefix
  - TerminalRouter.send/destroy        — dispatch by SessionTerminal.backend

App launch (AppDelegate.launchMainApp):
  - After GhosttyApp.shared.initialize(), if FeatureFlags.tmuxBackend AND
    a usable tmux ≥ 3.3 is found, call TmuxBackend.shared.configure().
    Per-app socket in $TMPDIR with PID suffix; v1 doesn't keep the server
    alive across launches (spec §12 calls that out as a future toggle).

new-terminal RPC:
  - Manager terminal stays on Ghostty (special case, special-purpose).
  - Other new terminals: when flag is on AND TmuxBackend is configured,
    set backend=.tmux, call registerTerminal, persist the binding on the
    SessionTerminal row. On registerTerminal failure, fall back to
    Ghostty cleanly so a tmux outage doesn't break new-session creation.

send RPC:
  - Routes through TerminalRouter.send(terminal, text:), which uses
    load-buffer + paste-buffer for tmux-backed terminals (PROD #3) and
    ghostty_surface_text for ghostty-backed.
  - Recovery branch (pre-init surface if missing) only runs for ghostty;
    tmux windows are created synchronously by registerTerminal.
  - Falls back to TerminalManager.shared.send for unknown terminal IDs
    (legacy compat for any caller that hasn't updated).

close-terminal RPC:
  - TerminalRouter.destroy dispatches to TerminalManager.destroy or
    TmuxBackend.destroyTerminal as appropriate.

TerminalSurfaceView:
  - Optional `backend: TerminalBackend = .ghostty` parameter, default
    keeps every existing call site working.
  - For .tmux terminals, fetches the shared cockpit surface from
    TmuxBackend instead of a per-id surface, and fires makeActive(id:)
    on tab switch — the "tab switch is just a tmux select-window" model
    from spec §5.

SessionService.wireTerminalReadiness:
  - Adds TmuxBackend.shared.onReadinessChanged subscriber alongside the
    existing TerminalManager.onStateChanged. Both feed the same
    TerminalReadiness state machine, so launchClaude works regardless of
    backend without backend-specific branches.

Tests:
  - FeatureFlagsTests: locks the canonical "true" spellings (1/true/yes/on)
    and confirms default-off when env var is unset.
  - TmuxDiscoveryTests: version parser accepts ≥3.3 (incl. 3.6a, 3.4-rc1,
    whitespace variants), rejects too-old, rejects wrong prefixes.

336 tests across 10 packages pass. Crow target builds clean.

Out of scope for this commit (follow-up work in this same PR):
  - ClaudeLauncher's direct TerminalManager.send calls (still ghostty-only)
  - Manager terminal on the tmux backend
  - Hydrate-state path for .tmux terminals on app restart
  - The preInitialize call sites for review / orphan / VS Code / clone flows

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…raction

The `chmod 644 *.o 2>/dev/null` and `ar r ... 2>/dev/null` swallowed real
errors during the per-library extraction loop. Some objects in Zig's
cache extract as read-only on certain hosts; the silent chmod left
permissions wrong, and the subsequent `ar r` then silently failed to
add those objects to the fat library. The result: an under-linked
libghostty-fat.a missing symbols like _sentry_uuid_*, hwy::Warn, and the
ImGui C++ constructors — only surfacing at consumer link time with
"Undefined symbols" errors that point nowhere helpful.

Caught this while building the tmux backend's Demo target during the
#198 spike (had to manually re-extract every dep library). Same drop
of `2>/dev/null` would have made the original failure obvious.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §10.1 mitigation for the new tmux-as-SPOF failure mode: any tmux
subcommand from Swift that exceeds 2 seconds is SIGTERMed and surfaces
TmuxError.timedOut. Callers get control back instead of the UI freezing
waiting on a wedged tmux server.

Implementation:
  - DispatchSourceTimer scheduled at process spawn fires p.terminate()
    if the process is still running at the deadline.
  - TimeoutFlag (NSLock-guarded box) records whether the timer fired so
    we can distinguish a normal non-zero exit from a watchdog kill.
  - Default timeout is 2.0s — well above the spike's measured p95 (~74ms).
  - run() takes an optional timeout: parameter for cases that need a
    different budget (e.g. attaching to a session may take longer).

Adds TmuxError.timedOut(args:after:) case with a clean description.

Test:
  - timeoutSurfacesError verifies a normal-fast command (display-message)
    completes well within a tight 1s budget — locks in that the timeout
    plumbing doesn't accidentally kill normal operations.
  - The timer-fires path is exercised by the underlying
    DispatchSourceTimer + Process.terminate() machinery, both
    well-trodden. A stub-binary integration test for the kill-on-timeout
    path is filed as future work; not gating since the primitive's whole
    point is to bound the wait, not to verify the kill-precision.

The watchdog callback that offers the user "Restart tmux server" via
NSAlert (spec §10.1) is deferred to a follow-up commit. The current
change is the load-bearing one: bounding the wait prevents UI freeze;
the UI alert is polish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §14 PROD #6: track time-to-first-prompt, tab-switch latency, server
crash count. Crow doesn't have an internal metrics framework yet
(CrowTelemetry is OTLP-receiver-only for Claude Code analytics), so this
lands as structured NSLog statements with a consistent prefix that's
easy to grep / parse / re-route to a real pipeline later.

Three event types emitted from TmuxBackend:

  [CrowTelemetry tmux:first_prompt_ms=<n> terminal=<uuid>]
    Fired by SentinelWaiter when the bundled wrapper writes its sentinel
    file on first precmd. Replaces the time-to-shell-ready signal that
    used to be approximated by the 5s sleep at TerminalManager.swift:113.

  [CrowTelemetry tmux:first_prompt_timeout terminal=<uuid> budget_ms=5000]
    Fired when the readiness wait exceeds its 5s budget. Paired with
    the success counter to compute timeout rate per session.

  [CrowTelemetry tmux:tab_switch_ms=<n> terminal=<uuid>]
    Fired by makeActive on each select-window. Spike Phase 2a measured
    74ms p95; production should track this via real users.

  [CrowTelemetry tmux:server_shutdown bindings=<n>]
    Fired by shutdown(). Combined with start logs, gives session-length
    + cleanly-closed-vs-crashed signals. (PROD #5 watchdog will add a
    `tmux:server_unhealthy` event when the timeout-restart UI fires.)

Same shape as Crow's existing `[Crow]` log prefix — single-line, KEY=val
pairs, suitable for `awk '/CrowTelemetry/'` or shipping to OSLog. No
behavior changes; pure instrumentation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §11 / PROD #4: when CROW_TMUX_BACKEND=1 is set but TmuxDiscovery
finds no tmux ≥ 3.3 on the host, surface a native NSAlert explaining
the situation and offering three actions:

  - "Copy `brew install tmux`" — puts the command on the clipboard so
    the user can paste it into their terminal.
  - "Open tmux install guide" — launches the upstream installation
    wiki in their default browser.
  - "Continue" — dismisses the sheet; Crow falls through to the
    standard Ghostty-per-session backend.

The alert intentionally doesn't block app launch — Crow still works
without tmux; the user just doesn't get the backend they asked for.
This matches the spec's "fallback is safe" property: a tmux outage
or missing-tool should never break the core app.

The alert text mentions explicitly that Crow won't modify the user's
dotfiles, addressing a real concern raised in the original ticket
discussion ("does this change my zsh config?"). The wrapper sources
their config rather than replacing it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-A item 1 of the deferred work in this PR. SessionService.launchClaude
fires when a managed work-session terminal hits .shellReady — that signal
now comes from both backends (Ghostty 5s sleep AND tmux SentinelWaiter,
wired in the prior commit). The launch's claude-command write was still
going through TerminalManager.shared.send, which has no surface registered
for tmux-backed terminals.

Look up the SessionTerminal up front, pull the launch text out into a
local so the if/else branches are about routing-vs-text-construction
(they were nested before), and dispatch through TerminalRouter when the
row is known. Falls back to the direct TerminalManager call for unknown
ids — same defensive pattern the `crow send` RPC uses.

For .tmux terminals this means the launch text travels via load-buffer +
paste-buffer to the right tmux window, instead of disappearing into a
non-existent Ghostty surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-A item 5a: the call sites that create / destroy terminals via
SessionService (addTerminal, addGlobalTerminal, recoverSession,
createReviewSession, closeTerminal, closeGlobalTerminal) all bypassed
TerminalRouter and called TerminalManager.shared.* directly. Result:
even with CROW_TMUX_BACKEND=1, only the new-terminal RPC path actually
honored the flag — UI-driven "+", auto-review, and orphan-recovery
flows still landed on Ghostty.

Adds private SessionService.prepareTerminal(_, trackReadiness:) helper
that mirrors the dispatch logic from AppDelegate's new-terminal RPC:
decide useTmux based on the flag + configured backend + non-Manager
session, register a tmux window or pre-init a Ghostty surface, set
backend/tmuxBinding on the row, and return it for persistence.
Manager terminal stays force-pinned to Ghostty (Tier-C deferred).

closeTerminal and closeGlobalTerminal route destruction through
TerminalRouter.destroy(terminal). closeGlobalTerminal falls back to
TerminalManager.shared.destroy when the row isn't found in AppState
(legacy compat — same defensive pattern as the send RPC).

Net effect: with the flag on, every Crow surface other than the Manager
terminal lands on the tmux backend. Without the flag, behavior is
unchanged. 337 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-A item 3: spec §10.1 mitigation for the new tmux-as-SPOF failure
mode. The 2s watchdog in TmuxController.run already prevents UI freezes
by SIGTERMing wedged tmux subprocesses, but until now those timeouts
were silent — the user had no signal that anything was wrong, and the
server stayed alive in its hung state.

Two changes:

TmuxBackend gains `onUnresponsive: ((TmuxError) -> Void)?`. The public
methods that hit the tmux server (registerTerminal / makeActive /
sendText) catch errors, forward .timedOut to the callback, and re-throw.
Other errors pass through silently — the callback is specifically the
"server hung" signal, not a general error sink.

AppDelegate wires it after TmuxBackend.configure to a native NSAlert
with two actions:
  - "Restart tmux server" → TmuxBackend.shared.shutdown(), which kills
    the server. The next backend call respawns a fresh server from
    scratch via ensureRunningServer + the persisted SessionTerminal
    rows. Logs `[CrowTelemetry tmux:server_restart_by_user]` for the
    operator-greppable counter.
  - "Continue without restart" → dismiss the alert and let the user
    decide what to do next.

A `tmuxUnresponsiveAlertShowing` guard suppresses concurrent alerts so
a burst of timeouts (the most likely shape of a real hang — every
makeActive on tab switch fails together) only shows one alert until
dismissed.

The watchdog log path also fires `[CrowTelemetry tmux:server_unresponsive]`
when reportIfTimeout fires, paired with the existing
`tmux:server_shutdown` event so dashboards can compute hang vs.
clean-quit ratios.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-B item 2 of the deferred work in this PR. SessionService.hydrateState
previously called TerminalManager.shared.preInitialize for every persisted
SessionTerminal — wrong for .tmux rows: it would create a leaked Ghostty
surface for an id whose actual home is a tmux window that doesn't exist
yet (the v1 rollout kills the server on app exit, spec §12).

New per-terminal helper rehydrateTerminalSurface dispatches by backend:

  .ghostty: existing path (trackReadiness if managed + preInitialize).

  .tmux:    v1 doesn't keep the server alive across launches, so we
            re-register a fresh window via TmuxBackend.registerTerminal
            and update the row's tmuxBinding.windowIndex with the new
            index returned by tmux. Future work (server-survives-launch)
            would call adoptTerminal here instead.

Failure cases — TmuxBackend not configured (flag off / tmux gone since
last launch), or registerTerminal throws — silently fall back to the
Ghostty path. Matches the user's stated preference: a tmux outage
shouldn't block the user from using the app, just degrade them to the
historical experience.

Persistence: any row whose backend or tmuxBinding changed during
re-hydration (i.e. fell back to .ghostty) is written back to the store
in a single batched mutate, so a subsequent crash doesn't keep retrying
a doomed re-register on every launch.

The trackReadiness call previously made in the hydrate loop is now
folded into rehydrateTerminalSurface — both backends get the right
readiness wiring as a side effect of re-creating the surface/window.

Out of scope (future work):
  - Adoption of an already-running tmux server (requires keeping the
    server alive across launches; spec §12 toggle).
  - Watchdog wiring for hydrate-time timeouts (currently logs only).

337 tests across 10 packages still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@dgershman dgershman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code & Security Review

Critical Issues

1. Missing tmux shutdown on app exit — orphaned server processes
Sources/Crow/App/AppDelegate.swiftapplicationWillTerminate does not call TmuxBackend.shared.shutdown(). The PR description explicitly states "v1 still kills the server on app exit (spec §12)" and the shutdown() method exists and works (emits telemetry, kills the server, cleans sentinels), but it's never invoked on quit. This means tmux server processes will survive app termination as orphans, accumulating one per app launch (since each launch gets its own PID-scoped socket at $TMPDIR/crow-tmux-<pid>.sock).

Fix: add TmuxBackend.shared.shutdown() in applicationWillTerminate, before GhosttyApp.shared.shutdown().

2. deleteSession bypasses TerminalRouter — tmux windows leak on session delete
Sources/Crow/App/SessionService.swift:425-428:

for terminal in terminals {
    TerminalManager.shared.destroy(id: terminal.id)
}

This calls the Ghostty-only destroy path. For .tmux terminals, the tmux window is never killed — it persists in the server until app exit. The rest of the codebase correctly uses TerminalRouter.destroy(terminal) (see closeTerminal, closeGlobalTerminal, and the RPC close-terminal handler), but deleteSession was missed.

Fix: change to TerminalRouter.destroy(terminal).

Security Review

Strengths:

  • Process spawning uses Process.arguments arrays throughout (no shell string interpolation), eliminating command injection vectors in TmuxController.
  • shellQuote() in TmuxBackend uses proper POSIX single-quote escaping for the one place a shell string is constructed (cockpitSurface()'s attach command).
  • Feature flag is read-only from environment — no runtime mutation surface.
  • Sentinel files use internally-generated UUIDs for paths, not user-controlled input.
  • crow-tmux.conf correctly runs unbind-key -a to strip all default tmux bindings, preventing users from accidentally reaching tmux's command mode through the embedded surface.
  • crow-shell-wrapper.sh validates CROW_SENTINEL is set before proceeding and uses add-zsh-hook for composable hook installation rather than overwriting precmd.
  • Path traversal protection already exists for RPC commands (isPathWithinDevRoot), and the new tmux paths don't introduce new external-input-to-path flows.

Concerns:

  • None blocking. The tmux binary path is resolved from a fixed allowlist (TmuxDiscovery.searchPaths), not from $PATH, which is correct.

Code Quality

loadBufferFromStdin has no watchdog timeout (Packages/CrowTerminal/Sources/CrowTerminal/TmuxController.swift:155-179):
Unlike run() which has a 2s watchdog that SIGTERMs hung processes, loadBufferFromStdin calls p.waitUntilExit() with no timeout. If tmux load-buffer hangs (e.g., the server is wedged), this blocks the calling thread indefinitely. Since sendText is called from @MainActor context, this could freeze the UI. Should use the same DispatchSource.makeTimerSource watchdog pattern, or factor the timeout logic into a shared helper.

UUID prefix collision in buffer names and sentinel paths (TmuxBackend.swift:229, 301):
Both sendText and sentinelPath use id.uuidString.prefix(8) (32 bits of entropy). Two terminals sharing the same 8-char prefix would overwrite each other's paste buffers or sentinel files. With typical terminal counts (<100), collision probability is negligible, but using .prefix(12) or the full UUID is free and eliminates it entirely.

Test coverage is solid: backward-compat decoder tests, integration tests gated behind tmux availability (.enabled(if: discoveredTmuxBinary != nil)), sentinel waiter tests with mid-wait appearance, and version parsing edge cases are all well-structured.

Architecture is clean: TerminalRouter as a dispatch hub, TmuxBackend as a singleton matching TerminalManager's @MainActor constraint, and SentinelWaiter as a pure async value type are all good abstractions.

Summary Table

Priority Issue
🔴 Red applicationWillTerminate doesn't call TmuxBackend.shared.shutdown() — tmux servers orphan on every quit
🔴 Red deleteSession calls TerminalManager.shared.destroy(id:) instead of TerminalRouter.destroy(terminal) — tmux windows leak
🟡 Yellow loadBufferFromStdin lacks the watchdog timeout that run() has — could freeze UI on hung server
🟡 Yellow prefix(8) UUID truncation in buffer/sentinel names — avoidable collision risk
🟢 Green Temp ZDOTDIR dirs from shell wrapper accumulate until reboot (acceptable — uses $TMPDIR)

Recommendation: Request Changes — the two red items are correctness bugs that violate the PR's own stated invariants (spec §12 server-kills-on-quit, and consistent TerminalRouter dispatch). Both are one-line fixes. The yellow items are worth addressing but aren't blockers.

dhilgaertner and others added 2 commits April 30, 2026 23:07
Resolves the four findings (two reds, two yellows) from @dgershman's
code review.

Wire TmuxBackend.shared.shutdown() into applicationWillTerminate before
GhosttyApp.shared.shutdown() so the per-launch tmux server is reaped on
quit, matching the PR description's "v1 kills the server on app exit
(spec §12)" intent. Without this, each launch left an orphaned tmux
process bound to $TMPDIR/crow-tmux-<pid>.sock — accumulating one per
launch and never reusable since the socket path is PID-scoped.

Switch SessionService.deleteSession from TerminalManager.shared.destroy(id:)
to TerminalRouter.destroy(terminal). The Ghostty-only direct call left
.tmux windows alive in the cockpit session whenever a session containing
them was deleted; the router dispatches per backend so kill-window runs.

Extract a ProcessWatchdog helper in TmuxController and apply it to
loadBufferFromStdin, which previously had no timeout and could freeze
the UI thread (called from @mainactor) on a wedged server. Both run()
and loadBufferFromStdin now share the same SIGTERM-on-timeout pattern;
loadBufferFromStdin gains a timeout: parameter (defaults to
defaultTimeout) and converts a write-side EPIPE into TmuxError.timedOut
when the watchdog had already fired before the stdin pipe closed.

Drop the .prefix(8) UUID truncation in TmuxBackend's buffer names and
sentinel paths. Full UUID eliminates the (small but free-to-fix)
collision risk between concurrent terminals that happen to share an
8-char prefix.

Out of scope (separate ticket): tmux server survival across app
restarts (spec §12 future toggle; requires keepServerAlive setting
plus adoptTerminal plumbing in SessionService.rehydrateTerminalSurface).

All package tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dhilgaertner dhilgaertner requested a review from dgershman May 1, 2026 04:10
@dhilgaertner dhilgaertner self-assigned this May 1, 2026
Copy link
Copy Markdown
Collaborator

@dgershman dgershman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review: All findings addressed

All four findings from the initial review have been resolved in 93ca9fc:

Finding Status
🔴 Missing TmuxBackend.shared.shutdown() in applicationWillTerminate Fixed — added before GhosttyApp.shared.shutdown()
🔴 deleteSession bypassing TerminalRouter Fixed — routes through TerminalRouter.destroy(terminal)
🟡 loadBufferFromStdin missing watchdog timeout Fixed — extracted ProcessWatchdog helper shared by both run() and loadBufferFromStdin, with EPIPE-after-watchdog handling
🟡 prefix(8) UUID truncation Fixed — uses full UUID for both buffer names and sentinel paths

The ProcessWatchdog refactor is a nice improvement over the inline TimeoutFlag + DispatchSource pattern — DRYs up the timeout logic and handles the stdin EPIPE edge case where the watchdog fires mid-write.

LGTM.

dhilgaertner and others added 8 commits May 1, 2026 14:47
… cockpit

TerminalSurfaceView's `backend:` parameter defaults to `.ghostty`, and none
of the four UI call sites (SessionDetailView's manager / non-managed /
ReadinessAwareTerminal cases, plus GlobalTerminalView) passed the
SessionTerminal row's backend value through. Result: even when the
persisted row was `.tmux` and TmuxBackend.registerTerminal had created
the cockpit window correctly, the visible tab routed through the
`.ghostty` branch of surfaceForBackend(), spawned a brand-new login
shell via TerminalManager.shared.surface, and the cockpit Ghostty
surface was never created or attached.

User-visible symptom (caught by manual smoke testing of #229): create
a session via /crow-workspace with the flag on; claude actually runs
in tmux window 1 (verifiable via `tmux -S … list-windows`), and hook
events post permission badges to the sidebar — but the Claude Code
tab shows a fresh "Last login: …" zsh prompt with no claude in sight,
because the visible surface is a leaked per-terminal Ghostty.

Diagnosis trail in the user's log:

  [TerminalManager] surface(for: D6F9E577-…) — creating NEW view
  [SessionService] onStateChanged: terminal=D6F9E577, state=created
  [CrowTelemetry tmux:first_prompt_timeout terminal=D6F9E577]

The tmux backend correctly waited on the sentinel, the wrapper
correctly fired the precmd in the tmux window, but the visible
Ghostty surface was a different process tree entirely — so the
sentinel timed out at the 5s budget and the user saw a wrong
surface. No `[TerminalSurfaceView] tmux cockpitSurface failed:`
line ever appeared, confirming we weren't even falling into the
catch — the UI just chose `.ghostty` directly.

Fix: pass `backend: terminal.backend` at the four call sites that
have a real SessionTerminal row. The empty-state instantiation at
SessionDetailView.swift:228 has no row to read from and keeps the
default `.ghostty` (correct — there's nothing to render).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two tmux clients were ending up attached to the same crow-cockpit session
on every cold render, observable via:

  $ tmux -S "$SOCKET" list-clients
  /dev/ttys021: crow-cockpit (attached,focused)
  /dev/ttys022: crow-cockpit (attached,focused)

Two distinct GhosttySurfaceView instances were being created with the
attach-session command. The smoking gun in the user's NSLog tee:

  14:49:04.860  [Ghostty] createSurface() succeeded, hasCallback=false
  14:49:04.969  [Ghostty] createSurface() succeeded, hasCallback=false

(TerminalManager surfaces always set onSurfaceCreated, so hasCallback=false
is the cockpit's signature.) 109ms apart, both spawned by Crow's process.

Two contributing causes, both fixed:

1. cockpitSurface() did `addSubview(view)` BEFORE `sharedSurface = view`.
   addSubview synchronously triggers viewDidMoveToWindow → createSurface
   → ghostty_surface_new, which can pump the main runloop briefly while
   libghostty's renderer registers. A re-entrant cockpitSurface() call
   during that window saw `sharedSurface == nil`, fell through the
   `if let existing` short-circuit, and spawned a second view. Reorder:
   cache `sharedSurface = view` before addSubview, so any nested call
   sees the cached view and short-circuits.

2. TerminalSurfaceView.existingSurfaceForBackend() called cockpitSurface()
   for the .tmux case — which CREATES if not existing, despite the
   "existing" name. updateNSView (which calls existingSurfaceForBackend)
   could race with makeNSView (which calls cockpitSurface directly).
   Add a side-effect-free `existingCockpitSurface` accessor on TmuxBackend
   that returns the cached value or nil, and route existingSurfaceForBackend
   through it. updateNSView now correctly no-ops when the cockpit hasn't
   been created yet, instead of triggering creation as a side effect.

Result: one cockpit surface per app launch, one tmux client per cockpit,
matches the singleton design intent. Doesn't change behavior for callers
that genuinely want lazy creation (makeNSView still uses cockpitSurface
directly).

20 CrowTerminal + 38 root tests still pass. Functional verification
deferred to next manual smoke test (tmux list-clients should show
exactly one row after navigating multiple .tmux session tabs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reproduced on app restart with 8 managed terminals (6 .ghostty + 2 .tmux):
the .tmux terminals' "Waiting for terminal..." spinner never cleared,
even though their wrappers had clearly fired (sentinel files present
on disk with the expected mtime).

Two layered problems in startReadinessWatch:

1. The 5-second SentinelWaiter budget is fine for a cold-creation of a
   single terminal but too tight for hydrate-time, where N managed
   terminals all start their shells simultaneously and contend for CPU.
   Heavy zshrc setups (oh-my-zsh, asdf, nvm, starship) routinely push
   first-prompt latency past 5s under that load.

2. When the budget did blow, the timeout branch logged a telemetry line
   and DID NOTHING ELSE. The comment said "caller can decide via their
   own watchdog" — but no caller has one. The readiness state stays
   .uninitialized → ReadinessAwareTerminal's overlay shows the spinner
   forever → launchClaude is never auto-fired → user is permanently
   stuck on a fresh-but-empty pane.

Fix:

  - Bump the budget to 30s (covers contended hydrate scenarios).
  - On a genuine 30s timeout, advance to .shellReady ANYWAY. Three
    rationales:
      * The shell is almost certainly alive but slow; the paste-buffered
        `claude --continue` queues at the tmux pane and runs when the
        shell finally drains its tty.
      * If the shell IS dead (rare — would require the wrapper to crash
        in startup), the pane stays empty, which is the truthful state
        and lets the user see what's happening instead of a misleading
        spinner.
      * "Spinner forever" is strictly worse than "advance and let the
        downstream paste be best-effort" for any realistic failure mode.
    The telemetry line still records the timeout for operator visibility.

20 CrowTerminal + 38 root tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…oject

User reported opening crow-230's tab and seeing a different project's
claude session — specifically the PR #229 conversation rather than
crow-230's own /plan transcript. Investigation showed:

  - Cockpit was correctly attached to crow-230's tmux window (window 2,
    confirmed via tmux list-windows showing the * marker on window 2).
  - Window 2's pane content was claude, but its prompt header read
    "PR #229" — not the crow-230 ticket.

Root cause: TmuxBackend.registerTerminal was setting PWD env on the new
tmux window but not actually setting the start-directory. tmux's
new-window inherits the SERVER's cwd unless given -c <dir>. Crow's tmux
server is started from the Crow process's launch directory (typically
the crow-tmux-backend worktree). So every .tmux window's shell came up
in /Users/.../crow-tmux-backend/ regardless of which session/worktree it
was supposed to live in.

When the auto-launched `claude --continue` ran in that wrong cwd, claude
matched it against ~/.claude/projects/-Users-…-crow-tmux-backend/ —
which is THIS PR #229 conversation's project — and resumed it instead of
the crow-230 transcript that lives under -Users-…-crow-230-quick-action-
buttons/. Setting PWD env without actually cd'ing the shell does not
help: PWD is just an env var, the shell process is still spawned in
tmux's cwd.

Fix:

  - Add `cwd:` to TmuxController.newWindow. When set, append `-c <cwd>`
    to the args. tmux honors -c as the start-directory for the new
    window's spawned process.
  - Pass terminal.cwd from TmuxBackend.registerTerminal. Falls through
    cleanly when the caller didn't set a cwd (rare — only the cockpit
    anchor doesn't, and that's intentional).

Result: each .tmux terminal's shell starts in its own worktree, so
`claude --continue` resumes the right project transcript.

20 CrowTerminal + 38 root tests still pass. Existing newWindow tests
keep working because the new parameter is optional.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surface CROW_TMUX_BACKEND as a user-facing preference instead of an
env-var-only flag. Most users will never relaunch Crow from a shell
with an env prefix, so the rollout currently has zero discoverability.

Wiring (5 files, ~120 lines net):

  Packages/CrowCore/.../AppConfig.swift
    Add experimentalTmuxBackend: Bool = false. Same back-compat decoder
    pattern as every other field in this struct (decodeIfPresent ?? false),
    so existing config.json files continue to load unchanged.

  Sources/Crow/App/FeatureFlags.swift
    Add a static `tmuxBackendConfigOverride: Bool` (nonisolated(unsafe)
    since the value is set once at launch on the main thread, then read-
    only). FeatureFlags.tmuxBackend now returns
      envFlag OR tmuxBackendConfigOverride
    so the env var keeps working as a CI/dev override AND the new UI
    toggle works for end users. No changes needed at the three existing
    read sites (AppDelegate's launch + new-terminal RPC, SessionService's
    addTerminal) — they all benefit transparently.

  Sources/Crow/App/AppDelegate.swift
    Reorder launchMainApp to load AppConfig BEFORE the tmux configure
    block, then set FeatureFlags.tmuxBackendConfigOverride from the
    loaded config.experimentalTmuxBackend. Ordering is load-bearing:
    the configure block reads FeatureFlags.tmuxBackend which now ORs in
    the override.

  Packages/CrowUI/.../ExperimentalSettingsView.swift  [NEW]
    Mirrors AutomationSettingsView's shape: Form { Section { Toggle } }
    with a single binding + onSave closure. Section header reads
    "tmux backend" with caption "Experimental — see #198 for context".
    Body caption explains the trade (faster session switching, less
    memory, requires tmux ≥ 3.3) and notes "Takes effect on next app
    launch" — same pattern as the existing Manager Terminal section.

  Packages/CrowUI/.../SettingsView.swift
    One TabView entry appended after Notifications, using the `flask`
    SF Symbol. Position last per macOS convention for advanced /
    experimental settings.

Tests:
  - AppConfigTests: round-trip test for experimentalTmuxBackend +
    default-false assertion in the empty-JSON decode test.
  - FeatureFlagsTests: tmuxBackendOnWhenConfigOverrideOn verifies the
    OR-merge with the config override, plus state-leak guard in the
    existing tmuxBackendDefaultIsOff test.

CrowCore 141 / root 39 / CrowUI 22 tests still pass.

Out of scope (deliberate):
  - Auto-relaunch button in the Experimental tab. User can ⌘Q + relaunch
    from their dev shell during testing, which is what they need anyway
    to keep the tee'd log going.
  - Removing FeatureFlags entirely. Env var is still useful for CI and
    for dev iteration without touching ~/Library/Application Support/crow/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-facing affordance for the tmux-backend rollout: at a glance you
can tell which sessions are running through the cockpit vs which fell
back to per-terminal Ghostty.

A session shows the badge iff any of its SessionTerminal rows has
backend == .tmux. Sessions whose managed terminal fell back to .ghostty
(via rehydrateAsGhosttyFallback or the new-terminal RPC's catch path)
have ALL terminals as .ghostty and so get no badge — matches the user's
explicit ask: "If it falls back the ghosty, it should not have a T."

Visual: monospaced bold "T" at 9pt, bottom-trailing of the row, in a
subtle rounded background using the same palette as existing badges
(textSecondary fg, bgDeep@70% fill, borderSubtle stroke). Hover tooltip
reads "This session's terminals are backed by tmux".

Reactive — the `isTmuxBacked` computed prop reads `appState.terminals`
which is @observable, so the badge appears/disappears the instant a
terminal's backend changes (e.g., a .tmux terminal getting
destroy()ed leaves the session with only .ghostty rows → badge
disappears next render).

No new state, no schema changes, no test updates.
22 CrowUI tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three related polish items from continued dogfood of the tmux backend:

== 1. Orphan tmux server reaper ==

User-visible problem: dev iteration that involves Force Quit (or
pkill) leaves a tmux server orphaned per cycle, accumulating ~10-20
MB of RSS each. Symptom (after several iterations):

  $ ls $TMPDIR/crow-tmux-*.sock
  crow-tmux-{20084,50095,60538,67250,78087,…}.sock

  $ pgrep -afl 'tmux.*crow-tmux'
  (5 live tmux servers, only 1 of which belongs to the running Crow)

Root cause: `applicationWillTerminate` (which runs PR #229's shutdown
fix) only fires on graceful exits. SIGKILL, Force Quit, and crash
bypass it. Past Crow's tmux server lives on as an orphan because the
socket is PID-scoped and the OS has no way to reap it.

Fix: new `TmuxOrphanReaper` enum in CrowTerminal. At launch it
enumerates `$TMPDIR/crow-tmux-*.sock`, parses the PID from each
filename, and reaps any whose PID is NOT a live `CrowApp` process
(checked via /bin/ps). Skips the current process's own socket and
skips peer-Crow sockets so multiple concurrent Crow instances don't
reap each other.

Wired into AppDelegate.launchMainApp: runs whenever tmux is
discoverable, regardless of whether THIS launch will use the tmux
backend (orphans accumulate independent of current flag state).
Also added `nonisolated` to `TmuxBackend.cockpitSessionName` so the
reaper can read the constant without a MainActor hop (it's an
immutable string literal — safe).

Telemetry: each reaped socket logs
`[CrowTelemetry tmux:orphan_reaped pid=… socket=…]` plus a summary
line. Idempotent — costs ~50ms when there's nothing to do.

== 2. Settings window: wider + app-modal ==

User reported the Settings window's tabs were collapsing into an
overflow ("…") menu because 5 tabs (General / Automation / Workspaces /
Notifications / Experimental) didn't fit at the previous 520pt width.
Bumped the SettingsView frame and the NSWindow contentRect to 720pt
wide; height unchanged at 480.

Also: Settings is now app-modal. `showSettings()` builds the window,
attaches a willClose observer that calls `NSApp.stopModal()`, then
runs `NSApp.runModal(for:)` to block the main app until the user
dismisses. This matches the "must dismiss to go back to the app"
UX the user explicitly asked for. Removed the now-unreachable
"existing window — bring forward" short-circuit.

== 3. Gear icon in sidebar toolbar ==

Surfaces Settings via a third toolbar button in SessionListView,
alongside the existing select-mode and mute-notifications buttons.
Wired through a new `AppState.onShowSettings` callback (mirroring the
existing onSoundMutedChanged / onAddTerminal pattern); AppDelegate
hooks it up to its own `showSettings()` method during launchMainApp's
closure-wiring block.

== Tests ==

CrowCore 141 / CrowTerminal 20 / root 39 / CrowUI 22 — all pass.
No new test cases for the reaper itself; its happy paths are exercised
by end-to-end behavior (no orphans accumulate after iteration), and the
ps-shellout / regex-parse logic doesn't have hidden-state bugs to
guard against.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…his run

Bug reported during dogfood: user launched Crow with the experimental
flag default-off (after the new Settings → Experimental tab landed but
before they explicitly enabled it). On hydrate, all their persisted
.tmux rows got auto-downgraded to .ghostty AND the change was written
back to the store. Toggling the flag back ON didn't restore them — the
rows were permanently demoted, no T badge, no tmux backing.

Root cause: rehydrateTerminalSurface's `guard !tmuxBinary.isEmpty`
branch can't tell "user toggled flag off" apart from "tmux uninstalled
between runs" — both look like an unconfigured backend. The original
fallback logic from 04fdd69 assumed a tmux-less backend means tmux is
gone forever, so it persisted .ghostty to avoid retrying every launch.
That assumption breaks the moment we surface the toggle in the UI.

Fix:

  - rehydrateTerminalSurface .tmux case: when the backend isn't
    configured, pre-initialize a Ghostty surface for the SessionTerminal
    (so the UI can render its tab) but RETURN THE ROW UNCHANGED. The
    persisted backend stays .tmux. Next launch with the flag on, the
    row hydrates as tmux normally — T badge restored, cockpit window
    re-registered, claude --continue auto-launches as before.

  - The catch path (registerTerminal actually threw despite a configured
    backend) keeps the old "permanent downgrade" behavior. That case is
    a real failure that DOES warrant persisting .ghostty to avoid loops.

Supporting change in TmuxBackend.ensureRunningServer:

  - Replaced the `precondition(!tmuxBinary.isEmpty, ...)` /
    `precondition(!socketPath.isEmpty, ...)` pair with a thrown
    `TmuxBackendError.notConfigured`. This is necessary because the new
    rehydrate behavior leaves rows as backend=.tmux even when the
    backend isn't configured. Without the throw, TerminalSurfaceView's
    surfaceForBackend → cockpitSurface → ensureRunningServer chain
    would crash on first render. With the throw, surfaceForBackend's
    existing catch falls back cleanly to TerminalManager.shared.surface
    (which finds the surface we pre-initialized in rehydrate).

Existing data note: rows that were already auto-downgraded to .ghostty
by the buggy code path can't be recovered from the store — the
tmuxBinding was nilled out. Affected users need to delete and recreate
those sessions to get the .tmux marking back. This commit only prevents
future cases.

20 CrowTerminal + 39 root tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dhilgaertner dhilgaertner merged commit 42dc67f into main May 1, 2026
2 checks passed
@dhilgaertner dhilgaertner deleted the feature/crow-tmux-backend branch May 1, 2026 21:36
dhilgaertner added a commit that referenced this pull request May 4, 2026
…xes (#238)

Closes #235

## Summary

- Adds `docs/automation.md` as the canonical guide to Settings →
Automation and the full auto-flow lifecycle.
- Updates `docs/architecture.md` with the dual Ghostty/tmux backend
(#229), the new `TerminalRouter` dispatch, the Settings tab split
(#228), and the Review Board surface (#188, #205, #207, #210, #212,
#220, #226, #231).
- Adds `crow rename-terminal` (#206) to `docs/cli-reference.md`.
- Adds troubleshooting rows for tmux backend missing, Ghostty surface
retry (#218), GitLab nested groups (#233), `GITLAB_HOST` silent skip
(#215), auto-respond not firing, and the silent no-op `hook-event`
behavior (#234).
- Adds `CROW_TMUX_BACKEND` and `CROW_HOOK_DEBUG` to the
`docs/configuration.md` env-var table.
- Backfills `CHANGELOG.md` `[Unreleased]` with every PR from #137
through #234, grouped by theme.
- Updates `README.md` features list and docs index for the new
automation suite, review board, terminal renaming, and tmux opt-in.
- Appends an "Implementation Status (2026-05)" footer to
`docs/terminal-runtime-research.md` noting #229 shipped the headless-PTY
backend recommended in the original research.

The audit checklist called out as deliverable #1 in the issue is posted
as a [comment on
#235](#235 (comment)).

## Test plan

- [ ] `git diff --stat main` shows only the listed `docs/`,
`CHANGELOG.md`, and `README.md` files
- [ ] Render each modified doc on GitHub and confirm anchors /
cross-links resolve
- [ ] Confirm `crow --help` matches the command list in
`docs/cli-reference.md` (only `rename-terminal` was missing pre-PR)
- [ ] Walk every PR number in the CHANGELOG against `git log
--since=2026-04-15 main --oneline` and confirm each one resolves
- [ ] Re-export `docs/crow-screenshot.jpeg` against the current
Settings/Review-Board UI — **deferred to a follow-up**, called out in
the audit comment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants