Harden terminal tab re-parenting against rapid switches#243
Harden terminal tab re-parenting against rapid switches#243dhilgaertner merged 2 commits intomainfrom
Conversation
Issue #240 reports an intermittent app crash when switching between a Claude Code tab and a shell tab with `CROW_TMUX_BACKEND=1` enabled. No stack trace was captured. In tmux mode every visible tab re-parents the same shared `GhosttySurfaceView` into its own SwiftUI container, and the old re-parenting path stacked deferred closures against that shared surface on each switch — the most plausible culprit for an intermittent crash on rapid tab toggling. This change tightens the path so the race surface is gone, and adds an uncaught NSException handler so the next reproduction (here or elsewhere) lands a usable stack trace in Console.app instead of an abort. `TerminalSurfaceView`: - Drop the explicit `removeFromSuperview()` before `addSubview` — `addSubview` re-parents atomically; the explicit call only added a spurious `viewDidMoveToWindow(nil)` and `set_focus(false)` per switch. - Drop the deferred `setFrameSize` calls in both `makeNSView` and `updateNSView`. Autolayout already drives `setFrameSize` via the edge constraints, so the manual call was redundant and racing against autolayout. - Replace the `DispatchQueue.main.asyncAfter(deadline: .now() + 0.05/0.1)` first-responder closures with a single `Task @MainActor` that guards on `nsView.window != nil && surface.superview === nsView` — so stale closures from prior switches no longer steal focus from the currently-visible tab. - Drop the duplicate `TmuxBackend.shared.makeActive(id:)` call from `makeNSView` — `updateNSView` fires immediately after make per Apple's NSViewRepresentable contract. - Extract `attach(surface:to:)` so make and update share one constraint setup and can't drift. `TerminalManager`: - `surface(for:)`, `destroy(id:)`, and `retry(id:)` now call `removeFromSuperview()` before `destroy()` so a stale view isn't left dangling in `offscreenWindow.contentView.subviews` after its underlying ghostty surface is freed. `AppDelegate`: - Install `NSSetUncaughtExceptionHandler` in `applicationDidFinishLaunching` to NSLog name + reason + `callStackSymbols` for any uncaught NSException. Mach exceptions (SIGSEGV) still go to the system crash reporter; this just covers the ObjC-exception gap that currently lets AppKit / libghostty wrappers abort silently. Closes #240 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dgershman
left a comment
There was a problem hiding this comment.
Code & Security Review
Critical Issues
None.
Security Review
Strengths:
- The uncaught-exception handler in
AppDelegateonly logs to NSLog (Console.app / unified log) — no external network calls, no user data leakage. - No new inputs, IPC surface, or permission changes — the attack surface is unchanged.
Concerns:
- None identified.
Code Quality
TerminalSurfaceView — well-structured simplification:
-
attach(surface:to:)extraction — DRYs up the 4-constraint block that was duplicated betweenmakeNSViewandupdateNSView. Thesurface.superview !== nsViewguard inupdateNSView:58prevents duplicate constraint stacking on repeated calls. Sound. -
Dropping explicit
removeFromSuperview()beforeaddSubview— correct per AppKit semantics:addSubviewatomically re-parents, so the explicitremoveFromSuperviewwas triggering a spuriousviewDidMoveToWindow(nil)→ghostty_surface_set_focus(false)round-trip on the shared tmux surface. This was likely the primary culprit for the intermittent crash on rapid tab toggling. -
Replacing
DispatchQueue.main.asyncAfterwithTask { @MainActor }— the old 50/100 ms closures could stack on rapid tab switches because each closure capturedsurfacestrongly and had no staleness guard. The newTaskversion uses[weak surface]and validatessurface.superview === nsView && surface.window === windowbefore acting, so stale closures from prior tab switches are correctly discarded. Good fix for the race. -
Dropping manual
setFrameSizecalls — autolayout edge-pinning already drives frame sizing. The manual call was redundant and racing with the layout pass. Correct removal. -
Removing duplicate
TmuxBackend.shared.makeActive(id:)frommakeNSView—updateNSViewfires immediately aftermakeNSViewper Apple'sNSViewRepresentablecontract, so the call inmakeNSViewwas always a double-fire.
TerminalManager — removeFromSuperview() before destroy():
Adding removeFromSuperview() in surface(for:) (nil-surface discard path), destroy(id:), and retry(id:) ensures stale views aren't left dangling in offscreenWindow.contentView.subviews after the underlying ghostty surface is freed. This is a necessary complement to the TerminalSurfaceView changes — without it, the offscreen window could hold a reference to a destroyed view.
Minor observations (non-blocking):
-
The
Task { @MainActor [weak surface] }inupdateNSViewfires on everyupdateNSViewcall, not just on re-parenting. This meansmakeFirstResponderis attempted even when nothing changed. It's harmless (the guards ensure correctness andmakeFirstResponderon the already-first-responder is a no-op), but you could move it inside theif surface.superview !== nsViewblock if you wanted to reduce no-op work. Low priority — leave as-is if you want focus recovery on non-re-parenting updates. -
The
NSSetUncaughtExceptionHandlerclosure doesn't callabort()orfatalError()after logging. This means the app continues running after the ObjC exception is logged. In practice, AppKit's default behavior re-raises the exception anyway, so the process still terminates — but if you ever want to guarantee termination after logging, you'd need an explicitabort(). Current behavior is fine for diagnostic purposes.
Summary Table
| Priority | Issue |
|---|---|
| Green | makeFirstResponder Task fires on every updateNSView, not just re-parenting — harmless, consider scoping if desired |
| Green | Exception handler doesn't explicitly terminate — acceptable for crash diagnostics |
Recommendation: Approve — the changes correctly eliminate the race condition in terminal tab re-parenting, the constraint-duplication guard is sound, and the exception handler adds useful crash diagnostics without introducing risk.
dgershman
left a comment
There was a problem hiding this comment.
Code & Security Review
Critical Issues
None found.
Security Review
Strengths:
- The uncaught exception handler in
AppDelegateonly logs to NSLog (Console.app / unified log), which is an appropriate destination — no sensitive data is sent externally. - No new external inputs, network calls, or untrusted data processing introduced.
removeFromSuperview()beforedestroy()inTerminalManageris a defensive pattern that prevents use-after-free scenarios with the underlying ghostty surface.
Concerns:
- None. The changes are purely internal AppKit view-lifecycle hardening.
Code Quality
Positive:
- Extracting
attach(surface:to:)eliminates constraint-setup duplication betweenmakeNSViewandupdateNSView— good DRY improvement. - Replacing
DispatchQueue.main.asyncAfter(50ms/100ms) withTask { @MainActor }guarded by staleness checks (surface.superview === nsView,surface.window === window) is a well-reasoned fix for the stale-closure stacking hypothesis. The guard conditions ensure that if the surface was re-parented between scheduling and execution, the task becomes a no-op. - Dropping the redundant
TmuxBackend.shared.makeActive(id:)frommakeNSView(sinceupdateNSViewfires immediately after per Apple's contract) removes a double-fire oftmux select-window. removeFromSuperview()beforedestroy()in all threeTerminalManagerpaths prevents a dangling subview reference in the offscreen window.
Minor observation (not blocking):
attach(surface:to:)is called fromupdateNSViewwhensurface.superview !== nsView, meaning the surface was in a different container. WhileaddSubviewatomically re-parents, the old container (now empty) retains orphaned constraints targeting its own anchors. These are harmless (deactivated when the surface leaves) but could be explicitly removed for hygiene. This is a nit — not actionable since autolayout handles it correctly.- The
Task { @MainActor }inupdateNSViewfires on everyupdateNSViewcall, not just re-parenting. In practice this means redundantmakeFirstRespondercalls when SwiftUI re-evaluates the view body without a tab switch. The guards prevent actual damage, but if profiling later shows excessive responder-chain churn, consider moving the Task inside theif surface.superview !== nsViewblock. Low priority.
Summary Table
| Priority | Issue |
|---|---|
| Green | Consider gating the first-responder Task on the re-parent condition to avoid redundant calls |
| Green | Orphaned constraints on old container are harmless but could be cleaned up |
Recommendation: Approve — the changes correctly eliminate the race surface identified in #240 (stale asyncAfter closures stacking on a shared surface during rapid tab switches), add proper cleanup ordering in TerminalManager, and install diagnostic logging for future crash investigations. The approach is sound and the net effect is a reduction in code complexity.
`tmuxBackendDefaultIsOff` and `tmuxBackendOnWhenConfigOverrideOn` both
mutate `FeatureFlags.tmuxBackendConfigOverride` (a `nonisolated(unsafe)
static var`). Swift Testing runs `@Test` functions in parallel by
default, so the two interleave and one test's `= false` reset can land
between the other's `= true` set and its read of `FeatureFlags.tmuxBackend`.
The flake hit this PR's CI run as a `tmuxBackendOnWhenConfigOverrideOn`
failure ("FeatureFlags.tmuxBackend → false"), and previously appeared
locally as `tmuxBackendDefaultIsOff`. Adding `.serialized` to the suite
forces those two writers to run sequentially while leaving the rest of
the test run parallel.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
CROW_TMUX_BACKEND=1enabled. No stack trace was captured.GhosttySurfaceViewinto its own SwiftUI container, and the old re-parenting path stacked deferred closures against that shared surface on each switch — the most plausible culprit for an intermittent crash on rapid tab toggling.TerminalSurfaceViewremoveFromSuperview()beforeaddSubview—addSubviewre-parents atomically; the explicit call only added a spuriousviewDidMoveToWindow(nil)andset_focus(false)per switch.setFrameSizecalls in bothmakeNSViewandupdateNSView. Autolayout already drivessetFrameSizevia the edge constraints, so the manual call was redundant and racing against autolayout.DispatchQueue.main.asyncAfterfirst-responder closures with a singleTask @MainActorguarded onnsView.window != nil && surface.superview === nsView— so stale closures from prior switches no longer steal focus from the currently-visible tab.TmuxBackend.shared.makeActive(id:)call frommakeNSView—updateNSViewfires immediately after make per Apple'sNSViewRepresentablecontract.attach(surface:to:)so make and update share one constraint setup and can't drift.TerminalManagersurface(for:),destroy(id:), andretry(id:)now callremoveFromSuperview()beforedestroy()so a stale view isn't left dangling inoffscreenWindow.contentView.subviewsafter its underlying ghostty surface is freed.AppDelegateNSSetUncaughtExceptionHandlerinapplicationDidFinishLaunchingto NSLog name + reason +callStackSymbolsfor any uncaught NSException. Mach exceptions (SIGSEGV) still go to the system crash reporter; this just covers the ObjC-exception gap that currently lets AppKit / libghostty wrappers abort silently.Closes #240
Test plan
make build— clean compile, no warnings.make test— 46/46 tests in main app suite pass on a clean re-run (tmuxBackendDefaultIsOffis a pre-existing flake from concurrent@Testwrites toFeatureFlags.tmuxBackendConfigOverride; passes on retry, unrelated to this PR).CROW_TMUX_BACKEND=1:.ghostty), to confirm the per-tab surface path isn't regressed.NSException(name: .genericException, reason: "test", userInfo: nil).raise()somewhere reachable, launching, observing[CrowCrash]in Console.app, then removing the test.🤖 Generated with Claude Code