fix(shell): add timeout guards for pre-spawn pipeline and post-SIGKILL cleanup#28780
Open
sergey-tihon wants to merge 2 commits into
Open
fix(shell): add timeout guards for pre-spawn pipeline and post-SIGKILL cleanup#28780sergey-tihon wants to merge 2 commits into
sergey-tihon wants to merge 2 commits into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds defensive timeouts around potentially hanging Effects in shell scanning and child-process termination flows to prevent indefinite waits.
Changes:
- Add a 10-second timeout fallback around a shell scanning/asking Effect.
- Add a bounded (5-second) wait when escalating to
SIGKILLso termination doesn’t hang indefinitely.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| packages/opencode/src/tool/shell.ts | Wrapes an Effect chain with timeoutOrElse to avoid hanging during scan/ask. |
| packages/core/src/cross-spawn-spawner.ts | Adds a timeout/ignore around waiting for the process “signal” Deferred after sending SIGKILL. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue for this PR
Closes #28654
Related: #28697, #25306, #1955
//cc @jlongster @kommander — this addresses the same class of bash tool hangs you're working on in #28654/#25664. Would appreciate your review when you get a chance.
Type of change
What does this PR do?
Two timeout guards for two distinct hang scenarios in the bash tool:
1.
shell.ts(pre-spawn pipeline): Theparse→collect→askblock before subprocess spawn has no timeout. If it hangs, the tool blocks forever — no subprocess exists, so the 2-mindefaultTimeoutnever kicks in. Fix: wrap withEffect.timeoutOrElse({ duration: "10 seconds", orElse: () => Effect.void }). On timeout, permission scanning is skipped and execution proceeds torun(). On real errors (parse failure, permission denied), they propagate normally.We observed this in a CI environment (headless,
opencode run) where the bash tool hung for 20+ minutes on heredoc commands. The failing commands were:and:
Both hung identically. The DB state for these tool calls shows:
{ "state": { "status": "running", "input": { "command": "cat > /tmp/finalize_sections.sh << 'EOF'\n..." }, "time": { "start": 1746621747000 } } }Note the missing
metadatafield. In the normal code path,ctx.metadata({ metadata: { output: "" } })is called atshell.ts:472— BEFOREspawner.spawn()at line 482. The absence ofmetadatain the DB proves execution never reachedrun(). The hang is somewhere in the pre-spawn pipeline:parse()(tree-sitter WASM),collect()(AST walk + path resolution), orask()(permission scan).On Linux,
collect()only doespath.resolve()(pure string ops) andfs.isDir()(stat).ask()is auto-approved in headless mode. That leavesparse()— the tree-sitter WASM.parse(command)call — as the most likely candidate, though we cannot definitively rule out an Effect runtime scheduling issue.The same session successfully parsed ~50 other bash commands before these two hung, so it's not a cold-start WASM loading issue. Something about long heredoc input + extended session runtime triggers it.
2.
cross-spawn-spawner.ts(post-SIGKILL await): After sending SIGKILL,Deferred.await(signal)waits for thecloseevent. When child processes inherit stdout/stderr and outlive the killed parent (gradle daemons, nohup'd servers, Chrome subprocesses), the pipes stay open andclosenever fires. The tool hangs forever despite the target process being dead. Fix:Deferred.await(signal).pipe(Effect.timeout("5 seconds"), Effect.ignore). After SIGKILL, the kernel guarantees the process is dead — ifclosehasn't fired in 5s it's a pipe-inheritance issue, not a process-alive issue.Why these work: the pre-spawn pipeline normally completes in <500ms (10s is 20x margin). After SIGKILL,
closenormally fires in <100ms (5s is 50x margin). Both are last-resort guards that only activate in degenerate cases.How did you verify your code works?
metadatafield, proving the pre-spawn pipeline is the hang locationEffect.timeoutOrElseonly handles timeout (real errors still propagate) — matches existing usage incross-spawn-spawner.tslines 337, 394, 435Effect.timeout + Effect.ignoreafter SIGKILL matches the pattern already used (line 403 alreadyEffect.ignores the entire escalated kill)DB forensic evidence:
Screenshots / recordings
N/A — no UI change.
Checklist