Skip to content

fix(hooks): tolerate stale mgrep watch pid files#140

Merged
joeldierkes merged 3 commits intomixedbread-ai:mainfrom
Co-Messi:codex/sessionstart-stale-pid
Apr 8, 2026
Merged

fix(hooks): tolerate stale mgrep watch pid files#140
joeldierkes merged 3 commits intomixedbread-ai:mainfrom
Co-Messi:codex/sessionstart-stale-pid

Conversation

@Co-Messi
Copy link
Copy Markdown
Contributor

@Co-Messi Co-Messi commented Mar 26, 2026

Problem

The SessionStart hook currently exits with status 1 whenever the per-session PID file already exists, even if the old mgrep watch process is gone. That turns a normal resume into a hook error and leaves users stuck cleaning up stale PID files by hand.

This PR also makes the SessionEnd hook idempotent when the PID file is already missing, since that is the other half of the same cleanup path.

What changed

  • SessionStart now checks whether the PID in the file is still alive
  • If the watcher is still running, the hook exits cleanly with 0 instead of surfacing an error
  • If the PID file is stale or malformed, the hook removes it and starts a fresh watcher
  • SessionEnd now exits cleanly when the PID file is already missing
  • Added focused regression tests for the SessionStart stale-PID case and the SessionEnd missing-PID case

Why this fixes it

The bug was not the existence of the PID file itself; it was treating any existing PID file as an error condition. Checking liveness before deciding how to exit makes the hook resilient to abnormal shutdowns while still avoiding duplicate watcher processes.

Verification

  • pnpm format:check
  • pnpm typecheck
  • pnpm test --filter-tags '!long-running' --filter 'Session(Start|End) hook'\n\nFixes SessionStart hook returns exit(1) on resume when stale PID file exists #134\nRelated: [ISSUE] Claude Code Plugin - SessionEnd #126\n\n## Self-critique\n\n1. This keeps the fix tightly scoped to stale PID handling and does not attempt to redesign the hook lifecycle more broadly.\n2. The tests cover the stale/missing PID cases directly, but they do not try to exercise every Claude Code integration path.\n3. I also made SessionEnd idempotent because it is the same cleanup surface; if you prefer, that portion can be split out, but keeping them together made the hook behavior internally consistent.

Note

Low Risk
Low risk: narrow changes to session hook process/PID handling plus new regression tests; main risk is behavior differences in when a watcher is restarted vs reused.

Overview
Makes the mgrep SessionStart/SessionEnd hooks resilient to stale or missing PID files. SessionStart now reuses an existing live mgrep watch process when the PID file points to one, otherwise deletes stale/malformed PID files and starts a new watcher (respecting cwd), and SessionEnd becomes idempotent when the PID file is missing/unreadable.

Adds shared pid_utils.py helpers for PID parsing/liveness/process validation and introduces Bats tests covering stale PID cleanup, already-running watcher handling, unrelated PID replacement, and missing PID file cleanup.

Written by Cursor Bugbot for commit c535939. This will update automatically on new commits. Configure here.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 24f0fe7623

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +32 to +36
def pid_file_path(session_id: str | None) -> str:
return f"/tmp/mgrep-watch-pid-{session_id}.txt"


def read_pid(pid_file: str) -> int | None:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep hook syntax compatible with Python 3.9

The new type annotations in pid_file_path/read_pid use PEP 604 unions (str | None, int | None), but this hook is invoked as plain python3 without any runtime version gate. On machines where python3 is 3.9 or older, the script now fails to parse with a SyntaxError, so SessionStart never runs. Please use pre-3.10-compatible annotations (or explicitly enforce a 3.10+ interpreter) to avoid breaking hook startup.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this too. I replaced the PEP 604 unions with Optional[...] so the hook remains compatible with older python3 runtimes, which matters because the plugin invokes plain python3 without a version gate.

Comment on lines +61 to +63
if existing_pid is not None and is_pid_alive(existing_pid):
debug_log(f"mgrep watch already running with pid {existing_pid}, skipping")
sys.exit(0)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate PID ownership before skipping watcher startup

The early-return path assumes any live PID from the pidfile means mgrep watch is already running. If a stale pidfile points to a PID that has been reused by an unrelated process, is_pid_alive returns true and this branch exits successfully without starting a watcher, leaving indexing inactive for that session. The liveness check should also verify the process identity (e.g., command line or recorded metadata) before deciding to skip startup.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

@joeldierkes joeldierkes merged commit cf58dde into mixedbread-ai:main Apr 8, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants