Fix exporter state machine: stuck exporters on lease-end during hooks#349
Fix exporter state machine: stuck exporters on lease-end during hooks#349raballew wants to merge 4 commits intojumpstarter-dev:mainfrom
Conversation
- Wait for beforeLease hook to complete before running afterLease cleanup, preventing stuck exporters when lease ends during hook execution (jumpstarter-dev#236) - Report OFFLINE status before shutdown on onFailure:exit hook failures, preventing new lease assignment to dying exporters (jumpstarter-dev#245) - Add 13 tests covering lease-end during hooks, unused lease timeout, consecutive lease ordering, idempotent cleanup, and warn+lease-end Fixes jumpstarter-dev#236, jumpstarter-dev#237, jumpstarter-dev#241, jumpstarter-dev#245 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
✅ Deploy Preview for jumpstarter-docs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughwaits for in-flight before-lease hooks during lease teardown. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Exporter
participant BeforeHook
participant AfterHook
Client->>Exporter: Lease ends / premature end
Exporter->>BeforeHook: await before_lease_hook.wait()
Note right of Exporter: wait for in-flight before-lease hook to finish
BeforeHook-->>Exporter: before hook completes
Exporter->>AfterHook: start after_lease cleanup
AfterHook-->>Exporter: after cleanup done
Exporter->>Client: report status (AVAILABLE / OFFLINE on exit)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
The exporter now reports "Exporter shutting down due to beforeLease hook failure" instead of "beforeLease hook failed", causing regex mismatches in B2, B3, and C2 tests. Broaden the patterns to match both "hook failed" and "hook failure" via the prefix "hook fail", and add "Exporter shutting down" as an alternative match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@e2e/tests-hooks.bats`:
- Line 239: The test named onFailure=endLease is currently permitting "Exporter
shutting down" in the assert_output regex which can hide regressions in hooks
B2; update the assertion in the assert_output call (the one using the regex
"(beforeLease hook fail|Exporter shutting down|Connection to exporter lost)") to
remove "Exporter shutting down" so it only accepts "(beforeLease hook
fail|Connection to exporter lost)", and ensure the surrounding test setup for
hooks B2 and exporter lifecycle (the onFailure=endLease test) explicitly
prepares/reset exporter state so the test is deterministic and does not rely on
prior suite runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 34fe890a-19c2-49d7-be60-874330fbaed7
📒 Files selected for processing (1)
e2e/tests-hooks.bats
| assert_failure | ||
| # endLease may drop connection before status propagates to client | ||
| assert_output --regexp "(beforeLease hook failed|Connection to exporter lost)" | ||
| assert_output --regexp "(beforeLease hook fail|Exporter shutting down|Connection to exporter lost)" |
There was a problem hiding this comment.
Do not accept shutdown output in the onFailure=endLease test.
Line 239 currently allows "Exporter shutting down", which can mask a real regression in hooks B2 because the restart loop may recover the exporter and still pass later checks.
Suggested fix
- assert_output --regexp "(beforeLease hook fail|Exporter shutting down|Connection to exporter lost)"
+ assert_output --regexp "(beforeLease hook fail|Connection to exporter lost)"
+ refute_output --partial "Exporter shutting down"Based on learnings: E2E tests in bats should be runnable in isolation. Do not rely on prior suite setup or shared state across tests. Each test (or setup/teardown blocks) should explicitly prepare its prerequisites, reset global state, and ensure deterministic results.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| assert_output --regexp "(beforeLease hook fail|Exporter shutting down|Connection to exporter lost)" | |
| assert_output --regexp "(beforeLease hook fail|Connection to exporter lost)" | |
| refute_output --partial "Exporter shutting down" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@e2e/tests-hooks.bats` at line 239, The test named onFailure=endLease is
currently permitting "Exporter shutting down" in the assert_output regex which
can hide regressions in hooks B2; update the assertion in the assert_output call
(the one using the regex "(beforeLease hook fail|Exporter shutting
down|Connection to exporter lost)") to remove "Exporter shutting down" so it
only accepts "(beforeLease hook fail|Connection to exporter lost)", and ensure
the surrounding test setup for hooks B2 and exporter lifecycle (the
onFailure=endLease test) explicitly prepares/reset exporter state so the test is
deterministic and does not rely on prior suite runs.
There was a problem hiding this comment.
Fixed in commit 69de0d2 -- the regex now matches both "hook failed" and "hook failure" via the prefix "hook fail", plus "Exporter shutting down" as an additional match.
There was a problem hiding this comment.
@raballew, thanks for the explanation. To clarify, the concern I raised wasn't about the "hook fail" prefix (that broadening makes sense to handle both "hook failed" and "hook failure" variants) — it was specifically about keeping "Exporter shutting down" in the onFailure=endLease test (hooks B2).
The "Exporter shutting down" message is newly emitted on the onFailure=exit path (as described in the PR), so allowing it in the endLease test could silently pass even if endLease accidentally triggers a shutdown regression. If a future bug caused endLease to behave like exit, the test would still pass by matching "Exporter shutting down" instead of the expected endLease output.
That said, if you've confirmed through testing that this message can legitimately appear even in the endLease path (e.g., a race between the restart loop reporting shutdown and the endLease path), I understand the pragmatic trade-off. I'll defer to your judgment here.
🧠 Learnings used
Learnt from: mangelajo
Repo: jumpstarter-dev/jumpstarter PR: 180
File: e2e/tests.bats:203-204
Timestamp: 2026-02-03T15:49:06.716Z
Learning: E2E tests in bats should be runnable in isolation. Do not rely on prior suite setup or shared state across tests. Each test (or setup/teardown blocks) should explicitly prepare its prerequisites, reset global state, and ensure deterministic results. Consider adding per-test setup and teardown, or refactor to avoid shared state in e2e/tests.bats and related files.
Learnt from: mangelajo
Repo: jumpstarter-dev/jumpstarter PR: 173
File: e2e/run-e2e.sh:101-103
Timestamp: 2026-02-02T09:17:04.720Z
Learning: In the jumpstarter e2e tests (e2e/run-e2e.sh and e2e/tests.bats), keep tracing enabled (-x) unconditionally because there are no secrets in the bats tests and it makes debugging easier with more understandable logs.
test_exec_bash: Simplify bash-specific syntax from array operations to substring expansion to reduce PTY timing sensitivity on macOS 3.11. e2e C2 (afterLease onFailure=exit): The shell command can complete before the afterLease hook runs, making assert_failure unreliable. The important behavior is that the exporter exits and goes offline, which is still asserted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes #236, #237, #241, #245