XY-126: add daemon retry queue and capped backoff #9
yvette-carlisle wants to merge 9 commits into main
…:"add daemon retry queue and capped backoff","intent":"implement XY-126 retry scheduling and queued claim handling","impact":"maestro now queues continuation and failure retries in daemon memory with a repo-owned backoff cap and aligned docs","breaking":false,"risk":"medium","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"}]}
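The capped backoff named in this commit can be pictured with a minimal sketch. The source only says that failure retries use a backoff with a repo-owned cap; the function name, the doubling rule, and the parameters below are assumptions for illustration and are not taken from maestro.

```rust
use std::time::Duration;

/// Hypothetical helper: delay before retrying after the `failed_attempts`-th failure.
/// The doubling schedule and both parameters are illustrative assumptions.
pub fn capped_backoff(failed_attempts: u32, base: Duration, cap: Duration) -> Duration {
    // The first failure waits `base`; each further failure doubles the delay.
    // The exponent is clamped so the shift below can never overflow a u32.
    let exponent = failed_attempts.saturating_sub(1).min(31);
    let factor = 1u32 << exponent;

    // `checked_mul` returns None on overflow, in which case the cap applies.
    base.checked_mul(factor).map_or(cap, |delay| delay.min(cap))
}
```

With a 10-second base and a 300-second cap, the delays would grow 10 s, 20 s, 40 s and so on until they reach the cap, after which every further retry waits exactly the cap.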
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee5c2f4b3d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…"exclude continuation retries from failure budget","intent":"address PR feedback on XY-126 failure retry accounting","impact":"failure backoff and retry exhaustion now count only failed attempts instead of all prior continuation runs","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"}]}
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ebefc71c35
…"align failure writeback with failed-attempt budget","intent":"address self-review finding on XY-126 retryable failure classification","impact":"retryable failures after clean continuations now stay on automatic retry and retry comments separate total attempts from failure budget","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
…"close interrupted-exit and blocked-claim retry gaps","intent":"address remaining XY-126 review findings on retry-budget accounting and queued redispatch retention","impact":"interrupted child exits now consume retry budget and active due retries stay queued when the dispatch slot is temporarily occupied","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
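The accounting described in the commits above (continuation runs do not consume the failure budget, while failed and interrupted attempts do) can be sketched as follows. The enum, the function names, and the `max_failed_attempts` parameter are hypothetical and only illustrate the counting rule; they are not maestro's types.

```rust
/// Hypothetical outcome of one prior run attempt for an issue.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AttemptOutcome {
    /// The run ended cleanly and asked to be continued.
    Continuation,
    /// The run failed in a retryable way.
    Failed,
    /// The child process exited before finishing.
    Interrupted,
}

/// Counts only the attempts that consume the failure budget.
pub fn failed_attempt_count(history: &[AttemptOutcome]) -> usize {
    history
        .iter()
        .filter(|outcome| {
            matches!(outcome, AttemptOutcome::Failed | AttemptOutcome::Interrupted)
        })
        .count()
}

/// True once the failure budget (an assumed configuration value) is used up.
pub fn retry_budget_exhausted(history: &[AttemptOutcome], max_failed_attempts: usize) -> bool {
    failed_attempt_count(history) >= max_failed_attempts
}
```

Under this rule, an issue with many clean continuations followed by one failure has a total attempt count well above one but has consumed only one unit of failure budget, which is why the retry comments report the two numbers separately.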
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6d0fd40d5c
…"release queued retry claims for non-active issues before ready_at","intent":"address PR feedback so future retry entries do not stall the single dispatch slot after the queued issue becomes non-active during backoff","impact":"the daemon now revalidates queued future retries and releases the claim immediately when the issue disappears or leaves the active retry policy, while preserving blocking behavior for still-active queued retries","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
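The revalidation behavior this commit describes, combined with the earlier rule that due retries stay queued while the dispatch slot is occupied, can be sketched as one decision function. Every name here is hypothetical, and the inputs (an optional activity flag, integer timestamps, a slot flag) are simplifying assumptions rather than maestro's real state.

```rust
/// Hypothetical decision for one queued retry entry on a daemon tick.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum QueuedRetryAction {
    /// Drop the entry and release its claim so the slot is not stalled.
    Release,
    /// The issue is still active but `ready_at` has not been reached.
    KeepWaiting,
    /// The entry is due but the dispatch slot is busy; keep it queued.
    KeepQueued,
    /// The entry is due and the slot is free.
    Dispatch,
}

/// `issue_active` is None when the issue no longer exists in the tracker,
/// Some(false) when it left the active retry policy, Some(true) otherwise.
pub fn revalidate_queued_retry(
    issue_active: Option<bool>,
    now: u64,
    ready_at: u64,
    slot_free: bool,
) -> QueuedRetryAction {
    match issue_active {
        // Release immediately, even before `ready_at`.
        None | Some(false) => QueuedRetryAction::Release,
        Some(true) if now < ready_at => QueuedRetryAction::KeepWaiting,
        Some(true) if !slot_free => QueuedRetryAction::KeepQueued,
        Some(true) => QueuedRetryAction::Dispatch,
    }
}
```

The ordering matters: the activity check comes before the `ready_at` check, so an entry for an issue that has gone away is released during its backoff window instead of holding the claim until the timer expires.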
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eb6b9c4ee2
…"scope retry redispatch to active issues in the configured project","intent":"address remaining PR #9 findings so queued retries only re-dispatch active issues from the configured tracker project","impact":"retry runs now reject cross-project issue ids, queued retries release when an issue leaves the active lane, and tracker issue refreshes carry project slug metadata for enforcement","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a626a93419
…"resolve child-exit retries against persisted run attempts","intent":"address PR #9 feedback so daemon child-exit retry scheduling and orphan cleanup can recover the actual persisted run attempt when the child uses a different run_id than the parent's dry-run plan","impact":"child-exit retry scheduling and orphan cleanup now fall back to the persisted issue attempt when planned and actual run ids diverge, preserving retry/backoff behavior after daemon child exits","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a4ce080332
…"retain retry claims until child run attempts persist","intent":"address PR #9 feedback so daemon-owned retry claims are not dropped before a retry child persists its run attempt, preventing retry lanes from stalling after early child exits","impact":"retry-spawned child runs now retain their queued claim until child-exit reconciliation, and non-dry-run issue preparation records a starting run attempt immediately after lease acquisition so early exits can still be recovered and requeued","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e177886945
…"respect terminal guard state during retry dispatch","intent":"address PR #9 feedback so daemon retry eligibility does not requeue issues that remain in In Progress only as a terminal manual-attention guard when the needs-attention label is unavailable","impact":"terminal guard failures now persist a dedicated run-attempt status, retry dispatch consults state-store guard state before requeueing or explicit retry redispatch, and the retry budget keeps counting guarded terminal failures","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f0d4331cfe
    if let Some(run_attempt) = state_store.run_attempt(child.run_id)? {
        return Ok(Some(run_attempt));
    }

    state_store.run_attempt_for_issue_attempt(child.issue_id, child.attempt_number)
Match child exits to exact run ID before lease cleanup
`resolve_child_exit_run_attempt` falls back to `run_attempt_for_issue_attempt(issue_id, attempt_number)` when the planned `run_id` is missing. If another maestro process starts the same issue between planning and child exit, this fallback can resolve that other process’s run, and `clear_orphaned_daemon_child_state` will then mark it interrupted and clear its lease as if this child owned it. In that concurrency window, a healthy in-flight run can be canceled and redispatched, so exit reconciliation should require ownership by the exiting child rather than matching only issue+attempt.
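One way to picture the ownership requirement is a fallback that accepts an issue+attempt match only when the persisted attempt also records the exiting child as its owner. This is a sketch under stated assumptions: the `owner_pid` field, the struct shapes, and the function name are all hypothetical, and the source does not say how maestro records ownership.

```rust
/// Hypothetical persisted run attempt; `owner_pid` is an assumed field.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RunAttempt {
    pub run_id: String,
    pub issue_id: String,
    pub attempt_number: u32,
    pub owner_pid: Option<u32>,
}

/// Hypothetical description of a daemon child that has just exited.
pub struct ChildExit {
    pub planned_run_id: String,
    pub issue_id: String,
    pub attempt_number: u32,
    pub pid: u32,
}

/// Resolves the attempt that belongs to the exiting child, or None.
pub fn resolve_owned_attempt<'a>(
    attempts: &'a [RunAttempt],
    child: &ChildExit,
) -> Option<&'a RunAttempt> {
    attempts
        .iter()
        // Preferred path: the planned run id was persisted as-is.
        .find(|a| a.run_id == child.planned_run_id)
        // Fallback: same issue and attempt, but only if this child owns it.
        .or_else(|| {
            attempts.iter().find(|a| {
                a.issue_id == child.issue_id
                    && a.attempt_number == child.attempt_number
                    && a.owner_pid == Some(child.pid)
            })
        })
}
```

With a check of this kind, a run started by a different process for the same issue and attempt number would not be resolved by the fallback, so its lease would be left alone during orphan cleanup.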
Summary
Verification