XY-126: add daemon retry queue and capped backoff #9

Open
yvette-carlisle wants to merge 9 commits into main from x/maestro-xy-126

@yvette-carlisle
Member

Summary

  • add an explicit in-memory retry queue to the daemon and allow queued redispatch of active issues
  • separate clean continuation retries from failure retries with a capped backoff from WORKFLOW.md
  • align workflow/runtime docs and plan authority with the new retry semantics
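
As a rough illustration of the first two bullets, the split between continuation retries and failure retries could look like the sketch below. Everything here is an assumption for illustration: the names `RetryKind` and `BackoffPolicy`, the field names, and the choice of exponential doubling are not taken from the PR, which only states that failure retries use a capped backoff configured in WORKFLOW.md.

```rust
use std::time::Duration;

/// Why a retry was queued. Hypothetical type, not taken from the PR's code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RetryKind {
    /// The previous run ended cleanly and the issue is still active.
    Continuation,
    /// The previous run failed; `failed_attempts` counts failures only.
    Failure { failed_attempts: u32 },
}

/// Assumed policy shape; the real WORKFLOW.md field names are not shown in the PR.
#[derive(Debug, Clone, Copy)]
pub struct BackoffPolicy {
    pub continuation_delay: Duration,
    pub base_delay: Duration,
    pub max_delay: Duration,
}

impl BackoffPolicy {
    /// Returns how long to wait before redispatching.
    pub fn delay_for(&self, kind: RetryKind) -> Duration {
        match kind {
            // Continuations are not penalized with growing delays.
            RetryKind::Continuation => self.continuation_delay,
            RetryKind::Failure { failed_attempts } => {
                // Assumed growth: base * 2^(failed_attempts - 1), then capped.
                let exponent = failed_attempts.saturating_sub(1).min(31);
                let factor = 1u32 << exponent;
                self.base_delay.saturating_mul(factor).min(self.max_delay)
            }
        }
    }
}
```

With a 10 second base and a 300 second cap, the third failure would wait 40 seconds under this sketch, and the sixth and later failures would wait the capped 300 seconds.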

Verification

  • cargo make lint-fix
  • cargo make fmt
  • cargo make test
  • cargo make lint
  • cargo make fmt-rust-check
  • cargo make fmt-toml-check
  • git diff --check

…:"add daemon retry queue and capped backoff","intent":"implement XY-126 retry scheduling and queued claim handling","impact":"maestro now queues continuation and failure retries in daemon memory with a repo-owned backoff cap and aligned docs","breaking":false,"risk":"medium","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"}]}
@yvette-carlisle
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee5c2f4b3d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…"exclude continuation retries from failure budget","intent":"address PR feedback on XY-126 failure retry accounting","impact":"failure backoff and retry exhaustion now count only failed attempts instead of all prior continuation runs","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"}]}
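
The accounting change described in this commit message (only failed attempts count toward the failure budget) can be pictured with a small sketch. The `AttemptOutcome` enum and both function names are hypothetical and only illustrate the counting rule; the PR does not show the actual types.

```rust
/// Outcome of one prior run attempt. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AttemptOutcome {
    /// The run ended cleanly and was continued; does not consume budget.
    CleanContinuation,
    /// The run failed; consumes one unit of the failure budget.
    Failed,
}

/// Counts only failed attempts, ignoring clean continuation runs.
pub fn failed_attempts(history: &[AttemptOutcome]) -> u32 {
    history
        .iter()
        .filter(|outcome| **outcome == AttemptOutcome::Failed)
        .count() as u32
}

/// True once the number of failures reaches the (assumed) configured limit.
pub fn retry_exhausted(history: &[AttemptOutcome], max_failed_attempts: u32) -> bool {
    failed_attempts(history) >= max_failed_attempts
}
```

Under this rule, a history of five attempts with two failures uses two units of budget, not five.
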
@yvette-carlisle
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ebefc71c35

…"align failure writeback with failed-attempt budget","intent":"address self-review finding on XY-126 retryable failure classification","impact":"retryable failures after clean continuations now stay on automatic retry and retry comments separate total attempts from failure budget","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
…"close interrupted-exit and blocked-claim retry gaps","intent":"address remaining XY-126 review findings on retry-budget accounting and queued redispatch retention","impact":"interrupted child exits now consume retry budget and active due retries stay queued when the dispatch slot is temporarily occupied","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@yvette-carlisle
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d0fd40d5c

…"release queued retry claims for non-active issues before ready_at","intent":"address PR feedback so future retry entries do not stall the single dispatch slot after the queued issue becomes non-active during backoff","impact":"the daemon now revalidates queued future retries and releases the claim immediately when the issue disappears or leaves the active retry policy, while preserving blocking behavior for still-active queued retries","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
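
The revalidation behavior described in this commit message might be sketched as follows. This is a minimal illustration under assumptions: `QueuedRetry`, `revalidate_queue`, and the `is_active` lookup are hypothetical names, and the lookup returning `None` stands in for an issue that has disappeared from the tracker.

```rust
/// One entry in the in-memory retry queue. Hypothetical shape.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct QueuedRetry {
    pub issue_id: String,
    /// Earliest dispatch time in milliseconds; not consulted during revalidation.
    pub ready_at_ms: u64,
}

/// Drops entries whose issue vanished (`None`) or is no longer active
/// (`Some(false)`), regardless of `ready_at_ms`, and returns the released
/// issue ids. Entries for still-active issues stay queued and keep their claim.
pub fn revalidate_queue<F>(queue: &mut Vec<QueuedRetry>, is_active: F) -> Vec<String>
where
    F: Fn(&str) -> Option<bool>,
{
    let mut released = Vec::new();
    queue.retain(|entry| {
        let keep = is_active(&entry.issue_id) == Some(true);
        if !keep {
            released.push(entry.issue_id.clone());
        }
        keep
    });
    released
}
```

Because the check ignores `ready_at_ms`, an entry still in backoff is released as soon as its issue leaves the active lane, which matches the stated goal of not stalling the single dispatch slot.
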
@yvette-carlisle
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb6b9c4ee2

…"scope retry redispatch to active issues in the configured project","intent":"address remaining PR #9 findings so queued retries only re-dispatch active issues from the configured tracker project","impact":"retry runs now reject cross-project issue ids, queued retries release when an issue leaves the active lane, and tracker issue refreshes carry project slug metadata for enforcement","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@yvette-carlisle
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a626a93419

…"resolve child-exit retries against persisted run attempts","intent":"address PR #9 feedback so daemon child-exit retry scheduling and orphan cleanup can recover the actual persisted run attempt when the child uses a different run_id than the parent's dry-run plan","impact":"child-exit retry scheduling and orphan cleanup now fall back to the persisted issue attempt when planned and actual run ids diverge, preserving retry/backoff behavior after daemon child exits","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@yvette-carlisle
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4ce080332

…"retain retry claims until child run attempts persist","intent":"address PR #9 feedback so daemon-owned retry claims are not dropped before a retry child persists its run attempt, preventing retry lanes from stalling after early child exits","impact":"retry-spawned child runs now retain their queued claim until child-exit reconciliation, and non-dry-run issue preparation records a starting run attempt immediately after lease acquisition so early exits can still be recovered and requeued","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@yvette-carlisle
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e177886945

…"respect terminal guard state during retry dispatch","intent":"address PR #9 feedback so daemon retry eligibility does not requeue issues that remain in In Progress only as a terminal manual-attention guard when the needs-attention label is unavailable","impact":"terminal guard failures now persist a dedicated run-attempt status, retry dispatch consults state-store guard state before requeueing or explicit retry redispatch, and the retry budget keeps counting guarded terminal failures","breaking":false,"risk":"low","authority":"linear","delivery_mode":"closeout","refs":[{"system":"linear","id":"XY-126","role":"authority"},{"system":"github","repo":"hack-ink/maestro","number":9,"role":"mirror"}]}
@yvette-carlisle
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f0d4331cfe

Comment on lines +554 to +558
    if let Some(run_attempt) = state_store.run_attempt(child.run_id)? {
        return Ok(Some(run_attempt));
    }

    state_store.run_attempt_for_issue_attempt(child.issue_id, child.attempt_number)

P1: Match child exits to exact run ID before lease cleanup

resolve_child_exit_run_attempt falls back to run_attempt_for_issue_attempt(issue_id, attempt_number) when the planned run_id is missing. If another maestro process starts the same issue between planning and child exit, this fallback can resolve that other process’s run, and clear_orphaned_daemon_child_state will then mark it interrupted and clear its lease as if this child owned it. In that concurrency window, a healthy in-flight run can be canceled and redispatched, so exit reconciliation should require ownership by the exiting child rather than matching only issue+attempt.
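
One way to picture the ownership requirement is the sketch below, which accepts the issue+attempt fallback only when the record is marked as owned by the exiting child. It is a hedged illustration, not the project's code: the `RunAttempt` and `ChildExit` shapes, the `owner_pid` field, and the function `resolve_owned_run_attempt` are all hypothetical, and the two lookups are passed in as plain values so the example stays self-contained. A process id is used as the ownership marker only for brevity; since PIDs can be reused, a unique per-spawn token would likely be a safer marker in practice.

```rust
/// Persisted run attempt. Hypothetical shape for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RunAttempt {
    pub run_id: String,
    pub issue_id: String,
    pub attempt_number: u32,
    /// Assumed field: which daemon child created this record, if known.
    pub owner_pid: Option<u32>,
}

/// What the daemon knows about the child that just exited. Hypothetical shape.
#[derive(Debug, Clone)]
pub struct ChildExit {
    pub run_id: String,
    pub issue_id: String,
    pub attempt_number: u32,
    pub pid: u32,
}

/// Resolves the run attempt for an exiting child without adopting another
/// process's run. `by_run_id` is the exact lookup result; `by_issue_attempt`
/// is the issue+attempt fallback lookup result.
pub fn resolve_owned_run_attempt(
    child: &ChildExit,
    by_run_id: Option<RunAttempt>,
    by_issue_attempt: Option<RunAttempt>,
) -> Option<RunAttempt> {
    // An exact run_id match is unambiguous, so it is accepted as-is.
    if by_run_id.is_some() {
        return by_run_id;
    }
    // The fallback is accepted only when the record belongs to this child.
    by_issue_attempt.filter(|attempt| {
        attempt.issue_id == child.issue_id
            && attempt.attempt_number == child.attempt_number
            && attempt.owner_pid == Some(child.pid)
    })
}
```

In this sketch, a fallback record created by a different process resolves to `None`, so exit reconciliation would skip marking it interrupted and leave its lease alone.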
