docs(multitask): add TIMEOUT_LIFECYCLE.md and update reference docs#4186

Open
rysweet wants to merge 3 commits into main from
feat/issue-4183-after-the-investigate-timeout-lifecycle-workstream

rysweet (Owner) commented Apr 3, 2026

Closes #4183

Summary

Retcon documentation for the multitask orchestrator's timeout and recovery lifecycle, produced by the investigate-timeout-lifecycle and implement-defect-fixes workstreams.

Changes

  • TIMEOUT_LIFECYCLE.md (new): Documents the full lifecycle state machine (pending → running → completed/failed_resumable/timed_out_resumable/etc.), both timeout policies (interrupt-preserve and continue-preserve), workdir cleanup eligibility, the resumable state model specifying exactly what data is persisted, and session recovery instructions.
  • reference.md: Added timeout_policy and max_runtime to the workstream config table.
  • SKILL.md: Minor cross-reference update.
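
As a rough illustration of the two config fields added to reference.md, the sketch below validates a workstream entry. The field names, the two policy names, and the 7200s default come from this PR; the `TimeoutConfig` class, the `parse_timeout_config` helper, and the choice of interrupt-preserve as the default policy are assumptions made for the example, not the orchestrator's actual API.

```python
from dataclasses import dataclass

# Policy names and the 7200s default are taken from the PR description.
TIMEOUT_POLICIES = ("interrupt-preserve", "continue-preserve")
DEFAULT_MAX_RUNTIME = 7200  # seconds
ASSUMED_DEFAULT_POLICY = "interrupt-preserve"  # assumption, not confirmed here


@dataclass(frozen=True)
class TimeoutConfig:
    """Hypothetical container for the two optional timeout fields."""
    timeout_policy: str = ASSUMED_DEFAULT_POLICY
    max_runtime: int = DEFAULT_MAX_RUNTIME


def parse_timeout_config(entry: dict) -> TimeoutConfig:
    """Read timeout_policy and max_runtime from a workstream config entry."""
    policy = entry.get("timeout_policy", ASSUMED_DEFAULT_POLICY)
    max_runtime = entry.get("max_runtime", DEFAULT_MAX_RUNTIME)
    if policy not in TIMEOUT_POLICIES:
        raise ValueError(f"unknown timeout_policy: {policy!r}")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(max_runtime, bool) or not isinstance(max_runtime, int):
        raise ValueError("max_runtime must be an integer number of seconds")
    if max_runtime <= 0:
        raise ValueError("max_runtime must be positive")
    return TimeoutConfig(timeout_policy=policy, max_runtime=max_runtime)
```

An entry that omits both fields falls back to the defaults, so existing configs would keep working unchanged under this sketch.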

Investigation findings

The investigation (ws-4181) confirmed:

  • The default max_runtime is 7200s. When it expires, the interrupt-preserve policy sends SIGTERM, escalates to SIGKILL, and then saves the timed_out_resumable state.
  • Active-progress detection uses checkpoint existence at state-write time.
  • Workdirs are kept for all resumable states; only completed, failed_terminal, and abandoned are cleanup-eligible.
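
The SIGTERM→SIGKILL escalation described in the findings above can be sketched with the standard library alone. This is a minimal illustration, not the orchestrator's code: the function names, the `grace_seconds` parameter, and the `"exited"` return value are hypothetical, and the real implementation also persists resumable state, which is omitted here.

```python
import subprocess
import sys


def interrupt_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask proc to stop, then force-kill it if it ignores the request."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"


def run_with_max_runtime(cmd: list, max_runtime: float,
                         grace_seconds: float = 5.0) -> str:
    """Run cmd; on timeout, interrupt it and report timed_out_resumable."""
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=max_runtime)
    except subprocess.TimeoutExpired:
        interrupt_process(proc, grace_seconds)
        # The workdir is left in place: resumable states are not cleanup-eligible.
        return "timed_out_resumable"
    # Mapping a normal exit to completed/failed_* is out of scope for this sketch.
    return "exited"


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_max_runtime(sleeper, max_runtime=0.5))
```

Waiting after `kill()` matters: without it the child's exit status is never collected.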

PR stack validation (ws-4182)

Confirmed PRs #4074, #4075, #4076 are merged on main with no rebase conflicts. dep_check.py scoped import validation and timeout preservation logic are both present on HEAD.

Tests

All 19 multitask/timeout tests pass.

Copilot and others added 3 commits April 3, 2026 01:19
Documents the timeout lifecycle state machine, interrupt-preserve and
continue-preserve timeout policies, workstream resumption, state file
schema, log management cap, environment variables, and security
controls (delegate allowlist, path sanitisation, shlex quoting).

Updates reference.md Timeouts section to describe both policies and
link to the new doc. Adds TIMEOUT_LIFECYCLE.md to the docs table in
SKILL.md.

No code changes — all 77 existing tests continue to pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…_LIFECYCLE.md

completed is in CLEANUP_ELIGIBLE_LIFECYCLE_STATES — the table and prose
paragraph had it wrong. Aligns with the frozenset at orchestrator.py:69-71.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
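
For readers without orchestrator.py open, the membership check this commit corrects can be pictured roughly as follows. The constant name comes from the commit message and its three members from the PR description; the `is_cleanup_eligible` helper is hypothetical, and the real frozenset at orchestrator.py:69-71 may differ in detail.

```python
# Assumed mirror of the frozenset referenced in the commit message; the
# members are the three states the PR description lists as cleanup-eligible.
CLEANUP_ELIGIBLE_LIFECYCLE_STATES = frozenset(
    {"completed", "failed_terminal", "abandoned"}
)


def is_cleanup_eligible(state: str) -> bool:
    """Return True if a workdir in this lifecycle state may be removed."""
    return state in CLEANUP_ELIGIBLE_LIFECYCLE_STATES


if __name__ == "__main__":
    for state in ("completed", "timed_out_resumable", "failed_resumable"):
        print(state, is_cleanup_eligible(state))
```

Resumable states such as timed_out_resumable and failed_resumable fall outside the set, which is why their workdirs are kept.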
…nd config docs

- Required fields table: issue type now shows int|"TBD" with note about auto-create
- Optional fields table: add timeout_policy and max_runtime with defaults
- JSON config example: show the two optional timeout fields
- Methods list: add max_runtime/timeout_policy kwargs to add() signature

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions (Bot, Contributor) commented Apr 3, 2026

Repo Guardian - Passed

All 3 changed files are durable technical reference documentation. No ephemeral content detected.

| File | Assessment |
| --- | --- |
| .claude/skills/multitask/SKILL.md | Durable skill doc: minor table update linking to new lifecycle reference |
| .claude/skills/multitask/TIMEOUT_LIFECYCLE.md | Durable architecture reference: lifecycle state machine, timeout policies, resumption behavior, state file schema, security details |
| .claude/skills/multitask/reference.md | Durable API reference: adds timeout_policy and max_runtime field documentation |

No action required.

Generated by Repo Guardian for issue #4186

github-actions (Bot, Contributor) commented Apr 3, 2026

PR Triage Report

Risk: 🟢 Low | Priority: Low | Type: Documentation

Assessment

Adds TIMEOUT_LIFECYCLE.md documenting the full multitask orchestrator timeout/recovery lifecycle state machine, updates reference.md with timeout_policy/max_runtime config fields, and minor SKILL.md cross-reference update. Investigation-driven (ws-4181 confirmed 7200s default, interrupt-preserve policy, resumable state model).

⚠️ Base drift: Base SHA 0d7507c6 is significantly behind current main (9a218fe7). Rebase required before merge.

Test coverage: 19 multitask/timeout tests pass.

Recommendation

Low risk; rebase required. Documentation-only changes with good test validation. Rebase to current main and merge.

Generated by PR Triage Agent

This was referenced Apr 13, 2026
Development

Successfully merging this pull request may close these issues.

implement-defect-fixes
