Run approved AI repo changes one item at a time in isolated git worktrees, with verification, dual audits, explicit human/external stop points, and an additive supervisory control plane for truthful live monitoring and bounded automatic recovery.
Plan Orchestrator stays intentionally boring:
- one reviewed markdown_playbook_v1 file as the public input contract
- one orchestrator-owned worktree per item attempt
- verification before either audit lane
- Codex + Claude auditing the same frozen packet
- deterministic findings merge before triage
- explicit passed, awaiting_human_gate, blocked_external, and escalated terminals
- local/offline-first defaults and no agent-owned git operations
The new supervisory layer wraps that kernel; it does not replace it.
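Of the kernel steps listed above, the deterministic findings merge is the easiest to picture in code. The following is a minimal sketch, not the orchestrator's actual merge: the Finding fields, the de-duplication key, and the output shape are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical finding shape; the real audit report schema may differ.
    path: str
    line: int
    message: str


def merge_findings(codex: list[Finding], claude: list[Finding]) -> list[dict]:
    """Merge two audit lanes into one stable, de-duplicated list."""
    merged: dict[Finding, set[str]] = {}
    for lane, findings in (("codex", codex), ("claude", claude)):
        for finding in findings:
            merged.setdefault(finding, set()).add(lane)
    # Sort on content only, so lane order and arrival order never matter.
    ordered = sorted(merged, key=lambda f: (f.path, f.line, f.message))
    return [
        {"path": f.path, "line": f.line, "message": f.message,
         "lanes": sorted(merged[f])}
        for f in ordered
    ]
```

Because the result depends only on the content of the findings, feeding the same frozen packet through both lanes twice should give triage the same list both times.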
The supervisory control plane adds three new commands:
python automation/run_plan_orchestrator.py supervise run ...
python automation/run_plan_orchestrator.py supervise resume ...
python automation/run_plan_orchestrator.py supervise status --run-id <RUN_ID>
Those commands add:
- a long-lived parent supervisor around the current kernel
- nonce-based live-attachment probes with fail-closed liveness claims
- schema-validated supervision artifacts under the run root
- bounded diagnose → doctor --fix-safe → resume automation for recoverable non-manual stops
- truthful waiting observation for manual gates and blocked external evidence
- a separate supervisory status plane that does not redefine status or doctor
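The bounded recovery loop in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the supervisor's code: the three callables stand in for reading the kernel stop, running doctor --fix-safe, and running resume; the attempt limit of 2 and the "interrupted" stop name used in the usage note are hypothetical; and the sketch assumes every listed terminal is left alone.

```python
from typing import Callable

# Terminal names come from the README. Treating all of them as "do not
# auto-recover" is an assumption of this sketch; the real rules may differ.
TERMINALS = {"passed", "awaiting_human_gate", "blocked_external", "escalated"}


def recover(
    read_stop: Callable[[], str],
    fix_safe: Callable[[], bool],
    resume: Callable[[], None],
    max_attempts: int = 2,  # hypothetical bound
) -> str:
    """Try fix-safe then resume, at most max_attempts times."""
    for _ in range(max_attempts):
        stop = read_stop()
        if stop in TERMINALS:
            return stop  # done, or only observation is allowed here
        if not fix_safe():
            return stop  # safe repair did not apply; park instead of guessing
        resume()
    return read_stop()
```

For example, calling recover with a read_stop that reports "interrupted" twice and then "passed" runs the fix and resume pair twice and returns "passed", while a run parked at awaiting_human_gate returns immediately without touching either callable.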
The package preserves these kernel invariants:
- run_state.json remains the sole authoritative kernel-state file
- status remains a snapshot status view
- doctor remains the deterministic safe-repair surface
- runtime policy remains a separate provenance/tuning plane
- worktree-per-attempt isolation remains unchanged
- verification still happens before audit
- awaiting_human_gate remains the only human-only stop
python automation/run_plan_orchestrator.py list-items \
--playbook examples/launch_demo_playbook/playbook.md
python automation/run_plan_orchestrator.py show-item \
--playbook examples/launch_demo_playbook/playbook.md \
--item 01 \
--format text
python automation/run_plan_orchestrator.py status \
--run-id RUN_20260325T120000Z_deadbeef \
--format json
python automation/run_plan_orchestrator.py doctor \
--run-id RUN_20260325T120000Z_deadbeef \
--format json
python automation/run_plan_orchestrator.py supervise run \
--playbook examples/launch_demo_playbook/playbook.md \
--item 01
python automation/run_plan_orchestrator.py supervise status \
--run-id RUN_20260325T120000Z_deadbeef \
--format json \
--exit-code
See docs/supervision-guide.md for the full supervision contract.
See docs/operations-book.md for live deployment procedure, safe agent briefing, and human-gate handling.
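If you want to poll supervise status from a script rather than a shell, a small wrapper along these lines may help. It is a sketch: the helper names are hypothetical, it assumes the command is run from the repo root, and it assumes --format json writes a single JSON object to stdout. The meaning of each exit code is not restated here; docs/supervision-guide.md covers exit codes.

```python
import json
import subprocess
import sys


def supervise_status_argv(run_id: str) -> list[str]:
    """Build the documented `supervise status` command line."""
    return [
        sys.executable,
        "automation/run_plan_orchestrator.py",
        "supervise", "status",
        "--run-id", run_id,
        "--format", "json",
        "--exit-code",
    ]


def read_supervise_status(run_id: str) -> tuple[int, dict]:
    """Run the command and return (exit code, parsed JSON payload)."""
    proc = subprocess.run(
        supervise_status_argv(run_id), capture_output=True, text=True
    )
    # Assumption: stdout carries one JSON object when --format json is set.
    payload = json.loads(proc.stdout) if proc.stdout.strip() else {}
    return proc.returncode, payload
```

The wrapper deliberately does not interpret the payload or the exit code; it only hands both back to the caller.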
These commands keep their existing meanings:
- run
- resume
- status
- doctor
- mark-manual-gate
- refresh-run
Use them when you want direct kernel execution, snapshot inspection, or the authoritative manual-gate write boundary.
Use these when you want real operator/live-run truth:
- supervise run
- supervise resume
- supervise status
supervise status reports both planes:
- kernel plane: from run_state.json and status
- supervision plane: from fresh heartbeats, probe evidence, and current bridge state
Plan Orchestrator is:
- not a planner that invents work
- not a generic chat shell
- not a web-browsing agent
- not a replacement kernel state machine
- not a second authoritative state plane
- not an auto-approver for manual gates
awaiting_human_gate remains the only human-only stop.
Operational note: today this is a workflow boundary, not a strong authenticated write boundary. Treat mark-manual-gate as a privileged human-held command and brief worker agents accordingly. See docs/operations-book.md.
Humans still own:
- approving or rejecting the gate
- recording that decision with mark-manual-gate
- deciding whether to operate outside the normal workflow when the supervisor parks a non-recoverable case
The supervisor may only continue after a human decision is already recorded and resume semantics remain truthful.
Run-control artifacts:
.local/automation/plan_orchestrator/runs/<RUN_ID>/
Model JSON reports:
.local/ai/plan_orchestrator/runs/<RUN_ID>/
Per-item worktrees:
.local/automation/plan_orchestrator/worktrees/<RUN_ID>/item-<ITEM_ID>-attempt-<N>/
New supervision artifacts:
.local/automation/plan_orchestrator/runs/<RUN_ID>/supervision/
Supervision contents:
bridge_registration.json
active_stage.json
probe_request.json
probe_ack.json
control.lock
heartbeats/<SEQ>_<TIMESTAMP>.json
interventions/<SEQ>_<ACTION>.json
invocations/<KERNEL_INVOCATION_ID>.stdout.log
invocations/<KERNEL_INVOCATION_ID>.stderr.log
Use the right surface for the right question:
- What does the authoritative saved run say?
  Use status, doctor, and run_state.json.
- Can the operator loop still prove fresh live attachment right now?
  Use supervise status.
If the supervisor cannot prove fresh attachment, it downgrades to attachment_unproven or snapshot_only. It does not keep claiming live supervision from stale evidence.
The no-credential inspection path only needs Python and the repo checkout.
A full run, resume, mark-manual-gate, supervise run, or supervise resume walkthrough expects:
- Python 3.10, 3.11, or 3.12
- git, bash, codex, and claude available in PATH
- Git identity configured for checkpoint commits
- a clean tracked checkout
- no unreviewed ambient agent configuration unless intentionally acknowledged
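Part of that list can be checked mechanically before a walkthrough. The sketch below is a hypothetical preflight helper, not something shipped with the package; it covers only the Python version and the four executables, and leaves the Git identity, clean-checkout, and agent-configuration checks to the operator.

```python
import shutil
import sys
from typing import Callable, Optional

REQUIRED_TOOLS = ("git", "bash", "codex", "claude")
SUPPORTED_PYTHONS = {(3, 10), (3, 11), (3, 12)}


def preflight(
    version: tuple[int, int] = sys.version_info[:2],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if version not in SUPPORTED_PYTHONS:
        problems.append(f"unsupported Python {version[0]}.{version[1]}")
    # shutil.which returns None when the executable is not on PATH.
    problems += [
        f"{tool} not found in PATH"
        for tool in REQUIRED_TOOLS
        if which(tool) is None
    ]
    return problems
```

Calling preflight() with no arguments checks the running interpreter and the real PATH; the parameters exist so the check can be exercised without those tools installed.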
The runtime remains reproducibility-first and local/offline-first:
- no agent-owned git operations
- no implicit web browsing by execution, audit, triage, fix, or remediation
- no destructive reset/clean/rebase/squash automation
- no automatic manual-gate approvals or rejections
- no fabricated external evidence
- only a passed item, or an item later approved through a manual gate, can advance the run branch
- docs/playbook-contract.md: public input contract
- docs/operations-book.md: deployer runbook, safe agent briefing patterns, and human-gate protocol
- docs/operator-guide.md: kernel and operator surface
- docs/troubleshooting.md: snapshot + supervision troubleshooting
- docs/release-checklist.md: rollout checklist
- docs/supervision-guide.md: live supervision contract, artifacts, and exit codes
- docs/demo-run.md: existing kernel demo flow
- docs/launch-proof.md: historical proof captures
After copying the repo into its standalone home, run:
python -m unittest discover -s automation/plan_orchestrator/tests -t .
Capture that command's output as the package verification record for the extracted repo.
For supervisory-lane verification after apply, see docs/supervision-guide.md#15-post-apply-verification.