A framework for humans who orchestrate multiple AI coding agents.
Structured handoff · Trust calibration · Continuous improvement
| Category | Tools |
|---|---|
| ⌨️ CLI | Claude Code (Anthropic), Codex CLI (OpenAI), Gemini CLI, Kimi Code (Moonshot AI) |
| 🖥️ IDE | Cursor (AI IDE), Windsurf (Codeium), Antigravity (Google DeepMind), GitHub Copilot Chat |
| Open source | Aider, OpenCode, Continue, plus any tool that reads/writes files |
Conductor is file-protocol based — any AI coding tool that can read and write files is compatible.
You're running Claude Code in one terminal, Cursor in another, maybe Codex for review — all on different projects. By end of day:
- 🤯 You can't remember what each agent decided
- 🔁 New sessions repeat mistakes from yesterday
- 📂 HANDOFF notes, devlogs, and error logs are scattered everywhere
- 🤔 You don't know which agent to trust for which task
Conductor brings order to this chaos.
We identified 12 dimensions that matter when a human works with multiple AI agents:
| # | Dimension | What It Covers |
|---|---|---|
| 1 | Handoff Management | Passing context between sessions without loss |
| 2 | Knowledge Capture | Recording decisions, errors, and QA pairs |
| 3 | Trust Calibration | Knowing when to verify vs. trust the agent |
| 4 | Cognitive Load | Managing your mental bandwidth across agents |
| 5 | Prompt Quality | Improving how you communicate with agents |
| 6 | Agent Profiling | Tracking each agent's strengths and weaknesses |
| 7 | Tool Selection | Picking the right agent for the right task |
| 8 | Feedback Loops | Turning errors into prevention rules |
| 9 | Attention Allocation | Deciding which project needs you right now |
| 10 | Disagreement Resolution | Handling conflicting agent advice |
| 11 | Cross-Agent Consistency | Keeping agents aligned on decisions |
| 12 | Energy Modeling | Adjusting oversight based on your fatigue |
Read more: docs/
| Template | Purpose |
|---|---|
| `HANDOFF.md` | Session-end context handoff |
| `CLAUDE.md` | Agent rules with handoff protocol |
| `ERROR_BOOK.md` | AI mistake tracker for trust calibration |
| `TRUST_PROFILE.md` | Agent reliability scorecard |
| `DESIGN.md` | UI design system for consistent agent output |
$ conductor status
🎵 Conductor · Project Status
┌─────────────┬────────────┬────────┬──────────────────────┐
│ Project │ Last Active│ Status │ Next Step │
├─────────────┼────────────┼────────┼──────────────────────┤
│ wenyuan │ 6h ago │ ✅ │ Refactor README │
│ network-opt │ 2h ago │ ✅ │ VPS setup │
│ conductor │ 30m ago │ ✅ │ Write tests │
│ old-project │ 3d ago │ 🔴 │ Archive or continue? │
└─────────────┴────────────┴────────┴──────────────────────┘
📅 2026-04-09 │ 4 projects │ 5 decisions │ 12 files Δ

$ conductor digest ./my-project
🎵 Conductor · Project Digest
📁 my-project │ 📅 2026-04-01 → 2026-04-09 │ 🔄 5 sessions
📋 Decisions Made
┌────────────┬──────────────────────────────┐
│ 2026-04-09 │ Chose JWT over sessions │
│ 2026-04-08 │ Python 3.9+ compatibility │
└────────────┴──────────────────────────────┘
⚠️ Errors & Pitfalls
┌────────────┬──────────────────────────────┐
│ 2026-04-09 │ bcrypt 5.x broke hashes │
└────────────┴──────────────────────────────┘

$ conductor memory add "Uses FastAPI + PostgreSQL" -t fact -g backend
✅ Memory #1 added (fact)
$ conductor memory search "FastAPI"
🔍 matches: #1 [fact] Uses FastAPI + PostgreSQL

$ conductor verify .
Verification passed: /path/to/project
$ conductor verify --json . | python -m json.tool
{
  "project_path": "/path/to/project",
  "ok": true,
  "errors": [],
  "warnings": []
}

| Command | Purpose |
|---|---|
| `conductor status --dir <path>` | Multi-project status dashboard |
| `conductor init [path]` | Scaffold the v1.0 file/SOP protocol files |
| `conductor digest [path]` | Extract decisions and errors from project history |
| `conductor retro [path]` | Run interactive agent retrospective |
| `conductor memory ...` | Persistent cross-session knowledge store |
| `conductor verify [path]` | Validate the Conductor file and SOP protocol |
| `conductor verify --json [path]` | Emit a machine-readable verification report for automation/CI |
| `conductor orbit ...` | Inspect Orbit dispatch files (print-only) |
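One way to act on the `conductor verify --json` report shown earlier is a small gate script. The sketch below is not part of Conductor: `summarize_report` is a hypothetical helper, and it assumes only the four top-level keys from the sample report (`project_path`, `ok`, `errors`, `warnings`). The shape of individual error and warning entries is not documented here, so they are printed as-is.

```python
import json


def summarize_report(report_text: str) -> tuple[int, list[str]]:
    """Turn a verify JSON report into (exit_code, messages).

    Assumes the top-level keys seen in the sample report; entries inside
    "errors" and "warnings" are formatted without inspecting their shape.
    """
    report = json.loads(report_text)
    messages = [f"ERROR: {issue}" for issue in report.get("errors", [])]
    messages += [f"WARNING: {issue}" for issue in report.get("warnings", [])]
    # Fail if the report says so, or if any error is listed.
    passed = bool(report.get("ok")) and not report.get("errors")
    return (0 if passed else 1), messages


sample = (
    '{"project_path": "/path/to/project", "ok": true, '
    '"errors": [], "warnings": []}'
)
exit_code, messages = summarize_report(sample)
print(exit_code, messages)
```

In a CI job, the text passed in would be the standard output of `conductor verify --json .`, and the returned code could be handed to `sys.exit`.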
- Copy `templates/HANDOFF.md.template` to your project as `HANDOFF.md`
- Copy the handoff protocol from `templates/CLAUDE.md.template` into your project's `CLAUDE.md`
- Tell your AI: "Read HANDOFF.md before starting. Update it before ending."
pip install conductor-ai
conductor init ./my-project
conductor status

For existing projects, see the v1.0 migration guide.
Every session ends with a structured handoff:
## 2026-04-09
- done: Implemented user auth module
- decisions: Chose JWT over sessions (stateless, scales better)
- pitfall: bcrypt 5.x changed default rounds — broke existing hashes
- next: Add password reset flow

Keep each handoff entry under 500 tokens. If you can't summarize it, you don't understand it.
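The handoff entry above is regular enough to generate and check mechanically. This is a minimal sketch, not Conductor's own code: `render_handoff` is a hypothetical helper, and the 500-token budget is approximated by counting whitespace-separated words, which is an assumption (real tokenizers count differently).

```python
from datetime import date

MAX_TOKENS = 500  # budget from the text; approximated below as a word count


def render_handoff(day: date, done: str, decisions: str,
                   pitfall: str, next_step: str) -> str:
    """Render one handoff entry in the done/decisions/pitfall/next layout."""
    entry = "\n".join([
        f"## {day.isoformat()}",
        f"- done: {done}",
        f"- decisions: {decisions}",
        f"- pitfall: {pitfall}",
        f"- next: {next_step}",
    ])
    approx_tokens = len(entry.split())  # crude stand-in for a tokenizer
    if approx_tokens > MAX_TOKENS:
        raise ValueError(
            f"handoff is about {approx_tokens} tokens; limit is {MAX_TOKENS}"
        )
    return entry


entry = render_handoff(
    date(2026, 4, 9),
    done="Implemented user auth module",
    decisions="Chose JWT over sessions (stateless, scales better)",
    pitfall="bcrypt 5.x changed default rounds and broke existing hashes",
    next_step="Add password reset flow",
)
print(entry)
```

Rejecting an oversized entry, rather than truncating it, keeps the pressure on the writer to summarize.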
Don't blindly trust or distrust your AI. Calibrate per domain:
| Layer | Method |
|---|---|
| L1 | Verify outcomes — does the code run? |
| L2 | Cross-verify — have another agent review |
| L3 | Progressive trust — try on one file first |
| L4 | Demand explanation — ask WHY, not just WHAT |
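Per-domain calibration can be made concrete by counting outcomes per agent and domain and letting the error rate pick the verification layers. The sketch below is illustrative only: `TrustLedger`, the 20% threshold, and the mapping from error rate to layers are all assumptions, not rules defined by Conductor.

```python
from collections import defaultdict
from typing import Optional


class TrustLedger:
    """Hypothetical tally of task outcomes per (agent, domain) pair."""

    def __init__(self) -> None:
        # (agent, domain) -> [tasks_seen, tasks_with_errors]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, agent: str, domain: str, had_error: bool) -> None:
        entry = self._counts[(agent, domain)]
        entry[0] += 1
        entry[1] += int(had_error)

    def error_rate(self, agent: str, domain: str) -> Optional[float]:
        tasks, errors = self._counts.get((agent, domain), (0, 0))
        return None if tasks == 0 else errors / tasks

    def suggested_layers(self, agent: str, domain: str) -> list[str]:
        rate = self.error_rate(agent, domain)
        if rate is None:
            # No history yet: verify outcomes and start on one file.
            return ["L1", "L3"]
        if rate > 0.2:  # assumed threshold for "needs more scrutiny"
            return ["L1", "L2", "L4"]
        return ["L1"]


ledger = TrustLedger()
for _ in range(10):
    ledger.record("agent-a", "sql-migrations", had_error=False)
ledger.record("agent-a", "sql-migrations", had_error=True)
print(ledger.suggested_layers("agent-a", "sql-migrations"))
print(ledger.suggested_layers("agent-a", "frontend"))
```

The same counts could be written back into a file such as `TRUST_PROFILE.md` so that the calibration survives across sessions.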
Not every task needs a full planning cycle:
| Size | Time | Process |
|---|---|---|
| S | < 30min | Just do it → test → commit → HANDOFF |
| M | 1-3h | Brief plan → execute → HANDOFF |
| L | > 3h | Brainstorm → plan → TDD → review → HANDOFF |
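The sizing table maps directly onto a lookup. In this sketch `classify_task` and `PROCESSES` are hypothetical names. The table leaves estimates between 30 minutes and 1 hour unassigned, so treating them as M is an assumption made here.

```python
# Process steps per size, copied from the table above.
PROCESSES = {
    "S": ["just do it", "test", "commit", "HANDOFF"],
    "M": ["brief plan", "execute", "HANDOFF"],
    "L": ["brainstorm", "plan", "TDD", "review", "HANDOFF"],
}


def classify_task(estimated_minutes: float) -> str:
    """Return "S", "M", or "L" for an effort estimate in minutes."""
    if estimated_minutes < 30:
        return "S"
    if estimated_minutes <= 180:
        # Covers 1-3h, plus the 30-60 minute gap (assumed to be M).
        return "M"
    return "L"


for minutes in (20, 45, 240):
    size = classify_task(minutes)
    print(minutes, size, " -> ".join(PROCESSES[size]))
```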
| vs. | Difference |
|---|---|
| CrewAI / LangGraph | They orchestrate agent-to-agent. We orchestrate human-to-agents. |
| OpenSpec | OpenSpec manages specs within one session. We manage across sessions and agents. |
| CLAUDE.md alone | CLAUDE.md is one file. We're a complete methodology + tools. |
| Nothing | You're losing decisions, repeating mistakes, and burning context window tokens. |
- v0.1: Methodology docs + templates + `conductor status`
- v0.2: `conductor digest` (extract decisions/errors from project history)
- v0.3: `conductor retro` (interactive post-session agent review)
- v0.4: `conductor memory` (persistent cross-session knowledge store)
- v0.5: File and SOP foundation (`BACKLOG.md`, `DEVLOG.md`, normalized `HANDOFF.md`, `CONTEXT.md`, `.conductor/config.yaml`, `.conductor/sops/*.yaml`, `AGENTS.md` + projected `CLAUDE.md`)
- v0.6: `conductor verify [path]` (read-only validator for the v0.5 protocol)
- v0.7: `conductor verify --json [path]` + GitHub Actions verify workflow for CI use
- v0.8: Release hygiene and push readiness (complete locally; see `CHANGELOG.md`)
- v0.9: Read-only repair suggestions in `conductor verify`
- v0.10: Real Orbit task dogfooding
- v0.11: Supervised development-loop runbook
- v1.0: Stable protocol slice, verify JSON schema, v1.0 init scaffolding, migration guide, dogfood, and local release gate (complete locally; push pending explicit approval)
Full plan: docs/plans/2026-05-22-conductor-roadmap-v0.8-v1.0.md.
Conductor v0.5-v0.11 turn the methodology into a checkable file protocol and a supervised local development loop so multiple AI coding agents can read the same project state without re-explaining it.
- Adds `BACKLOG.md` (task pool with fenced YAML metadata), `DEVLOG.md` (append-only execution history), normalized `HANDOFF.md` (current recovery state only), and `CONTEXT.md` (stable background).
- Adds `.conductor/config.yaml` with paths, sensitive-file globs, agent declarations, and instruction-target projections.
- Adds five default SOP files under `.conductor/sops/`: `simple`, `tdd`, `spec-first`, `research-spike`, `review-fix`.
- Adds `AGENTS.md` as the canonical agent instruction entrypoint, with managed-block projections into `CLAUDE.md` (and optional `GEMINI.md` / Copilot instructions).
- File-only release: no agent execution, no daemon, no event stream.
- Schema reference: `docs/conductor-schema.md`.
- Design doc: `docs/plans/v0.5-file-and-sop-foundation.md`.
`conductor verify` is a read-only validator that checks a Conductor project against the v0.5 file protocol. It validates:
- Required files exist (`BACKLOG.md`, `DEVLOG.md`, `HANDOFF.md`, `CONTEXT.md`, `AGENTS.md`, `.conductor/config.yaml`, the five SOPs).
- `.conductor/config.yaml` and SOP YAML files parse and follow the documented schema.
- SOPs encode their distinguishing requirements (failing test, plan/spec artifact, cited feedback, etc.).
- `BACKLOG.md` tasks have valid ids and metadata.
- `HANDOFF.md` has the current-state headings.
- `DEVLOG.md` entries have the expected shape.
- `AGENTS.md` references the read-before-work artifacts.
- `CLAUDE.md` (and other projected instruction files) contain valid managed-block markers.
conductor verify .
# Verification passed: /path/to/project

Exits non-zero when validation errors exist.
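The first check in that list, required files, is simple enough to sketch. This is not how `conductor verify` is implemented. `missing_required_files` is a hypothetical helper, and the SOP file names (`simple.yaml` and so on) are an assumption derived from the `.conductor/sops/*.yaml` pattern and the five SOP names listed earlier.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = [
    "BACKLOG.md", "DEVLOG.md", "HANDOFF.md", "CONTEXT.md",
    "AGENTS.md", ".conductor/config.yaml",
]
# Assumed file naming: one <name>.yaml per default SOP.
SOP_NAMES = ["simple", "tdd", "spec-first", "research-spike", "review-fix"]


def missing_required_files(project: Path) -> list[str]:
    """Return the required relative paths that do not exist under project."""
    expected = REQUIRED_FILES + [
        f".conductor/sops/{name}.yaml" for name in SOP_NAMES
    ]
    return [rel for rel in expected if not (project / rel).is_file()]


# Demo on a throwaway directory: everything is missing, then nothing is.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = missing_required_files(root)
    for rel in before:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    after = missing_required_files(root)

print(len(before), after)
```

Like the validator described above, the check only reads the project tree; the demo writes files solely inside its own temporary directory.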
- `conductor verify --json [path]` emits a machine-readable verification report so CI and other automations can consume the result without scraping terminal output.
- `.github/workflows/verify.yml` runs install, pytest, human-readable verify, and JSON verify on pull requests and pushes to `main`. No publishing, no agent execution, no secrets.

conductor verify --json . | python -m json.tool

- Public docs, release notes, and push checklist now describe the v0.5-v0.7 protocol work waiting for manual push.
- `docs/push-readiness.md` records the tracked-file sensitive scan, workflow review, verification commands, and manual push gate.
- Package version remains `0.4.0`; protocol milestones are not automatic PyPI releases.
- `conductor verify` issues now include deterministic read-only repair suggestions in human and JSON output.
- Suggestions do not modify files automatically and preserve existing issue codes and exit-code behavior.
- The real Orbit dispatch file was dogfooded with `TASK-20260523-001`, a local Conductor push-readiness audit task.
- `conductor orbit status`, `conductor orbit next`, and `conductor orbit launch <task_id>` were verified against that task.
- `orbit launch` remains print-only; the printed command is not executed by Conductor.
- `docs/runbooks/development-loop.md` defines the local long-running workflow.
- Current role split: Codex App supervises/reviews/integrates, Codex CLI implements one bounded milestone at a time, and Claude Code is review-only for milestone gates.
- The runbook preserves local-first boundaries: no daemon, no hosted service, no automatic executor, and no automatic push.
For feature implementation against this Conductor protocol project, the current split is:
- Codex App as supervisor, reviewer, and integrator: selects one approved milestone, reviews worker output, reruns verification independently, and records integration state.
- Codex CLI as implementation worker: reads `AGENTS.md`, `HANDOFF.md`, `CONTEXT.md`, `BACKLOG.md`, `.conductor/config.yaml`, and the selected SOP; works in an isolated worktree; implements exactly one bounded milestone; commits and stops.
- Claude Code as review-only gate: reviews the just-completed milestone and reports accepted/rejected with findings, verification evidence, and residual risk. It is not the default implementation worker.
See AGENTS.md for the full protocol.
The v1.0 protocol slice is complete locally. It includes the stable protocol
document, verify JSON schema, v1.0 conductor init scaffolding, migration
guide, clean-project dogfood, and the local release gate.
Pushing remains separate: reviewed commits may be pushed only after explicit user approval or a standing push policy recorded in project state.
The local branch in this repository runs ahead of GitHub by several protocol milestones (v0.5 through v1.0) that have not been pushed yet.
- Release notes for those milestones live in `CHANGELOG.md`.
- The push checklist (sensitive-file scan, verification commands, manual confirmation) lives in `docs/push-readiness.md`.
- Quick sensitive-file scan against tracked files:

git ls-tree -r --name-only HEAD | grep -E "(\.env|secret|token|key)" && echo "REVIEW REQUIRED" || echo "clean"

Pushing to origin/main is a manual step that must happen only after the checklist passes and after explicit user confirmation, unless a standing push policy is explicitly recorded in project state. Automations must not push on their own.
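The quick scan above is a plain substring match, which is easy to mirror in Python for use inside a larger check. A minimal sketch under stated assumptions: `flag_sensitive` and the sample paths are hypothetical, and in practice the path list would come from `git ls-tree -r --name-only HEAD`.

```python
import re

# Same pattern as the grep -E call: case-sensitive, matches anywhere in the path.
SENSITIVE = re.compile(r"(\.env|secret|token|key)")


def flag_sensitive(paths: list[str]) -> list[str]:
    """Return the tracked paths that need a manual look before pushing."""
    return [path for path in paths if SENSITIVE.search(path)]


tracked = [
    "README.md",
    "src/app.py",
    ".env.example",
    "docs/tokenizer-notes.md",  # false positive: contains "token"
    "config/api_key.txt",
]
flagged = flag_sensitive(tracked)
print("REVIEW REQUIRED" if flagged else "clean", flagged)
```

Because the pattern matches substrings, harmless names such as `tokenizer-notes.md` are flagged too. That errs on the side of review, which fits a manual push gate.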
"If you just drive the AI to work and walk away, you'll never know what you don't know. The original sin of AI-assisted development is not reviewing, not reflecting, not improving."
Conductor is built on three principles:
- Structure over ceremony — Lightweight protocols that actually get followed, not heavy processes that get skipped.
- Observe, then trust — Build trust through data (error books, trust profiles), not assumptions.
- The human improves too — It's not just about making AI better. It's about making you better at working with AI.
Contributions welcome! Please read the methodology docs first to understand the philosophy.