diff --git a/plugins/orchestrator/skills/getting-started/SKILL.md b/plugins/orchestrator/skills/getting-started/SKILL.md
index d848ca1..da225ea 100644
--- a/plugins/orchestrator/skills/getting-started/SKILL.md
+++ b/plugins/orchestrator/skills/getting-started/SKILL.md
@@ -35,7 +35,26 @@ If for any reason you cannot find your session_id in the startup context, ask th
 
 ## Step 1 — Briefing
 
-Call `briefing({ event: "startup", session_id: "" })` to get the session orientation (open threads, recent decisions, work items, user profile, last checkpoint, cross-session activity from sibling sessions, AND a `curation_candidates` section surfacing stale notes worth maintaining). Default output covers all sections; pass `sections: [...]` to narrow. Scan it internally - including `curation_candidates` - and schedule maintenance opportunities alongside your task. Do NOT dump the full briefing to the user - only mention items directly relevant to their task.
+Call `briefing` with a tight `sections` filter to keep bootstrap-token consumption low:
+
+```
+briefing({
+  event: "startup",
+  session_id: "",
+  sections: ["work_items","open_threads","decisions","checkpoint","user_model","cross_session"]
+})
+```
+
+This returns the session orientation (open threads, recent decisions, work items, user profile, last checkpoint, and cross-session activity from sibling sessions). Scan it internally and only mention items directly relevant to the user's task — do NOT dump the full briefing.
+
+**Why the `sections` filter?** Default rendering (no `sections`) also includes `neglected_areas` (tag soup, often ~3K tokens of stale tags), `drift`, and `curation_candidates` — these inflate routine bootstrap by 5-15K tokens without serving your immediate goal. Fetch them on explicit maintenance turns instead:
+
+```
+// Maintenance call — only when you're explicitly doing KB hygiene
+briefing({ event: "startup", session_id: "", sections: ["neglected","drift","curation_candidates"] })
+```
+
+The `orchestrator:reflect` skill calls these maintenance sections automatically; you typically don't need them at routine bootstrap.
 
 On the first startup of a week (seven days since the last maintenance pass), the briefing may be prepended with a `## Auto-Retro` section. That's automatic maintenance: the orchestrator inline-invokes `retro` on a 7-day cadence so stale signal decays, orphans get flagged, and the knowledge base stays coherent without requiring the agent to remember. This is expected, not a surprise - scan the summary for anything actionable (broken code_refs, revalidation queue) and fold it into your maintenance plan.
 
@@ -45,12 +64,12 @@ If the Cross-Session Activity section is non-empty, note anything that affects y
 
 Check `process.env.ORCHESTRATOR_AGENT_ROLE` (or the legacy `SPAWNBOX_AGENT_ROLE`):
 
-- `prime` → You are the **PrimeAgent** for this project. Run `/pa-bootstrap` next (it sets `/model claude-opus-4-7`, `/effort max`, reads sessions.json, loads `agents/prime-agent.md`). Do not proceed past the bootstrap until that's done.
+- `prime` → You are the **PrimeAgent** for this project. Run `/pa-bootstrap` next (it sets `/model claude-opus-4-7`, `/effort max`, reads the agent-channel SQLite DB, loads `agents/prime-agent.md`). Do not proceed past the bootstrap until that's done.
 - `subordinate` (or unset) → You are a **Subordinate Agent (SA)**. The project's CLAUDE.md and the orchestrator plugin's CLAUDE.md describe your operating contract: PA's directives addressed to you (`@SA-` or unaddressed PA dialogue) are treated as the user's voice unless you're under `/pa-pause`. Address peers via `@PA` / `@SA-` / `@all` in your terminal output - the agent-channel filewatcher routes via `notifications/claude/channel`. **No `send_message` tool exists in 0.29.0+** - communication is purely terminal-output + filewatcher routing.
 
 ## Step 3 — Broadcast your task to peers
 
-If your briefing showed any active sibling sessions, OR if the user's request touches code that's likely to overlap with parallel work, call `update_session_task("")` now. This writes your `current_task` into `session_registry` (and into `sessions.json` for the agent-channel filewatcher) so:
+If your briefing showed any active sibling sessions, OR if the user's request touches code that's likely to overlap with parallel work, call `update_session_task("")` now. This writes your `current_task` into `session_registry` (and into the agent-channel DB for the filewatcher) so:
 
 - Peer sessions see what you're working on as the `from_task` field on every channel notification you generate.
 - Their next briefing's Cross-Session Activity surfaces your task.
diff --git a/plugins/orchestrator/skills/pa-bootstrap/SKILL.md b/plugins/orchestrator/skills/pa-bootstrap/SKILL.md
index 6723585..0adf356 100644
--- a/plugins/orchestrator/skills/pa-bootstrap/SKILL.md
+++ b/plugins/orchestrator/skills/pa-bootstrap/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: pa-bootstrap
-description: Bootstrap the PrimeAgent (PA) session. Run as the first command after pa-start.bat launches PA. Sets model/effort, confirms role=prime, reads active sessions from sessions.json, verifies agent-channel is wired, and outputs a readiness status line. Idempotent.
+description: Bootstrap the PrimeAgent (PA) session. Run as the first command after pa-start.bat launches PA. Sets model/effort, confirms role=prime, reads active sessions from the SQLite agent-channel DB, verifies agent-channel is wired, and outputs a readiness status line. Idempotent.
 ---
 
 # Bootstrap the PrimeAgent
@@ -21,18 +21,38 @@ These are per-session slash commands; the launcher .bat cannot pre-set them.
 
 ### 2. Confirm role and identity
 
-Read `$CLAUDE_PROJECT_DIR/.orchestrator-state/agent-channel/sessions.json`:
+The agent-channel state lives in SQLite under
+`$ORCHESTRATOR_PROJECT_ROOT/.orchestrator-state/agent-channel/agent_channel.db`
+(WAL-mode). The launchers set `$ORCHESTRATOR_PROJECT_ROOT` for you; use
+that env var rather than `$CLAUDE_PROJECT_DIR` (which is not propagated
+to PA sessions in current Claude Code).
+
+Query your own session and verify `role` is `prime`. The same query
+also lists every active session (used in Step 3):
 
 ```bash
-cat "$CLAUDE_PROJECT_DIR/.orchestrator-state/agent-channel/sessions.json"
+python3 <<'PY'
+import os, sqlite3, datetime
+db = f"{os.environ['ORCHESTRATOR_PROJECT_ROOT']}/.orchestrator-state/agent-channel/agent_channel.db"
+c = sqlite3.connect(db); c.row_factory = sqlite3.Row
+now = datetime.datetime.now(datetime.timezone.utc)
+print("=== Sessions ===")
+for r in c.execute("SELECT id8, role, name, current_task, last_heartbeat_at FROM sessions ORDER BY last_heartbeat_at DESC"):
+    hb = datetime.datetime.fromisoformat(r['last_heartbeat_at'].replace('Z', '+00:00'))
+    age = (now - hb).total_seconds()
+    status = "ACTIVE" if age < 90 else f"stale {age:.0f}s"
+    print(f"[{status}] {r['id8']} role={r['role']} name={r['name']} task={r['current_task'] or '(none)'}")
+PY
 ```
 
-Find your own session entry (matches `$CLAUDE_SESSION_ID` env var, or the
-session_id you can see in your environment). Verify `role` is `prime`.
+Your own row should appear with `role=prime` and a fresh heartbeat
+(< 60s old). That row IS the wiring confirmation — agent-channel is
+running because the MCP server is writing your heartbeat to the table.
 
 **If role is `subordinate` despite env being correct**, you're likely
 fighting an impostor MCP — another orphaned bun process from a prior
-session that's heartbeating your entry with the wrong role. Diagnose:
+session that's heartbeating your entry with the wrong role. Diagnose
+(Windows):
 
 ```powershell
 Get-CimInstance Win32_Process -Filter "Name = 'bun.exe'" |
@@ -41,7 +61,7 @@ Get-CimInstance Win32_Process -Filter "Name = 'bun.exe'" |
 ```
 
 For each `bun`, walk the parent chain to find the host `claude.exe`. If a
-bun's `started_at` in sessions.json matches your session_id but its
+bun's recent heartbeat in the DB matches your session_id but its
 ancestor `claude.exe` is launching a DIFFERENT session (e.g., the wrong
 `--resume `), kill that bun — it's an impostor. Your legitimate MCP's
 next heartbeat (~30s) restores the correct entry.
@@ -56,8 +76,8 @@ broken).
 
 ### 3. Read active subordinates
 
-From sessions.json, list every session with `role=subordinate` whose
-`last_heartbeat_at` is within the last 90 seconds.
+The Step 2 query already lists every active session. SAs are rows
+where `role=subordinate` and the heartbeat age is `ACTIVE` (< 90s).
 
 Output a status block to terminal:
 
@@ -72,11 +92,11 @@ If zero SAs active, output `No SAs currently active. Run sa-start.bat to spin on
 ### 4. Verify agent-channel is wired
 
 The MCP server should have logged `agent-channel: started as prime ...`
-to its stderr when this session started. Check by tailing the orchestrator
-plugin's recent log output OR by confirming sessions.json contains your
-own entry with role=prime + a fresh heartbeat (< 60s old).
+to its stderr when this session started. The Step 2 query already
+confirms wiring — if your own row is present with a fresh heartbeat,
+the MCP is running and writing.
 
-If you can't confirm agent-channel is running, surface the failure
+If the query returns no row for your session_id, surface the failure
 verbatim and abort - PA without agent-channel is useless.
 
 ### 5. Load PA's operating contract
@@ -108,20 +128,23 @@ user_pattern notes in the global DB encode that user's preferences, work
 habits, communication style, decision biases, and values. They persist
 across every project.
 
-Run a targeted lookup to load recent + high-confidence user_patterns
-into your working context:
+Run a targeted lookup to load recent user_patterns into your working
+context. Use `output_mode: "summary"` — you need to know WHAT exists
+so you can drill into specific notes later, not memorize full bodies:
 
 ```
 lookup({
   type: "user_pattern",
   limit: 25,
+  output_mode: "summary",
 })
 ```
 
 Skim the returned notes and internalize them. They're the rules of
 engagement for how you act on this user's behalf. The briefing's
 cross_project section also surfaces some, but this explicit lookup
-guarantees you're loaded.
+guarantees you're loaded. Re-call any specific note in full mode
+(`lookup({id: ""})`) when an SA's task intersects it.
 
 If a user-pattern lookup returns zero results, that's normal for a
 fresh user (first project, no patterns captured yet). Your job is then
@@ -138,10 +161,15 @@ macro model so SAs don't make tree-level decisions that break the forest.
 Load the project's architecture + recent decisions into working context
 so you can apply them during SA coordination.
 
+All four lookups use `output_mode: "summary"` — same reasoning as
+5.5. You need the IDs and shapes, not the full bodies; full-mode
+follow-up happens when an SA's task intersects a specific note.
+
 ```
 lookup({
   type: "architecture",
   limit: 15,
+  output_mode: "summary",
 })
 ```
 
@@ -149,6 +177,7 @@ lookup({
 lookup({
   type: "decision",
   limit: 15,
+  output_mode: "summary",
 })
 ```
 
@@ -156,6 +185,7 @@ lookup({
 lookup({
   type: "convention",
   limit: 10,
+  output_mode: "summary",
 })
 ```
 
@@ -163,6 +193,7 @@ lookup({
 lookup({
   type: "anti_pattern",
   limit: 15,
+  output_mode: "summary",
 })
 ```
 
@@ -203,10 +234,10 @@ Scan for multi-repo references:
 
 ```bash
 # CLAUDE.md typically captures the project's repo structure
-grep -i 'repo\|repository' "$CLAUDE_PROJECT_DIR/CLAUDE.md" | head -20
+grep -i 'repo\|repository' "$ORCHESTRATOR_PROJECT_ROOT/CLAUDE.md" | head -20
 
 # docs/ may have architecture overviews
-ls "$CLAUDE_PROJECT_DIR/docs/" 2>/dev/null | head
+ls "$ORCHESTRATOR_PROJECT_ROOT/docs/" 2>/dev/null | head
 ```
 
 Also lookup architecture notes for cross-repo references:
@@ -216,6 +247,7 @@ lookup({
   type: "architecture",
   query: "repo OR landing OR worker OR plugin OR cross-repo",
   limit: 10,
+  output_mode: "summary",
 })
 ```
 
@@ -237,15 +269,24 @@ and apply it proactively.
 
 ### 6. Check for any existing global pause
 
-Read `state.json` (same dir as sessions.json):
+Query the `global_pause` table in the agent-channel DB:
 
 ```bash
-cat "$CLAUDE_PROJECT_DIR/.orchestrator-state/agent-channel/state.json"
+python3 <<'PY'
+import os, sqlite3
+db = f"{os.environ['ORCHESTRATOR_PROJECT_ROOT']}/.orchestrator-state/agent-channel/agent_channel.db"
+c = sqlite3.connect(db); c.row_factory = sqlite3.Row
+rows = list(c.execute("SELECT * FROM global_pause WHERE active=1"))
+if rows:
+    for r in rows: print("global pause:", dict(r))
+else:
+    print("(no global pause active)")
+PY
 ```
 
-If `pa_global_pause.active` is true, mention it explicitly in the
-readiness output and ask the user if you should clear it. Typically yes
-(they just spawned a fresh PA), but their call.
+If a global pause is active, mention it explicitly in the readiness
+output and ask the user if you should clear it. Typically yes (they
+just spawned a fresh PA), but their call.
 
 ### 7. Output readiness
 
@@ -264,14 +305,15 @@ Override state: .
 - Do NOT call any messaging tool. `send_message`, `read_messages`,
   `peek_inbox` were deleted in 0.29.0. Cross-session communication is via
   terminal output + agent-channel notifications.
-- Do NOT silently recover from a missing/malformed sessions.json. If the
-  state isn't sane at startup, surface the problem to the user and ask.
-  PA's job depends on accurate visibility into the project's session graph.
+- Do NOT silently recover from a missing/malformed agent-channel DB. If
+  the state isn't sane at startup, surface the problem to the user and
+  ask. PA's job depends on accurate visibility into the project's
+  session graph.
 
 ## On idempotency
 
 Re-running `/pa-bootstrap` mid-session is safe:
 - `/model` and `/effort` calls are no-ops if already at the target.
-- sessions.json read is read-only at this stage.
-- state.json is only modified if you choose to clear an existing global
+- Step 2 query is read-only.
+- Step 6 only modifies state if you choose to clear an existing global
   pause (the user's call).
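
Note on the pa-bootstrap change, Steps 2 and 3: the rule "SAs are rows where `role=subordinate` and the heartbeat age is `ACTIVE` (< 90s)" can be checked without a live agent-channel DB. The following is a minimal sketch, not the plugin's code. The `sessions` column names and the 90-second threshold come from the Step 2 query in the diff; the helper names (`heartbeat_age`, `active_subordinates`, `build_demo_db`), the in-memory database, the fixed clock, and the sample rows are hypothetical and exist only for illustration.

```python
import datetime
import sqlite3

# Threshold taken from the Step 2 query in the diff ("ACTIVE" if age < 90).
ACTIVE_WINDOW_S = 90

# Fixed "now" so the demo is deterministic (hypothetical value).
DEMO_NOW = datetime.datetime(2025, 1, 1, 12, 0, 0, tzinfo=datetime.timezone.utc)


def heartbeat_age(last_heartbeat_at: str, now: datetime.datetime) -> float:
    """Return seconds since an ISO-8601 heartbeat; a trailing 'Z' is read as UTC."""
    hb = datetime.datetime.fromisoformat(last_heartbeat_at.replace("Z", "+00:00"))
    return (now - hb).total_seconds()


def active_subordinates(conn: sqlite3.Connection, now: datetime.datetime) -> list:
    """Return id8 values of subordinate sessions with a heartbeat inside the window."""
    rows = conn.execute(
        "SELECT id8, role, last_heartbeat_at FROM sessions "
        "ORDER BY last_heartbeat_at DESC"
    )
    return [
        r["id8"]
        for r in rows
        if r["role"] == "subordinate"
        and heartbeat_age(r["last_heartbeat_at"], now) < ACTIVE_WINDOW_S
    ]


def build_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the sessions table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE sessions (id8 TEXT, role TEXT, name TEXT, "
        "current_task TEXT, last_heartbeat_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO sessions VALUES (?, ?, ?, ?, ?)",
        [
            ("aaaa1111", "prime", "PA", None, "2025-01-01T11:59:50Z"),
            ("bbbb2222", "subordinate", "SA-1", "docs pass", "2025-01-01T11:59:30Z"),
            ("cccc3333", "subordinate", "SA-2", None, "2025-01-01T11:55:00Z"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(active_subordinates(build_demo_db(), DEMO_NOW))
```

Passing `now` in explicitly, rather than reading the clock inside the helper, keeps the check reproducible. In a real session the caller would pass `datetime.datetime.now(datetime.timezone.utc)`, as the Step 2 query does.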