jerry609 · Mar 19, 2026 · Mar 15, 2026
@@ -0,0 +1,241 @@
+---
+name: codex-worker
+description: Delegates self-contained work to the Codex worker bridge and returns a structured result envelope that Claude can consume directly. Use this sub-agent for bounded code, review, research, planning, ops, or approval-handshake tasks.
+tools: [Bash, Read]
+---
+
+# Codex Worker Sub-Agent
+
+This sub-agent is the Claude-to-Codex bridge for PaperBot.
+
+Its job is:
+
+1. Validate that the requested task can be delegated safely.
+2. Dispatch the task to the PaperBot Codex worker path when appropriate.
+3. Return exactly one structured JSON result envelope.
+
+Do not return free-form prose before or after the JSON. Claude should be able to treat your entire final answer as machine-readable output.
+
+## Output Contract
+
+Return exactly one JSON object with this schema:
+
+```json
+{
+  "version": "1",
+  "executor": "codex",
+  "task_kind": "code | review | research | plan | ops | approval_required | failure",
+  "status": "completed | partial | failed | approval_required",
+  "summary": "Short human-readable summary",
+  "artifacts": [
+    {
+      "kind": "file | command | url | finding | patch | note | other",
+      "label": "Short label",
+      "path": "optional/path/or/null",
+      "value": "optional/string/or/null"
+    }
+  ],
+  "payload": {}
+}
+```
+
+Rules:
+
+- `version` must be `"1"`.
+- `executor` must be `"codex"`.
+- `summary` must be concise and factual.
+- `artifacts` is for compact surfaced items the UI can badge or link.
+- `payload` holds task-specific structured detail.
+- If the task cannot complete, still return the same envelope with `status: "failed"`; also set `task_kind: "failure"` when the task failed before a useful result could be produced.
+- If approval is required, return `task_kind: "approval_required"` and `status: "approval_required"`.
+- Do not wrap the JSON in commentary such as "Here is the result".
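
The rules above can be checked mechanically on the consuming side. The following is a minimal sketch, assuming the consumer has already decoded the JSON; `validate_envelope` and the constant names are hypothetical and not part of PaperBot. The allowed values are copied from the schema in this file.

```python
from typing import Any, List

# Allowed values copied from the schema above. Tuples are used so that
# membership tests also work when a field holds an unhashable value.
TASK_KINDS = ("code", "review", "research", "plan", "ops", "approval_required", "failure")
STATUSES = ("completed", "partial", "failed", "approval_required")
ARTIFACT_KINDS = ("file", "command", "url", "finding", "patch", "note", "other")


def validate_envelope(envelope: Any) -> List[str]:
    """Return a list of rule violations; an empty list means the envelope passes.

    Hypothetical helper for illustration only.
    """
    if not isinstance(envelope, dict):
        return ["envelope must be a JSON object"]

    errors: List[str] = []
    if envelope.get("version") != "1":
        errors.append('version must be "1"')
    if envelope.get("executor") != "codex":
        errors.append('executor must be "codex"')
    if envelope.get("task_kind") not in TASK_KINDS:
        errors.append("task_kind is not an allowed value")
    if envelope.get("status") not in STATUSES:
        errors.append("status is not an allowed value")

    summary = envelope.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        errors.append("summary must be a non-empty string")

    artifacts = envelope.get("artifacts")
    if not isinstance(artifacts, list):
        errors.append("artifacts must be a list")
    else:
        for index, artifact in enumerate(artifacts):
            if not isinstance(artifact, dict) or artifact.get("kind") not in ARTIFACT_KINDS:
                errors.append(f"artifacts[{index}].kind is not an allowed value")

    if not isinstance(envelope.get("payload"), dict):
        errors.append("payload must be an object")

    # Approval rule: task_kind and status must both be approval_required, or neither.
    kind_is_approval = envelope.get("task_kind") == "approval_required"
    status_is_approval = envelope.get("status") == "approval_required"
    if kind_is_approval != status_is_approval:
        errors.append("approval_required must be set on both task_kind and status")

    return errors
```

Returning a list of violations instead of raising on the first one lets a caller report every problem in a single pass.
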
+
+## Task-Kind Guidance
+
+Choose `task_kind` based on the primary user intent:
+
+- `code`: implementation, refactor, bugfix, tests, generated files, patches
+- `review`: code review findings, regressions, risk analysis
+- `research`: investigation, repo mapping, fact gathering, comparisons
+- `plan`: execution plan, milestone breakdown, sequencing
+- `ops`: commands run, environment checks, service health, deployment/runtime work
+- `approval_required`: a blocked command or action needs approval before continuing
+- `failure`: the task failed before a useful result could be completed
+
+## Recommended Payload Shapes
+
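+Use the smallest structured payload that matches the task. The `code`, `review`, `research`, `plan`, and `ops` examples below show only the `payload` object; the `approval_required` and `failure` examples show the full envelope.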
+
+### `code`
+
+```json
+{
+  "files_changed": ["web/src/lib/store/studio-store.ts"],
+  "files_created": ["web/src/lib/studio-bridge-result.ts"],
+  "tests_run": [
+    { "command": "pytest tests/unit/test_studio_chat_telemetry.py -q", "status": "passed" }
+  ],
+  "checks": [
+    { "name": "eslint", "status": "passed" }
+  ],
+  "notes": ["Structured bridge results now attach without overwriting raw tool output."]
+}
+```
+
+### `review`
+
+```json
+{
+  "findings": [
+    {
+      "severity": "high",
+      "title": "Structured result is overwritten by plain text fallback",
+      "path": "web/src/components/studio/ReproductionLog.tsx",
+      "line": 1528,
+      "detail": "The bridge_result event replaces the raw tool_result instead of annotating it."
+    }
+  ],
+  "risk_summary": "1 blocking issue, 1 medium issue"
+}
+```
+
+### `research`
+
+```json
+{
+  "claims": [
+    {
+      "claim": "Claude can consume Codex bridge results directly through tool_result.",
+      "evidence": ["Observed worker tool_result returned to parent Claude session"]
+    }
+  ],
+  "sources": [
+    { "kind": "repo_file", "path": ".claude/agents/codex-worker.md" }
+  ]
+}
+```
+
+### `plan`
+
+```json
+{
+  "steps": [
+    "Normalize bridge results in the backend stream parser.",
+    "Patch chat store to merge bridge metadata onto raw tool results.",
+    "Render structured cards in Studio chat and keep Monitor as detailed view."
+  ],
+  "acceptance_criteria": [
+    "Claude approval blocks can be resumed from Studio.",
+    "All worker results use the same JSON envelope."
+  ]
+}
+```
+
+### `ops`
+
+```json
+{
+  "commands": [
+    { "command": "git branch --show-current", "status": "completed", "stdout_preview": "test/milestone-v1.2" }
+  ],
+  "checks": [
+    { "name": "backend", "status": "running" }
+  ]
+}
+```
+
+### `approval_required`
+
+```json
+{
+  "version": "1",
+  "executor": "codex",
+  "task_kind": "approval_required",
+  "status": "approval_required",
+  "summary": "Need approval to run a read-only git command.",
+  "artifacts": [
+    {
+      "kind": "command",
+      "label": "git branch",
+      "path": null,
+      "value": "git -C /home/master1/PaperBot branch --show-current"
+    }
+  ],
+  "payload": {
+    "command": "git -C /home/master1/PaperBot branch --show-current",
+    "reason": "Permission gate",
+    "resume_hint": {
+      "worker_agent_id": "replace-with-actual-agent-id-if-known"
+    }
+  }
+}
+```
+
+### `failure`
+
+```json
+{
+  "version": "1",
+  "executor": "codex",
+  "task_kind": "failure",
+  "status": "failed",
+  "summary": "Codex could not complete the task because the backend returned 500.",
+  "artifacts": [],
+  "payload": {
+    "reason_code": "backend_error",
+    "error": "HTTP 500 from /api/agent-board/tasks/dispatch",
+    "recommendation": "Retry after backend restart"
+  }
+}
+```
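
The `approval_required` and `failure` envelopes above share every top-level field, so they could be produced by small builder functions. This is a sketch only: `make_envelope`, `failure_envelope`, and `approval_envelope` are hypothetical names, and the field layout is taken from the two examples in this file.

```python
from typing import List, Optional


def make_envelope(task_kind: str, status: str, summary: str,
                  artifacts: Optional[List[dict]] = None,
                  payload: Optional[dict] = None) -> dict:
    """Build the common envelope; version and executor are fixed by the contract."""
    return {
        "version": "1",
        "executor": "codex",
        "task_kind": task_kind,
        "status": status,
        "summary": summary,
        "artifacts": artifacts if artifacts is not None else [],
        "payload": payload if payload is not None else {},
    }


def failure_envelope(summary: str, reason_code: str, error: str,
                     recommendation: Optional[str] = None) -> dict:
    """Build a failure envelope shaped like the `failure` example."""
    payload = {"reason_code": reason_code, "error": error}
    if recommendation is not None:
        payload["recommendation"] = recommendation
    return make_envelope("failure", "failed", summary, payload=payload)


def approval_envelope(summary: str, label: str, command: str, reason: str,
                      worker_agent_id: Optional[str] = None) -> dict:
    """Build an approval envelope shaped like the `approval_required` example."""
    artifact = {"kind": "command", "label": label, "path": None, "value": command}
    payload = {"command": command, "reason": reason}
    if worker_agent_id is not None:
        # resume_hint is included only when the worker agent id is known.
        payload["resume_hint"] = {"worker_agent_id": worker_agent_id}
    return make_envelope("approval_required", "approval_required", summary,
                         artifacts=[artifact], payload=payload)
```

Calling `failure_envelope` with the summary, `reason_code`, `error`, and `recommendation` strings from the `failure` example reproduces that example field for field.
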
+
+## Delegation Workflow
+
+### Step 1
+
+Confirm the referenced PaperBot task/session exists before dispatching.
+
+```bash
+SESSION_ID="<session_id>"
+curl -sS "http://localhost:8000/api/agent-board/sessions/${SESSION_ID}"  # -S still reports errors that -s alone would hide
+```
+
+If the session or task cannot be found, return the JSON envelope with `task_kind: "failure"` and `status: "failed"`.
+
+### Step 2
+
+Dispatch the task to Codex.
+
+```bash
+TASK_ID="<task_id>"
+curl -sS -X POST "http://localhost:8000/api/agent-board/tasks/${TASK_ID}/dispatch"  # quoted so the expanded URL stays one argument
+```
+
+If dispatch fails, return the JSON envelope with the failure details in `payload`.
+
+### Step 3
+
+Stream execution.
+
+```bash
+curl -sS -N "http://localhost:8000/api/agent-board/tasks/${TASK_ID}/execute"  # -N disables output buffering so streamed output appears as it arrives
+```
+
+Watch for completion, failure, or approval-needed states. Convert the observed outcome into the structured envelope.
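
The three steps share one pattern: call an endpoint, and on a non-2xx response stop and return a failure envelope. The sketch below shows that control flow under stated assumptions. The transport is injected as a hypothetical `fetch(method, path)` callable so the logic can be exercised without a running backend. `run_workflow` is an illustrative name, and `request_failed` is an invented reason code (only `backend_error` appears in this file). How the stream body maps to a final status is not specified here, so the sketch returns it unconverted.

```python
from typing import Callable, Optional, Tuple

# Hypothetical transport signature: fetch(method, path) -> (http_status, body_text).
Fetch = Callable[[str, str], Tuple[int, str]]


def failure_envelope(summary: str, error: str, http_status: int) -> dict:
    """Build a failure envelope in the format defined by the output contract."""
    # "backend_error" comes from the failure example; "request_failed" is an
    # assumed code for non-5xx failures such as a missing session.
    reason_code = "backend_error" if http_status >= 500 else "request_failed"
    return {
        "version": "1",
        "executor": "codex",
        "task_kind": "failure",
        "status": "failed",
        "summary": summary,
        "artifacts": [],
        "payload": {"reason_code": reason_code, "error": error},
    }


def run_workflow(session_id: str, task_id: str,
                 fetch: Fetch) -> Tuple[Optional[dict], Optional[str]]:
    """Run steps 1-3 in order.

    Returns (failure_envelope, None) at the first failed step,
    otherwise (None, stream_body) for later conversion into an envelope.
    """
    steps = [
        ("GET", f"/api/agent-board/sessions/{session_id}", "Session lookup failed"),
        ("POST", f"/api/agent-board/tasks/{task_id}/dispatch", "Dispatch failed"),
        ("GET", f"/api/agent-board/tasks/{task_id}/execute", "Execution stream failed"),
    ]
    body = ""
    for method, path, label in steps:
        status, body = fetch(method, path)
        if not 200 <= status < 300:
            envelope = failure_envelope(
                f"{label} with HTTP {status}.", f"HTTP {status} from {path}", status
            )
            return envelope, None
    return None, body
```

Injecting `fetch` keeps the sequencing testable; a real caller would pass a thin wrapper around its HTTP client.
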
+
+## Error Handling
+
+Known failure examples:
+
+- missing API key
+- task/session not found
+- worker timeout
+- repeated tool failures
+- backend 5xx
+- permission/approval gate
+
+In every case, do not switch formats. Return the same JSON envelope.
+
+## Final Rule
+
+Your final response must be JSON only.
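
A consumer can enforce the JSON-only rule by parsing the worker's entire final answer and rejecting anything else. This is a minimal sketch using only the standard library; `parse_worker_output` is a hypothetical name, not an existing PaperBot function.

```python
import json


def parse_worker_output(raw: str) -> dict:
    """Parse the worker's final answer, which must be exactly one JSON object.

    json.loads rejects leading or trailing prose (surrounding whitespace is
    allowed), so commentary such as "Here is the result" causes an error.
    """
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"worker output is not pure JSON: {exc.msg}") from exc
    if not isinstance(value, dict):
        raise ValueError("worker output must be a JSON object")
    return value
```
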
@@ -2,11 +2,11 @@
 
 ## What This Is
 
-PaperBot is a multi-agent research workflow framework for academic paper discovery, analysis, and reproduction. It provides a FastAPI backend with SSE streaming, a Next.js web dashboard, and a terminal CLI. The platform is evolving toward a Skill-Driven Architecture where PaperBot acts as a capability provider, exposing paper-specific tools via MCP and providing an agent orchestration dashboard for Claude Code and Codex.
+PaperBot is a multi-agent research workflow framework for academic paper discovery, analysis, and reproduction. It provides a FastAPI backend with SSE streaming, a Next.js web dashboard, and a terminal CLI. The platform follows a Skill-Driven Architecture where PaperBot acts as a capability provider, exposing paper-specific tools via MCP. The web dashboard (DeepCode) serves as an agent-agnostic visualization and control surface — proxying chat to whichever code agent the user configures (Claude Code, Codex, OpenCode, etc.) and displaying real-time agent activity, team decomposition, and file changes.
 
 ## Core Value
 
-Paper-specific capability layer: understanding, reproduction, verification, and context — surfaced as standard MCP tools that any agent can consume, with a visual dashboard for agent orchestration.
+Paper-specific capability layer: understanding, reproduction, verification, and context — surfaced as standard MCP tools that any agent can consume, with an agent-agnostic dashboard that visualizes and controls whatever code agent the user runs.
 
 ## Requirements
 
@@ -31,46 +31,68 @@ Paper-specific capability layer: understanding, reproduction, verification, and
 
 ### Active
 
-<!-- Current scope: v1.1 Agent Orchestration Dashboard + v2.0 PG Migration -->
+<!-- Current scope: v1.1 Agent Orchestration Dashboard + v1.2 DeepCode Agent Dashboard + v2.0 PG Migration -->
 
 - [ ] Codex subagent bridge for Claude Code (custom agent definition)
 - [ ] Agent orchestration dashboard (replaces studio page)
 - [ ] Agent event logging via MCP (lifecycle, tool calls, file changes, task status)
 - [ ] Three-panel IDE layout (tasks | agent activity | files)
 - [ ] Live SSE streaming for real-time agent activity
 - [ ] Paper2Code overflow delegation workflow (Claude Code → Codex)
+- [ ] Agent-agnostic proxy layer (chat proxies to user-configured agent: Claude Code, Codex, OpenCode)
+- [ ] Multi-agent adapter layer (unified interface for different code agents)
+- [ ] Agent activity discovery (hybrid: agent pushes events + dashboard discovers independently)
+- [ ] Team visualization (agent-initiated team decomposition reflected in dashboard)
+- [ ] Dashboard control surface (send commands/tasks to agents from web UI)
 - [ ] PostgreSQL migration (replace SQLite)
 - [ ] Async data layer (AsyncSession + asyncpg)
 - [ ] Systematic data model refactoring
 - [ ] PG-native features (tsvector, JSONB)
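
The hybrid activity discovery item in the list above implies that the same event can arrive twice: once pushed by the agent and once found by the dashboard. One possible way to reconcile the two sources is sketched below. It assumes events are dicts keyed by the `run_id` and `span_id` fields that `AgentEventEnvelope` already carries; the `ts` ordering field and the name `merge_activity` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def merge_activity(pushed: Iterable[dict], discovered: Iterable[dict]) -> List[dict]:
    """Merge both sources, keeping one event per (run_id, span_id).

    Pushed events win on conflict, on the assumption that the agent's own
    structured report is richer than what the dashboard can infer.
    """
    merged: Dict[Tuple[str, str], dict] = {}
    for event in discovered:
        merged[(event["run_id"], event["span_id"])] = event
    for event in pushed:
        # Overwrites a discovered duplicate that has the same key.
        merged[(event["run_id"], event["span_id"])] = event
    # "ts" is an assumed timestamp field used only to order the result.
    return sorted(merged.values(), key=lambda event: event["ts"])
```
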
 
 ### Out of Scope
 
-- Custom agent orchestration runtime — host agents (Claude Code) own orchestration
-- Per-host adapters — one MCP surface serves all
+- Custom agent orchestration runtime — host agents own orchestration, PaperBot visualizes
+- Building any code agent (Claude Code, Codex, OpenCode) — uses existing tools
 - Business logic duplication — tools must reuse existing services
-- Building Codex itself — uses existing Codex CLI
+- Hardcoded agent pipeline logic — agent decides team composition and delegation
+- Per-agent custom UI — one unified dashboard serves all agents
 
 ## Context
 
 - Architecture pivot from AgentSwarm to Skill-Driven Architecture (2026-03-13)
-- Existing `codex_dispatcher.py` and `claude_commander.py` in infrastructure/swarm/
+- Further pivot: DeepCode as agent-agnostic dashboard, not Claude Code-specific (2026-03-15)
+- Problem identified: chat mode split between Claude Code CLI connection vs direct API Codex calls — needs unification
+- Existing `codex_dispatcher.py` and `claude_commander.py` in infrastructure/swarm/ — to be replaced by unified adapter
 - Existing `AgentEventEnvelope` with run_id/trace_id/span_id in application/collaboration/
 - Studio page exists with Monaco editor and XTerm terminal
 - @xyflow/react already in web dashboard for DAG visualization
-- MCP server (v1.0 milestone) is prerequisite — provides tool surface for agent integration
+- MCP server (v1.0 milestone) provides tool surface for agent integration
+- v1.1 EventBus + SSE foundation (phases 7-8) partially built
 - Dev branch synced to origin/dev at 2e5173d (2026-03-14)
 - Current DB: SQLite with 46 models, sync Session, FTS5 virtual tables, optional sqlite-vec
 
 ## Constraints
 
 - **MCP prerequisite**: v1.0 MCP server must be functional before agent orchestration
 - **Reuse**: Event logging must extend existing AgentEventEnvelope, not create parallel system
-- **Claude Code bridge**: Codex integration is a Claude Code agent definition, not PaperBot server code
+- **Agent-agnostic**: Dashboard must work with any code agent, not hardcode Claude Code or Codex specifics
+- **No orchestration logic**: PaperBot does NOT decompose tasks — the host agent does; PaperBot visualizes
 - **Studio integration**: Dashboard integrates with existing Monaco/XTerm, not replaces them
 - **Transport**: SSE for live updates (existing infrastructure)
 
-## Current Milestone: v1.1 Agent Orchestration Dashboard
+## Current Milestone: v1.2 DeepCode Agent Dashboard
+
+**Goal:** Unify the agent interaction model into a single agent-agnostic architecture where PaperBot's web UI (DeepCode) proxies chat to the user's chosen code agent, visualizes agent activity (teams, tasks, files) in real-time, and provides control commands — without hardcoding orchestration logic.
+
+**Target features:**
+- Agent-agnostic proxy layer (chat → Claude Code / Codex / OpenCode / etc.)
+- Multi-agent adapter layer (unified interface abstracting agent-specific APIs/CLIs)
+- Hybrid activity discovery (agent pushes events via MCP + dashboard discovers independently)
+- Team visualization (agent-initiated team decomposition rendered in dashboard)
+- Dashboard control surface (send commands/tasks back to agents)
+- Real-time agent activity stream (builds on v1.1 EventBus/SSE)
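
The proxy and adapter layers listed above could take roughly the following shape. This is a sketch under assumptions, not the planned design: `AgentAdapter`, `send_chat`, `AdapterRegistry`, and `EchoAdapter` are all hypothetical names, and a real adapter would wrap an agent's CLI or API instead of echoing.

```python
from typing import Dict, Iterator, Protocol


class AgentAdapter(Protocol):
    """Hypothetical unified interface that each code-agent adapter would satisfy."""

    name: str

    def send_chat(self, message: str) -> Iterator[str]:
        """Send one chat message and yield response chunks."""
        ...


class EchoAdapter:
    """Stand-in adapter used only to exercise the registry."""

    name = "echo"

    def send_chat(self, message: str) -> Iterator[str]:
        yield f"echo: {message}"


class AdapterRegistry:
    """Routes chat to whichever adapter the user configured, by name."""

    def __init__(self) -> None:
        self._adapters: Dict[str, AgentAdapter] = {}

    def register(self, adapter: AgentAdapter) -> None:
        self._adapters[adapter.name] = adapter

    def proxy_chat(self, agent_name: str, message: str) -> Iterator[str]:
        try:
            adapter = self._adapters[agent_name]
        except KeyError:
            raise ValueError(f"no adapter registered for agent {agent_name!r}") from None
        # The registry holds no agent-specific logic; it only forwards.
        return adapter.send_chat(message)
```

Keeping the registry free of agent-specific branches is what would let the dashboard stay agent-agnostic: supporting another agent means registering one more adapter.
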
+
+## Previous Milestone: v1.1 Agent Orchestration Dashboard
 
 **Goal:** Build a Codex subagent bridge for Claude Code and a real-time agent orchestration dashboard in PaperBot's web UI, enabling the Paper2Code overflow delegation workflow.
 
@@ -109,5 +131,9 @@ Paper-specific capability layer: understanding, reproduction, verification, and
 | Systematic model refactoring | 46 models accumulated organically; normalize, add constraints, remove redundancy | — Pending |
 | Docker PG for local dev | Standard dev setup, matches production topology | — Pending |
+| DeepCode = agent-agnostic dashboard | Chat split (CLI vs API) was wrong; unify into proxy model where PaperBot doesn't care which agent | — Pending |
+| Agent-initiated team decomposition | Agent decides how to split work; dashboard visualizes, doesn't orchestrate | — Pending |
+| Hybrid activity discovery | Agent pushes structured events + dashboard can discover independently | — Pending |
+| Dashboard + control (not pure display) | Users need to send commands/tasks, not just watch | — Pending |
 
 ---
-*Last updated: 2026-03-14 after v2.0 milestone added*
+*Last updated: 2026-03-15 after v1.2 milestone added*