CodeForge is a Go HTTP server that orchestrates AI-powered code work over git repositories. A task is a session over a repo — not a one-shot job. Each task tracks its workspace, conversation history (multi-turn), review results, and PR state.
The system receives task requests via REST API, clones repositories, runs AI CLI tools against them, streams progress in real time, and supports human-in-the-loop actions: review, instruct, create PR, post review comments. It handles automated PR reviews via webhooks, multiple CLI runners (Claude Code, Codex), and a tool system for extending AI capabilities.
Client (ScopeBot / curl)
│
├── POST /tasks ──────▶ ┌──────────────┐ ┌──────────────┐
│ │ HTTP Server │────▶│ Redis Queue │
│ │ (Chi) │ │ (BLPOP) │
│ └──────┬───────┘ └──────┬───────┘
│ │ │
│ │ ┌─────▼───────┐
└── GET /tasks/{id}/stream ──▶ │ │ Worker Pool │
│ │ (N workers) │
┌──────────────┘ └──────┬──────┘
▼ │
┌──────────────┐ ┌───────────────┼───────────────┐
│ SSE Handler │◀── Pub/Sub ─┤ │ │
│ (stream.go) │ ▼ ▼ ▼
└──────────────┘ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Git Clone│ │ CLI Run │ │ Webhook │
│ │ │(Claude/ │ │ Callback │
└──────────┘ │ Codex) │ └──────────┘
└──────────┘
- Chi router with middleware (auth, logging, rate limiting, metrics, tracing)
- Handlers for tasks, keys, MCP servers, tools, workspaces, workflows, and SSE streams
- Swagger UI at `/api/docs` with embedded OpenAPI spec
- Prometheus `/metrics` and health endpoints (no auth required)
- SSE stream endpoint bypasses `otelhttp` and request timeout middleware (see Streaming below)
- CRUD operations on task state stored in Redis hashes
- State machine with validated transitions (see Task Lifecycle below)
- FIFO session queue via `RPUSH`/`BLPOP`
- Iteration tracking for multi-turn conversations
- PR service for commit/push/PR creation flow
- Review lifecycle methods (`StartReview`, `CompleteReview`)
- Configurable concurrency (N goroutines)
- Each worker polls the Redis queue with `BLPOP` (5s timeout)
- Per-task cancellable contexts for cancel support
- Executor orchestrates: clone -> run CLI -> diff -> report
- `Runner` interface for pluggable AI tools
- Claude Code runner: `--output-format stream-json` parsing, supports MaxTurns and MaxBudgetUSD
- Codex runner: JSONL stream parsing (`--json --sandbox danger-full-access`), `CODEX_API_KEY` env var. Uses the `danger-full-access` sandbox because Codex's Landlock sandbox does not work inside Docker (missing kernel support / capabilities); the Docker container itself provides isolation.
- Registry maps CLI names to Runner implementations
- Selected per-task via the `config.cli` field (default: `claude-code`)
- Result extraction: prefers the `type: "result"` event text; falls back to the last `type: "assistant"` message text
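The interface-plus-registry pattern might look like the sketch below. The method set of `Runner` and the `fakeRunner` are assumptions for illustration; the real interface in `internal/tool/runner` likely carries streaming and option parameters.

```go
package main

import (
	"context"
	"fmt"
)

// Runner is a sketch of the pluggable AI-CLI interface; the real method set
// likely differs (streaming events, run options, budgets, etc.).
type Runner interface {
	Name() string
	Run(ctx context.Context, workdir, prompt string) (result string, err error)
}

// Registry maps CLI names (the task's config.cli field) to Runner implementations.
type Registry struct {
	runners    map[string]Runner
	defaultCLI string
}

func NewRegistry(defaultCLI string) *Registry {
	return &Registry{runners: map[string]Runner{}, defaultCLI: defaultCLI}
}

func (r *Registry) Register(rn Runner) { r.runners[rn.Name()] = rn }

// Resolve falls back to the default runner ("claude-code") when cli is empty.
func (r *Registry) Resolve(cli string) (Runner, error) {
	if cli == "" {
		cli = r.defaultCLI
	}
	rn, ok := r.runners[cli]
	if !ok {
		return nil, fmt.Errorf("unknown CLI runner: %q", cli)
	}
	return rn, nil
}

// fakeRunner is a stand-in for the Claude Code / Codex runners.
type fakeRunner struct{ name string }

func (f fakeRunner) Name() string { return f.name }
func (f fakeRunner) Run(ctx context.Context, workdir, prompt string) (string, error) {
	return "ran " + f.name, nil
}

func main() {
	reg := NewRegistry("claude-code")
	reg.Register(fakeRunner{"claude-code"})
	reg.Register(fakeRunner{"codex"})

	rn, _ := reg.Resolve("") // empty config.cli falls back to the default
	fmt.Println(rn.Name())
}
```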
- Converts CLI-specific events into a common stream format
- Normalized event types: `thinking`, `text`, `tool_use`, `tool_result`, `result`, `error`, `system`
- Both Claude Code and Codex output is normalized before being sent to SSE clients
- FE consumers only need to handle normalized event types
- Claude Code normalizer: maps `assistant` blocks (thinking/text), `tool_use`/`tool_result`, `result`
- Codex normalizer: maps `item.completed` events: `agent_message` → `text`, `function_call` → `tool_use`, `function_call_output` → `tool_result`, `command_execution` → `tool_result`, `turn.completed` → `result`
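The Codex mapping above reduces to a small pure function. This sketch only maps type labels; the real normalizer in `internal/worker` also carries event payloads, and the drop-unknown behavior is an assumption.

```go
package main

import "fmt"

// normalizeCodex maps Codex event labels onto the normalized event vocabulary
// listed above. Dropping unknown labels (ok == false) is an assumption.
func normalizeCodex(event string) (string, bool) {
	switch event {
	case "agent_message":
		return "text", true
	case "function_call":
		return "tool_use", true
	case "function_call_output", "command_execution":
		return "tool_result", true
	case "turn.completed":
		return "result", true
	default:
		return "", false
	}
}

func main() {
	for _, e := range []string{"agent_message", "function_call", "command_execution", "turn.completed"} {
		norm, _ := normalizeCodex(e)
		fmt.Printf("%s -> %s\n", e, norm)
	}
}
```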
Worker side (`internal/worker/stream.go`):
- Events published to Redis Pub/Sub channels (`task:{id}:stream`)
- Dual-write to a history list (`task:{id}:history`) for reconnection
- Event types: system, git, cli, stream, result
- Done signal on a separate channel (`task:{id}:done`)
SSE handler (`internal/server/handlers/stream.go`):
- `GET /api/v1/tasks/{id}/stream` opens a long-lived SSE connection
- Subscribes to Redis Pub/Sub before reading history to avoid missed events
- Replays the full history, then streams live events
- Named events: `connected`, `done`, `timeout`; keepalive comments every 15s
- For terminal tasks (completed/failed/pr_created): replays history and sends `done` immediately
- Uses `http.ResponseController` for per-write deadlines (30s) instead of a global `WriteTimeout`
- Auto-closes after 10 minutes
Middleware considerations for SSE:
- The SSE endpoint is excluded from the `chimw.Timeout` middleware group (long-lived connection)
- `otelhttp` wraps `http.ResponseWriter` without `http.Flusher` support, so SSE requests bypass `otelhttp` via a path-suffix check in `server.go`
- The PrometheusMetrics middleware's `responseWriter` implements `Flush()` (delegates to the underlying writer) and `Unwrap()` (for `http.ResponseController` compatibility)
- Global `http.Server.WriteTimeout` is set to `0` (disabled); the SSE handler manages its own deadlines
- Four task types: `code` (default), `plan`, `review`, `pr_review`, each with different behavior
- code: no template wrapping, user prompt passed directly to the CLI
- plan: wraps the user prompt in the `plan.md` template, which instructs the AI to analyze the repo and create an implementation plan without modifying files
- review: wraps the user prompt in the `review.md` template, which instructs the AI to review code quality with structured JSON output, read-only
- pr_review: wraps the user prompt in the `pr_review.md` template, which instructs the AI to review a specific PR/MR diff via `git diff origin/{base}...HEAD` and output structured JSON
- Templates rendered via Go `text/template` with `embed.FS`
- Template rendering happens in the executor (`buildPrompt()`) at runtime, NOT at task creation time
- `Task.Prompt` always stores the original user prompt (displayed in the FE); the template is applied only when running the CLI
- The `GET /api/v1/task-types` endpoint returns available types for FE toggle buttons
- User-triggered action, NOT an automatic step: the user calls `POST /tasks/:id/review`
- Async via the worker pool: `POST /review` returns 202, the review is enqueued to the Redis FIFO queue and executed by a worker (same path as instruct)
- The executor's `executeReview()` handles: resolve workspace → build `code_review` prompt → run CLI → parse output → store `ReviewResult`
- Multi-strategy parser: direct JSON, markdown code block, heuristic brace matching, fallback
- Review result stored on the `Task.ReviewResult` field in Redis
- Full SSE streaming during review (events: `review_started`, CLI output, `review_completed`)
- Cancel support, worker pool concurrency control, configurable timeout; no HTTP timeout dependency
- Supports a different CLI for review (e.g. Codex reviews Claude Code's output)
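The multi-strategy parse chain could look like the sketch below. The `ReviewResult` fields and the exact heuristics are assumptions; only the four-strategy order comes from this document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// ReviewResult is a minimal stand-in for the real review result type.
type ReviewResult struct {
	Verdict string `json:"verdict"`
	Summary string `json:"summary"`
}

var codeBlock = regexp.MustCompile("(?s)```(?:json)?\\s*(\\{.*?\\})\\s*```")

// parseReview applies the chain described above: 1) the whole output as JSON,
// 2) a fenced code block, 3) the first balanced {...} span (heuristic: does
// not account for braces inside strings), 4) fallback to raw text.
func parseReview(out string) ReviewResult {
	var r ReviewResult
	// Strategy 1: direct JSON.
	if json.Unmarshal([]byte(strings.TrimSpace(out)), &r) == nil && r.Verdict != "" {
		return r
	}
	// Strategy 2: markdown code block.
	if m := codeBlock.FindStringSubmatch(out); m != nil {
		if json.Unmarshal([]byte(m[1]), &r) == nil && r.Verdict != "" {
			return r
		}
	}
	// Strategy 3: heuristic brace matching on the first balanced object.
	if start := strings.Index(out, "{"); start >= 0 {
		depth := 0
		for i := start; i < len(out); i++ {
			switch out[i] {
			case '{':
				depth++
			case '}':
				depth--
				if depth == 0 {
					if json.Unmarshal([]byte(out[start:i+1]), &r) == nil && r.Verdict != "" {
						return r
					}
					i = len(out) // stop scanning, fall through to fallback
				}
			}
		}
	}
	// Strategy 4: fallback keeps the raw text as the summary (assumption).
	return ReviewResult{Verdict: "unknown", Summary: strings.TrimSpace(out)}
}

func main() {
	out := "Here is my review:\n```json\n{\"verdict\": \"approve\", \"summary\": \"LGTM\"}\n```\n"
	fmt.Printf("%+v\n", parseReview(out))
}
```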
- `pr_review` is a task type, NOT a separate system; it reuses the entire task lifecycle
- API trigger: `POST /api/v1/tasks` with `task_type: "pr_review"`, `config.pr_number`, source/target branches
- Webhook trigger: GitHub/GitLab webhooks auto-create `pr_review` tasks (endpoints outside Bearer auth, verified via webhook secrets)
- Clone strategy: clones the target branch (non-shallow), fetches the PR ref via `git fetch origin pull/{N}/head:pr-{N}`, checks out a local branch; handles fork PRs automatically
- Completion: the executor parses `ReviewResult` from the CLI output and stores it on the task; if `output_mode: "post_comments"`, comments are automatically posted to GitHub/GitLab
- Comment posting: `POST /tasks/:id/post-review` endpoint for manual posting; uses the GitHub Pull Request Reviews API (line-level comments, max 20) or the GitLab Discussions API (position-based comments)
- Comment formatting: `internal/review/format.go` handles severity labels (CRITICAL, MAJOR, MINOR, SUGGESTION) and the markdown summary body
- GitHub review posting: `internal/tool/git/github_review.go` handles verdict mapping (approve → APPROVE, request_changes → REQUEST_CHANGES)
- GitLab review posting: `internal/tool/git/gitlab_review.go` uses MR version SHAs for position-based comments, with fallback to summary-only
- GitHub: HMAC-SHA256 signature verification via `X-Hub-Signature-256`, handles `pull_request` events (opened, synchronize, reopened)
- GitLab: constant-time `X-Gitlab-Token` comparison, handles `Merge Request Hook` events (open, update, reopen)
- Draft PR/MR filtering: skipped unless `code_review.review_drafts` is true
- Configuration: `code_review.webhook_secrets.github`, `code_review.webhook_secrets.gitlab`, `code_review.default_key_name`, `code_review.default_cli`
- Routes registered outside the Bearer auth group: `POST /api/v1/webhooks/github`, `POST /api/v1/webhooks/gitlab`
- High-level abstraction over MCP servers — users request tools by name, system handles MCP wiring
- Registry — SQLite-backed storage with scope (global / project-level)
- Catalog — 5 built-in tools: sentry (HTTP), jira, git, github, playwright (stdio)
- Resolver — lookup chain: project scope -> global scope -> built-in catalog
- Bridge — converts resolved tools to MCP server configs for `.mcp.json` generation
- Validator — checks required config fields before task execution
- Per-task tool requests via the `TaskConfig.Tools` field
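The resolver's lookup chain is a simple ordered scan over the three scopes. The `Tool` struct and field names below are hypothetical; only the project → global → catalog precedence comes from this document.

```go
package main

import "fmt"

// Tool is a minimal stand-in for a resolved tool definition.
type Tool struct {
	Name   string
	Source string // "project", "global", or "catalog"
}

// Resolver sketches the lookup chain: project scope first, then global scope,
// then the built-in catalog (sentry, jira, git, github, playwright).
type Resolver struct {
	project map[string]Tool
	global  map[string]Tool
	catalog map[string]Tool
}

func (r *Resolver) Resolve(name string) (Tool, bool) {
	for _, scope := range []map[string]Tool{r.project, r.global, r.catalog} {
		if t, ok := scope[name]; ok {
			return t, true
		}
	}
	return Tool{}, false
}

func main() {
	r := &Resolver{
		project: map[string]Tool{"jira": {Name: "jira", Source: "project"}},
		global:  map[string]Tool{"jira": {Name: "jira", Source: "global"}, "sentry": {Name: "sentry", Source: "global"}},
		catalog: map[string]Tool{"git": {Name: "git", Source: "catalog"}},
	}
	t, _ := r.Resolve("jira")
	fmt.Println(t.Source) // project: project scope shadows global
	t, _ = r.Resolve("git")
	fmt.Println(t.Source) // catalog
}
```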
- Multi-step workflow orchestrator consuming from a Redis FIFO queue (`BLPOP queue:workflows`)
- Three step types:
- fetch — HTTP request to external APIs (e.g., Sentry, GitHub Issues) with JSONPath output extraction
- task — creates and waits for a CodeForge task (clone + AI CLI run)
- action — built-in actions (e.g., `create_pr`, `notify`) that operate on previous step results
- Go `text/template` engine for step configuration: `{{.Params.key}}`, `{{.Steps.step_name.field}}`
- Built-in workflows: `sentry-fixer`, `github-issue-fixer`, `gitlab-issue-fixer`, `code-review`, `knowledge-update`
- Workflow definitions stored in SQLite (user-created + built-in, seeded on startup)
- Run state tracked in SQLite with per-step status records
- Streaming via Redis Pub/Sub (`workflow:{runID}:stream`) with history replay, same SSE pattern as tasks
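The `{{.Params.key}}` / `{{.Steps.step_name.field}}` interpolation can be sketched with plain `text/template`. The `StepContext` shape (string maps) and the strict missing-key behavior are assumptions; only the two template namespaces come from this document.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// StepContext mirrors the two namespaces described above: run parameters and
// previous step outputs. The concrete map shapes are illustrative.
type StepContext struct {
	Params map[string]string
	Steps  map[string]map[string]string
}

// renderStepConfig interpolates one step-config value; missingkey=error makes
// a reference to an unknown key fail instead of rendering "<no value>".
func renderStepConfig(raw string, ctx StepContext) (string, error) {
	tmpl, err := template.New("step").Option("missingkey=error").Parse(raw)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	err = tmpl.Execute(&b, ctx)
	return b.String(), err
}

func main() {
	ctx := StepContext{
		Params: map[string]string{"repo": "github.com/acme/api"},
		Steps: map[string]map[string]string{
			"fetch_issue": {"title": "NPE in login handler"},
		},
	}
	out, err := renderStepConfig(
		"Fix {{.Steps.fetch_issue.title}} in {{.Params.repo}}", ctx)
	fmt.Println(out, err)
}
```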
Workflow Run Lifecycle:
pending ──▶ running ──▶ completed
│ │
▼ ▼
(queue) failed
- Embedded SQLite database for persistent storage of workflow definitions, workflow runs, keys, tools, and MCP server configs
- Auto-migration on startup
- Default path: `/data/codeforge.db` (configurable via `CODEFORGE_SQLITE__PATH`)
- Clone with `GIT_ASKPASS` for token auth (the token is never stored in the URL or `.git/config`)
- Provider detection from the URL (GitHub, GitLab, custom domains)
- PR creation via GitHub/GitLab APIs
- Branch management, diff calculation
- Global and per-project MCP server registration
- Generates `.mcp.json`, consumed by Claude Code at runtime
- Server configs stored in SQLite
- AES-256-GCM encryption for sensitive fields in Redis
- Key registry with 3-tier resolution: inline token -> registry lookup -> env var
- HMAC-SHA256 webhook signatures
- Path traversal guards on workspace deletion
All keys use a configurable prefix (default: `codeforge:`).
| Key Pattern | Type | Description |
|---|---|---|
| `task:{id}` | Hash | Task state (all fields) |
| `task:{id}:stream` | Pub/Sub | Live event stream |
| `task:{id}:history` | List | Event history for reconnection |
| `task:{id}:done` | Pub/Sub | Completion signal |
| `task:{id}:iterations` | List | Iteration records (JSON) |
| `queue:sessions` | List | FIFO session queue (RPUSH/BLPOP) |
| `key:{name}` | Hash | Encrypted access key |
| `keys:index` | Set | Index of all key names |
| `mcp:global:{name}` | Hash | Global MCP server config |
| `mcp:global:index` | Set | Index of global MCP servers |
| `mcp:project:{id}:{name}` | Hash | Per-project MCP server config |
| `mcp:project:{id}:index` | Set | Per-project MCP server index |
| `workspace:{id}` | Hash | Workspace metadata |
| `workspaces:index` | Set | Index of all workspaces |
| `webhook:dedup:{repo}:{pr}:{sha}` | String | Webhook dedup (SETNX + TTL) |
| `ratelimit:{token_hash}` | Sorted Set | Sliding window rate limit |
| `input:sessions` | List | Redis-based session input channel |
| `queue:workflows` | List | FIFO workflow run queue (RPUSH/BLPOP) |
| `workflow:{runID}:stream` | Pub/Sub | Live workflow event stream |
| `workflow:{runID}:history` | List | Workflow event history for reconnection |
| `workflow:{runID}:done` | Pub/Sub | Workflow run completion signal |
| `workflow:{runID}:context` | Hash | Step outputs for template interpolation |
pending ──▶ cloning ──▶ running ──▶ completed ──▶ creating_pr ──▶ pr_created
│ │ │ │ ▲ │
│ │ │ │ │ │
│ │ │ ▼ │ │
│ │ │ reviewing │
│ │ │ │
│ │ │ awaiting_instruction ◀─────────────┘
│ │ │ │ ▲
│ │ │ │ │
│ │ │ ▼ │
│ │ │ running (iteration N)
│ │ │
▼ ▼ ▼
failed failed failed
1. Create → `POST /tasks` → pending → queue (RPUSH)
2. Execute → worker `BLPOP` → cloning → running → completed
3. Review → `POST /tasks/:id/review` → 202 → reviewing → queue → worker → completed (with ReviewResult)
4. Instruct → `POST /tasks/:id/instruct` → awaiting_instruction → queue → worker → completed
5. Create PR → `POST /tasks/:id/create-pr` → creating_pr → pr_created
6. Cancel → `POST /tasks/:id/cancel` → context cancel → failed
Steps 3-5 are repeatable. All queue operations go through Redis FIFO (RPUSH + BLPOP). Review and instruct share the same worker pool — no separate execution path.
- pending: Task created, queued for processing
- cloning: Git repository being cloned
- running: AI CLI executing the prompt
- completed: CLI finished, results available — user can now review, instruct, or create PR
- reviewing: Code review in progress (enqueued via `POST /tasks/:id/review`, runs in the worker pool)
- failed: Terminal state (clone/run/timeout/cancel failure)
- awaiting_instruction: Waiting for a follow-up prompt (after `POST /tasks/:id/instruct`)
- creating_pr: PR/MR being created
- pr_created: PR/MR created successfully
| From | To |
|---|---|
| pending | cloning, failed |
| cloning | running, failed |
| running | completed, failed |
| completed | awaiting_instruction, creating_pr, reviewing |
| reviewing | completed, failed |
| awaiting_instruction | running, reviewing, failed |
| creating_pr | pr_created, failed |
| pr_created | awaiting_instruction, completed |
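The transition table above maps directly onto a lookup-based guard. A minimal sketch, encoding only the pairs listed in the table; the real state machine in `internal/task` presumably wraps this in typed states and error values:

```go
package main

import "fmt"

// transitions encodes the From/To table above.
var transitions = map[string][]string{
	"pending":              {"cloning", "failed"},
	"cloning":              {"running", "failed"},
	"running":              {"completed", "failed"},
	"completed":            {"awaiting_instruction", "creating_pr", "reviewing"},
	"reviewing":            {"completed", "failed"},
	"awaiting_instruction": {"running", "reviewing", "failed"},
	"creating_pr":          {"pr_created", "failed"},
	"pr_created":           {"awaiting_instruction", "completed"},
}

// CanTransition reports whether from -> to appears in the table; states with
// no entry (like failed) allow nothing, making them terminal.
func CanTransition(from, to string) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition("completed", "reviewing")) // true
	fmt.Println(CanTransition("pending", "running"))     // false: must clone first
	fmt.Println(CanTransition("failed", "running"))      // false: failed is terminal
}
```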
- `codeforge_tasks_total` (counter) - tasks by status
- `codeforge_tasks_duration_seconds` (histogram) - execution time
- `codeforge_tasks_in_progress` (gauge) - active tasks
- `codeforge_queue_depth` (gauge) - queue size
- `codeforge_workers_active`/`codeforge_workers_total` (gauge) - worker utilization
- `codeforge_http_requests_total` (counter) - HTTP requests
- `codeforge_http_request_duration_seconds` (histogram) - HTTP latency
- `codeforge_webhook_deliveries_total` (counter) - webhook outcomes
- `codeforge_review_parse_failures_total` (counter) - review output parse failures
- Spans: `task.execute`, `task.clone`, `task.run`
- Trace ID propagated through the task lifecycle and webhook headers
- Configurable sampling rate, OTLP HTTP export
- HTTP instrumentation via `otelhttp`
cmd/codeforge/ # Application entrypoint
internal/
apperror/ # Application error types
config/ # Configuration loading (koanf)
crypto/ # AES-256-GCM encryption
database/ # SQLite wrapper + migrations
keys/ # Access key registry + resolver
logger/ # Structured logging (slog)
metrics/ # Prometheus metric definitions
prompt/ # Prompt templates (embed FS, task types: code, plan, review, pr_review)
redisclient/ # Redis client wrapper
review/ # Code review types, output parser, comment formatting
server/ # HTTP server + handlers + middleware
handlers/ # Request handlers (tasks, webhook receiver, stream, etc.)
task/ # Task model, service, state machine
tool/ # Tool subsystem namespace (low-level)
git/ # Git operations (clone, branch, PR, review posting)
runner/ # CLI runner interface + implementations (Claude, Codex)
mcp/ # MCP server registry + installer
tools/ # Tool system (high-level: catalog, registry, resolver, bridge)
tracing/ # OpenTelemetry setup
webhook/ # Webhook sender with HMAC + retries
worker/ # Worker pool, executor, streamer, normalizer
workflow/ # Workflow orchestrator, step executors, templates
workspace/ # Workspace manager + cleanup
api/ # OpenAPI specification
deployments/ # Docker, docker-compose files
tests/ # Integration tests, mock CLI