A CLI tool that runs Claude Code in an autonomous loop to implement spec-driven tasks. Give it a spec and a plan, and it works through each task one commit at a time until everything is done.
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│    Spec     │     │    Plan     │     │   Claude    │
│  (what to   │ ──▶ │ (checklist  │ ──▶ │    Code     │ ──┐
│   build)    │     │  of tasks)  │     │             │   │
└─────────────┘     └─────────────┘     └─────────────┘   │
                                                          │
       ┌──────────────────────────────────────────────────┘
       │
       ▼
┌─────────────────────────────────────────────────────────┐
│  Loop iteration:                                        │
│  1. Claude reads spec + plan                            │
│  2. Picks ONE unchecked task                            │
│  3. Implements it                                       │
│  4. Marks task [x] in plan                              │
│  5. Commits with gritty                                 │
│  6. If all done → outputs <promise>COMPLETE</promise>   │
│     Otherwise → next iteration                          │
└─────────────────────────────────────────────────────────┘
The loop continues until Claude outputs <promise>COMPLETE</promise> or hits the iteration limit.
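The control flow described above can be sketched in a few lines of POSIX shell. This is a minimal illustration, not the code in bin/loop: `run_loop` and the agent command passed to it are hypothetical stand-ins, and returning a non-zero status when the iteration limit is reached is an assumption.

```shell
#!/bin/sh
# Minimal sketch of the loop's control flow (not the actual bin/loop code).
TOKEN='<promise>COMPLETE</promise>'

# run_loop MAX_ITERATIONS AGENT_CMD [ARGS...]
# Calls the agent command once per iteration and stops when the last
# non-empty line of its output is the completion token.
run_loop() (
  max="$1"; shift
  i=1
  while [ "$i" -le "$max" ]; do
    output="$("$@")"
    last="$(printf '%s\n' "$output" | grep -v '^[[:space:]]*$' | tail -n 1)"
    if [ "$last" = "$TOKEN" ]; then
      echo "complete after $i iteration(s)"
      exit 0
    fi
    i=$((i + 1))
  done
  echo "iteration limit ($max) reached"
  exit 1   # assumed: non-zero status when the limit is hit
)
```

The function body runs in a subshell, so `exit` only ends the sketch's loop, not the calling shell.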
git clone git@github.com:ohitslaurence/agent-loop.git ~/dev/personal/agent-loop
cd ~/dev/personal/agent-loop
./install.sh
This symlinks loop and loop-analyze to ~/.local/bin/.
For system-wide install: ./install.sh --global (uses /usr/local/bin/, requires sudo).
| Dependency | Required | Purpose |
|---|---|---|
| Claude Code CLI | Yes | The AI agent that does the work |
| gritty | Yes* | AI-powered git commits (referenced in default prompt) |
| gum | No | Interactive spec picker, styled terminal output |
*You can use a custom prompt that doesn't require gritty.
Start the daemon once, then create runs with loopctl:
# Start daemon
loopd
# In a workspace (uses git root or cwd)
loopctl run --pick
# or
loopctl run specs/my-feature.md specs/planning/my-feature-plan.md
# Preview the prompt without running
loopctl prompt --pick
# Follow output
loopctl tail <run-id> --follow
Notes:
- The daemon binds to `http://127.0.0.1:7700` by default; a second daemon on the same port will fail to start.
- `loop` auto-delegates to `loopctl` when the daemon is running. Set `LOOP_USE_DAEMON=no` to force the bash loop.
When you create a run, the daemon does the following:
- Resolve workspace root (git root or cwd) and load `.loop/config`.
- Resolve checkout provider (default `branch`; `auto` → Worktrunk if `wt` is available, else git worktree).
- Build run branch config and prepare the checkout (workspace branch or worktree, depending on provider).
- Run implementation → review → verification, with watchdog rewrites if needed.
- When completion is detected, optionally merge the run branch into the merge target (if configured).
- Optionally clean up separate worktrees when `worktree_cleanup=true`.
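The first two steps in the list above are simple enough to sketch in shell. The function names here are hypothetical and the real logic lives in loopd; only the "git root or cwd" rule and the `auto` fallback order come from the description above.

```shell
#!/bin/sh
# Sketch of workspace and provider resolution; not the loopd implementation.

# Workspace root: the git top-level directory if inside a repo, else the cwd.
resolve_workspace_root() {
  git rev-parse --show-toplevel 2>/dev/null || pwd
}

# Checkout provider: "branch" by default; "auto" prefers Worktrunk when the
# wt binary is on PATH and falls back to plain git worktrees otherwise.
resolve_provider() (
  requested="${1:-branch}"
  case "$requested" in
    auto)
      if command -v wt >/dev/null 2>&1; then echo worktrunk; else echo git; fi
      ;;
    branch|worktrunk|git)
      echo "$requested"
      ;;
    *)
      echo "unknown provider: $requested" >&2
      exit 2
      ;;
  esac
)
```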
| Feature | bin/loop (bash) | loopd + loopctl (daemon) |
|---|---|---|
| Plan/spec loop (implementation/review/verification) | ✅ | ✅ |
| Worktrees + Worktrunk provider | ✅ (git only) | ✅ (git + worktrunk) |
| Merge-to-target on completion | ✅ | ✅ |
| Pause/resume/cancel runs | ❌ | ✅ |
| SSE output streaming | ❌ | ✅ |
| Interactive spec picker (gum) | ✅ | ✅ (loopctl run --pick or loopctl run) |
| Experiment mode + measure_cmd | ✅ | ❌ |
| postmortem + summary.json | ✅ | ❌ |
| Custom prompt file + context_files | ✅ | ✅ |
| Prompt preview (loop prompt) | ✅ | ✅ (loopctl prompt) |
If you’re deprecating bin/loop, the remaining gaps to consider are experiment mode and postmortem/summary.
Deprecation note:
`bin/loop` is deprecated. The default path is `loopd` + `loopctl`. Use `LOOP_USE_DAEMON=no` only for temporary legacy fallback.
# 1. Set up your project
cd your-project
loop --init-config
# 2. Create a spec
cat > specs/user-auth.md << 'EOF'
# User Authentication
**Status:** Draft
**Last Updated:** 2025-01-22
## Overview
Add basic username/password authentication to the app.
## Requirements
- Login form with username and password fields
- Session management with secure cookies
- Logout endpoint that clears session
- Protected routes that require authentication
EOF
# 3. Create a plan
cat > specs/planning/user-auth-plan.md << 'EOF'
# User Auth Implementation Plan
## Tasks
- [ ] Create User model with password hashing
- [ ] Add login API endpoint
- [ ] Add logout API endpoint
- [ ] Create login form component
- [ ] Add session middleware
- [ ] Protect dashboard routes
- [ ] Add "logged in as" indicator to header
## Verification
- [ ] Can register new user
- [ ] Can login with valid credentials
- [ ] Invalid credentials show error
- [ ] Logout clears session
- [ ] Protected routes redirect to login
EOF
# 4. Run the loop
loop specs/user-auth.md
loop [command] [spec-path] [plan-path] [options]
By default, loop delegates to the daemon and requires loopd to be running. Use LOOP_USE_DAEMON=no only for temporary legacy fallback.
| Command | Description |
|---|---|
| (none) | Run the agent loop (default) |
| `prompt` | Show the prompt that would be sent to Claude, then exit |
# Preview the prompt without running
loop prompt specs/my-feature.md
# Run the loop
loop specs/my-feature.md
| Argument | Description |
|---|---|
| `spec-path` | Path to the spec file. Optional if gum is available (shows interactive picker). |
| `plan-path` | Path to the plan file. Defaults to `<plans_dir>/<spec-name>-plan.md`. |
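The default plan path in the table above is derived from the spec's file name. A minimal sketch, assuming a hypothetical helper `default_plan_path` and the `plans_dir` default of `specs/planning` from the generated config:

```shell
#!/bin/sh
# default_plan_path SPEC_PATH [PLANS_DIR]
# Derives <plans_dir>/<spec-name>-plan.md from the spec's file name.
default_plan_path() (
  spec="$1"
  plans_dir="${2:-specs/planning}"   # matches the plans_dir config default
  name="$(basename "$spec" .md)"     # strip directory and .md suffix
  printf '%s/%s-plan.md\n' "$plans_dir" "$name"
)
```

For `specs/user-auth.md` this yields `specs/planning/user-auth-plan.md`, the path used in the quick-start example.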
| Option | Default | Description |
|---|---|---|
| `--iterations <n>` | 50 | Maximum loop iterations before stopping |
| `--model <name>` | opus | Claude model to use (opus, sonnet, haiku) |
| `--log-dir <path>` | logs/loop | Where to write run logs |
| `--completion-mode` | trailing | How to detect completion (see below) |
| `--mode <name>` | plan | Run mode (plan or experiment) |
| `--prompt <path>` | - | Custom prompt file (overrides .loop/prompt.txt) |
| `--verify-cmd <cmd>` | - | Verification command to run after each iteration (repeatable) |
| `--verify-timeout-sec <n>` | 0 | Timeout per verification command (0 = none) |
| `--measure-cmd <cmd>` | - | Measurement command (experiment mode). Writes to LOOP_METRICS_OUT |
| `--measure-timeout-sec <n>` | 0 | Timeout per measurement command (0 = none) |
| `--claude-timeout-sec <n>` | 0 | Timeout per Claude iteration (0 = none) |
| `--claude-retries <n>` | 0 | Retries per iteration on non-zero exit |
| `--claude-retry-backoff-sec <n>` | 5 | Seconds to sleep between retries |
| `--no-postmortem` | - | Skip the post-run analysis |
| `--no-gum` | - | Disable gum UI, use plain output |
| `--no-wait` | - | Don't wait for keypress at completion |
| `--config <path>` | - | Load specific config file |
| `--init-config` | - | Create .loop/config in current project |
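The retry options in the table above (`--claude-retries`, `--claude-retry-backoff-sec`) describe a plain retry-with-sleep pattern. The sketch below shows that pattern under the assumption that a retry is triggered by any non-zero exit; `run_with_retries` is a hypothetical helper, not a function exported by loop.

```shell
#!/bin/sh
# run_with_retries RETRIES BACKOFF_SEC CMD [ARGS...]
# Runs CMD once, then up to RETRIES more times while it exits non-zero,
# sleeping BACKOFF_SEC seconds between attempts.
run_with_retries() (
  retries="$1"; backoff="$2"; shift 2
  attempt=0
  while :; do
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then exit 0; fi
    if [ "$attempt" -ge "$retries" ]; then exit "$status"; fi
    attempt=$((attempt + 1))
    sleep "$backoff"
  done
)
```

With `retries=0` the command runs exactly once, which matches the documented default. A per-attempt time limit could be layered on by prefixing the command with `timeout <seconds>`, where that utility is available.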
If you run loop without arguments and gum is installed, you get an interactive picker. The daemon path also supports this with loopctl run --pick (or loopctl run with no args).
$ loop
? Select a spec...
> [Draft] User Authentication (2025-01-22) - user-auth.md
[In Progress] API Rate Limiting (2025-01-20) - rate-limiting.md
[Complete] Database Schema (2025-01-15) - db-schema.md
The picker scans specs/*.md, extracts metadata from each file, and sorts by last updated date.
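The scan-extract-sort step can be approximated with standard tools. This is a sketch only: `list_specs` is a hypothetical name, the metadata is assumed to use the `**Status:**` and `**Last Updated:**` lines shown in the spec template, and newest-first ordering is inferred from the example output above.

```shell
#!/bin/sh
# list_specs SPECS_DIR
# Prints one line per spec, newest first, in the picker's display format:
#   [Status] Title (YYYY-MM-DD) - file.md
list_specs() (
  dir="$1"
  for f in "$dir"/*.md; do
    [ -f "$f" ] || continue
    title="$(sed -n 's/^# //p' "$f" | head -n 1)"
    status="$(sed -n 's/^\*\*Status:\*\* *//p' "$f" | head -n 1)"
    updated="$(sed -n 's/^\*\*Last Updated:\*\* *//p' "$f" | head -n 1)"
    # Prefix each line with the date as a sort key, then drop the key.
    printf '%s\t[%s] %s (%s) - %s\n' \
      "$updated" "$status" "$title" "$updated" "$(basename "$f")"
  done | sort -r | cut -f 2-
)
```

ISO dates sort correctly as plain strings, so no date parsing is needed.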
Run loop --init-config to create .loop/config:
# Directories
specs_dir="specs"
plans_dir="specs/planning"
log_dir="logs/loop"
# Execution
model="opus"
iterations=50
completion_mode="trailing"
mode="plan"
# Verification (optional)
# Use | to separate multiple commands.
# You can also pass --verify-cmd multiple times.
# verify_cmds="bun test|bun lint"
verify_cmds=""
verify_timeout_sec=0
measure_cmd=""
measure_timeout_sec=0
# Resiliency (optional)
# Set claude_timeout_sec=0 to disable timeouts.
claude_timeout_sec=600
claude_retries=0
claude_retry_backoff_sec=5
# Features
postmortem=true
summary_json=true
no_wait=false
no_gum=false
# Custom prompt (optional)
# prompt_file=".loop/prompt.txt"
# Additional context files to include in prompt (optional)
# context_files="specs/README.md specs/planning/SPEC_AUTHORING.md CLAUDE.md"
If you configure verification commands, loop runs them after each successful agent iteration. If verification fails, loop writes a failure context into the run logs and instructs the next iteration to fix verification before advancing the plan.
Configure via config:
verify_cmds="bun test|bun lint"
verify_timeout_sec=600
Or via CLI:
loop specs/my-feature.md --verify-cmd "bun test" --verify-cmd "bun lint"
Note: command timeouts require timeout on PATH (commonly from GNU coreutils).
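To make the `|`-separated `verify_cmds` format concrete, here is a sketch of how such a string could be split and executed. `run_verify_cmds` is a hypothetical helper, and stopping at the first failing command is an assumption; the real loop may run every command and collect the failures.

```shell
#!/bin/sh
# run_verify_cmds "cmd1|cmd2|..."
# Splits the string on "|" and runs each command in order.
# Assumption: stop at the first failure and return non-zero.
run_verify_cmds() (
  set -f        # no globbing while the string is split
  IFS='|'
  for cmd in $1; do
    echo "verify: $cmd"
    sh -c "$cmd" || exit 1
  done
)
```

Because `|` is the separator in this sketch, an individual command cannot itself contain a shell pipe; a pipeline would need to be wrapped in a script.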
Experiment mode runs iterative attempts toward a goal instead of a checklist. It captures metrics per iteration and writes an experiment log that subsequent agents can read.
Configure measurement with measure_cmd, which should write metrics to the path provided by
LOOP_METRICS_OUT:
mode="experiment"
verify_cmds="npm run test:playwright"
measure_cmd="node scripts/measure-bundle.js --out $LOOP_METRICS_OUT"
measure_timeout_sec=120
The runner exports:
- `LOOP_METRICS_OUT` - file path for metrics output
- `LOOP_ITERATION` - current iteration number
- `LOOP_RUN_DIR` - run directory
- `LOOP_SPEC_PATH` - spec path
Artifacts are saved under logs/loop/run-<id>/:
- `metrics/iter-XX.json`
- `summaries/iter-XX.md`
- `experiment-log.md`
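A measurement command only has to write its result to the path in `LOOP_METRICS_OUT`. The sketch below records a file's size; the helper name `measure_size` and the JSON keys are purely illustrative, since no metrics schema is defined here.

```shell
#!/bin/sh
# measure_size FILE
# Hypothetical measure command: writes the file's size in bytes as JSON to
# the path given by LOOP_METRICS_OUT. The JSON keys are illustrative only.
measure_size() (
  target="$1"
  out="${LOOP_METRICS_OUT:?LOOP_METRICS_OUT must be set by the runner}"
  bytes="$(wc -c < "$target" | tr -d '[:space:]')"
  printf '{"iteration": %s, "size_bytes": %s}\n' \
    "${LOOP_ITERATION:-0}" "$bytes" > "$out"
)
```

The `${VAR:?message}` expansion makes the function fail loudly when it is run outside the loop, rather than silently writing nowhere.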
The context_files option lets you include additional files as @path references in the prompt. This is useful for:
- Spec writing guidelines - How specs should be structured
- Coding standards - Project-specific conventions
- Architecture docs - Context about the codebase
- CLAUDE.md - Instructions for Claude
context_files="specs/README.md specs/planning/SPEC_AUTHORING.md CLAUDE.md"
This generates a prompt starting with:
@specs/feature.md @specs/planning/feature-plan.md @specs/README.md @specs/planning/SPEC_AUTHORING.md @CLAUDE.md
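The reference line above is just the spec, the plan, and each context file prefixed with `@`. A minimal sketch of that assembly, using a hypothetical `prompt_header` helper:

```shell
#!/bin/sh
# prompt_header SPEC_PATH PLAN_PATH [CONTEXT_FILE...]
# Builds the leading line of @path references for the prompt.
prompt_header() (
  spec="$1"; plan="$2"; shift 2
  line="@$spec @$plan"
  for f in "$@"; do
    line="$line @$f"
  done
  printf '%s\n' "$line"
)
```

Since `context_files` is a space-separated string, it can be passed unquoted so the shell splits it into separate arguments: `prompt_header "$spec" "$plan" $context_files`.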
Create .loop/prompt.txt to customize the agent's behavior. Use these placeholders:
| Placeholder | Replaced With |
|---|---|
| `SPEC_PATH` | Path to the spec file |
| `PLAN_PATH` | Path to the plan file |
@SPEC_PATH @PLAN_PATH @docs/ARCHITECTURE.md
You are an implementation agent working on a TypeScript/React codebase.
## Your Task
1. Read the spec and plan carefully
2. Pick ONE unchecked `[ ]` task from the plan
3. Implement it following our coding standards
4. Mark the task `[x]` when complete
5. Run `bun test` to verify
6. Commit using `gritty commit --accept`
## When You're Done
If ALL tasks are checked `[x]`, output exactly:
<promise>COMPLETE</promise>
Otherwise, output ONE line: "Completed [task name]. [N] tasks remain."
## Rules
- One task per iteration
- Don't modify unrelated code
- Don't skip tests
- Use existing patterns from the codebase
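Placeholder substitution in a custom prompt like the one above amounts to a textual search and replace. The sketch below uses `sed`; `render_prompt` is a hypothetical name, and the approach assumes the paths contain no `|`, `&`, or backslash characters.

```shell
#!/bin/sh
# render_prompt TEMPLATE_FILE SPEC_PATH PLAN_PATH
# Replaces the SPEC_PATH and PLAN_PATH placeholders and prints the result.
# Assumption: the paths contain no "|", "&", or backslash characters, which
# sed would treat specially in the replacement text.
render_prompt() (
  template="$1"; spec="$2"; plan="$3"
  sed -e "s|SPEC_PATH|$spec|g" -e "s|PLAN_PATH|$plan|g" "$template"
)
```

To see the prompt the tool itself would send, `loop prompt` (or `loopctl prompt`) remains the authoritative preview.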
The built-in prompt instructs Claude to:
- Pick the highest-priority unchecked task
- Implement only that task
- Run relevant verification steps
- Update the plan checklist
- Make one atomic commit via gritty
- Output `<promise>COMPLETE</promise>` when all tasks are done
It also includes guardrails for spec alignment, schema matching, and handling ambiguity.
The loop watches for <promise>COMPLETE</promise> in Claude's output.
| Mode | Behavior |
|---|---|
| `exact` | Entire response must be exactly `<promise>COMPLETE</promise>` |
| `trailing` (default) | Token must be the last non-empty line |
The trailing mode is more forgiving: Claude can include a brief message before the token.
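The difference between the two modes is easiest to see side by side. This sketch is not the loop's own detection code; `is_complete` is a hypothetical helper, and treating a trailing newline as a mismatch in exact mode is an assumption.

```shell
#!/bin/sh
TOKEN='<promise>COMPLETE</promise>'

# is_complete MODE OUTPUT
# Returns 0 when OUTPUT signals completion under the given mode.
is_complete() (
  mode="$1"; output="$2"
  case "$mode" in
    exact)
      # The whole response must be the token and nothing else.
      [ "$output" = "$TOKEN" ]
      ;;
    trailing)
      # Only the last non-empty line has to be the token.
      last="$(printf '%s\n' "$output" | grep -v '^[[:space:]]*$' | tail -n 1)"
      [ "$last" = "$TOKEN" ]
      ;;
    *)
      echo "unknown completion mode: $mode" >&2
      exit 2
      ;;
  esac
)
```

A response such as "Done." followed by the token on its own line passes in trailing mode and fails in exact mode.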
Each run creates a directory: logs/loop/run-<YYYYMMDD-HHMMSS>/
logs/loop/run-20250122-143052/
├── run.log # Human-readable event log
├── report.tsv # Machine-parseable events (for analysis)
├── prompt.txt # The exact prompt used
├── summary.json # Run statistics
├── iter-01.log # Full output from iteration 1
├── iter-01.tail.txt # Last 200 lines of iteration 1
├── iter-02.log # Full output from iteration 2
├── iter-02.tail.txt # ...
└── analysis/ # Postmortem reports (if enabled)
├── spec-compliance.md
├── run-quality.md
└── summary.md
{
"run_id": "20250122-143052",
"start_ms": 1737556252000,
"end_ms": 1737557891000,
"total_duration_ms": 1639000,
"iterations_run": 7,
"completed_iteration": 7,
"avg_duration_ms": 234142,
"last_exit_code": 0,
"completion_mode": "trailing",
"model": "opus",
"exit_reason": "complete_trailing"
}
When enabled (default), loop runs three analysis passes after completion:
- Spec Compliance - Did the implementation match the spec?
- Run Quality - Any anomalies, protocol violations, or issues?
- Summary - Root cause classification and actionable improvements
Reports are saved to logs/loop/run-<id>/analysis/.
Disable with --no-postmortem for faster runs during development.
You can also run analysis on any previous run:
# Analyze the most recent run
loop-analyze
# Analyze a specific run
loop-analyze 20250122-143052
# Analyze an experiment run
loop-analyze 20250122-143052 --experiment
# Actually run the analysis (not just print the prompt)
loop-analyze --run
# Run experiment analysis and write report
loop-analyze 20250122-143052 --experiment --run
Specs should clearly describe what to build:
# Feature Name
**Status:** Draft | In Progress | Complete
**Last Updated:** YYYY-MM-DD
## Overview
Brief description of the feature.
## Requirements
- Requirement 1
- Requirement 2
## Technical Details
Implementation specifics, schemas, APIs, etc.
## Out of Scope
What this spec explicitly does NOT cover.
Plans are checklists of tasks to complete:
# Feature Name - Implementation Plan
## Tasks
- [ ] Task 1 description
- [ ] Task 2 description
- [ ] Task 3 description
## Verification
- [ ] Manual test 1
- [ ] Manual test 2
## Notes
Any context for the implementing agent.
## Blockers Discovered
| Type | Location | Description |
|------|----------|-------------|
| PROD_BUG | file:line | Brief description |
| TEST_INFRA | package/tool | What's missing |
Task markers:
- `[ ]` not started
- `[x]` complete
- `[R]` reviewed
- `[~]` blocked/partial (add entry to Blockers Discovered)
- `[ ]?` optional/manual QA (doesn't block completion)
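The markers above make it possible to check a plan's progress with a single `grep`. The sketch below counts only plain `[ ]` items and skips `[ ]?` ones; `remaining_tasks` is a hypothetical helper, and whether the real loop also treats `[~]` items or the Verification section as blocking is not covered here.

```shell
#!/bin/sh
# remaining_tasks PLAN_FILE
# Prints the number of "- [ ]" items in the plan. Items marked "[ ]?"
# (optional/manual QA) are excluded because they do not block completion.
remaining_tasks() (
  # grep -c prints 0 but exits 1 when nothing matches, hence "|| true".
  grep -E -c '^[[:space:]]*- \[ \]([^?]|$)' "$1" || true
)
```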
| Variable | Description |
|---|---|
| `LOOP_CONFIG` | Path to config file (alternative to --config) |
- Be specific about data shapes, APIs, and behavior
- Include examples where helpful
- Define edge cases explicitly
- Mark ambiguous areas clearly
- Keep tasks atomic (one commit each)
- Order by dependency, not preference
- Include verification steps
- Add notes if context is needed
- Check `logs/loop/run-<id>/iter-NN.log` for the failing iteration
- Look at `iter-NN.tail.txt` for the last 200 lines
- Review `analysis/summary.md` if postmortem ran
- Check if the spec was ambiguous or the plan too vague
- Use `--iterations 5` when testing
- Use `--no-postmortem` during development
- Use `--model sonnet` for faster (cheaper) iterations on simpler tasks
| Code | Meaning |
|---|---|
| 0 | Completed successfully |
| 130 | Interrupted (SIGINT/Ctrl+C) |
| 143 | Terminated (SIGTERM) |
| Other | Claude CLI exit code |
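When loop is called from a script or CI job, the exit codes in the table above can be translated into messages with a small `case` statement. `describe_exit` is a hypothetical helper written for this example.

```shell
#!/bin/sh
# describe_exit CODE
# Maps a loop exit status to the meanings listed in the exit code table.
describe_exit() {
  case "$1" in
    0)   echo "completed successfully" ;;
    130) echo "interrupted (SIGINT/Ctrl+C)" ;;
    143) echo "terminated (SIGTERM)" ;;
    *)   echo "Claude CLI exited with status $1" ;;
  esac
}
```

A typical call site would be `loop specs/my-feature.md; describe_exit "$?"`. The values 130 and 143 follow the usual shell convention of 128 plus the signal number (2 for SIGINT, 15 for SIGTERM).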
The loopd daemon provides a more robust orchestration layer with:
- Persistent state via SQLite
- Concurrent run management (2-5 runs)
- Crash recovery and resumable runs
- HTTP API with SSE streaming
- Automatic worktree and branch management
# Build and install Rust binaries (requires cargo)
./install.sh --daemon
# Or build manually
cargo build --release
cp target/release/loopd target/release/loopctl ~/.local/bin/
# Start the daemon
loopd
# In another terminal, use loopctl to manage runs
loopctl run specs/my-feature.md
# List runs
loopctl list
loopctl list --status RUNNING
# Inspect a run
loopctl inspect <run_id>
# Control runs
loopctl pause <run_id>
loopctl resume <run_id>
loopctl cancel <run_id>
# Stream live output
loopctl tail <run_id> -f
| Command | Description |
|---|---|
| `loopctl run <spec> [plan]` | Start a new run |
| `loopctl list [--status]` | List runs |
| `loopctl inspect <run_id>` | Show run details |
| `loopctl pause <run_id>` | Pause a running run |
| `loopctl resume <run_id>` | Resume a paused run |
| `loopctl cancel <run_id>` | Cancel a run |
| `loopctl tail <run_id>` | Stream run output |
# --name: explicit run name
# --name-source: auto-name via haiku or spec_slug
# --base-branch: base branch for the run branch
# --run-branch-prefix: prefix for run branches
# --merge-target: target branch to merge into
# --merge-strategy: none, merge, or squash
# --worktree-provider: branch (default), auto, worktrunk, or git
loopctl run specs/feature.md \
  --name "my-feature" \
  --name-source haiku \
  --base-branch main \
  --run-branch-prefix "run/" \
  --merge-target agent/feature \
  --merge-strategy squash \
  --worktree-provider branch \
  --worktree-path-template "../{{ repo }}.{{ run_branch | sanitize }}"
| Variable | Description |
|---|---|
| `LOOPD_ADDR` | Daemon address (default: http://127.0.0.1:7700) |
| `LOOPD_TOKEN` | Auth token for daemon API |
| `LOOPD_AUTH_TOKEN` | Auth token (daemon side) |
Data is stored at ~/.local/share/loopd/:
- `loopd.db` - SQLite database
- `runs/run-<id>/` - Global artifact mirror
The original bin/loop bash script remains available and works standalone. The daemon is optional and provides additional features for managing multiple concurrent runs with persistent state. Both use the same .loop/config and artifact layouts.
- gritty - AI-powered git commits
- Claude Code - The underlying AI agent
- gum - Terminal UI toolkit
MIT