The embeddable agent engine that learns.
Zero-dependency AI agent orchestration with parallel DAG execution,
skill genetics, ontology routing, multi-stage review, and safety guards.
Quick Start • Why Agent Swarm • Features • Skill Packs • MCP • Architecture
```python
from agent_swarm import Swarm, SubTask

result = await Swarm(llm=my_llm).run("Analyze competitors", tasks=[
    SubTask(id="research", description="Find top 5 competitors", role="Researcher"),
    SubTask(id="compare", description="Compare strengths", role="Analyst", dependencies=["research"]),
    SubTask(id="report", description="Write recommendation", role="Writer", dependencies=["compare"]),
])
```

```bash
pip install agent-swarm-core
```

"I need agents that run in parallel, wait for approval, stay within budget, and get better over time."
Agent Swarm is built for workflows where:
- Multiple agents research, analyze, and write in parallel
- Some tasks need human approval before continuing
- You need budget caps so LLM costs don't spiral
- Outputs must pass schema validation before delivery
- The engine learns from every run and gets better over time
- A multi-stage review pipeline catches issues before shipping
- Safety guards block destructive operations automatically
| Use case | What Agent Swarm does |
|---|---|
| Competitor analysis | 3 agents research in parallel -> analyst compares -> writer produces report -> reviewer approves |
| Code review pipeline | Scanner finds issues -> reviewer prioritizes -> writer produces fix suggestions -> lead approves |
| Ship pipeline | Test -> multi-role review -> version bump -> changelog -> commit -> push with checkpoints |
| Product discovery | Researcher explores market -> analyst identifies opportunities -> strategist writes brief -> PM approves |
| QA health check | Scan codebase -> classify issues -> compute health score -> generate report with recommendations |
| Retrospective | Collect telemetry -> analyze patterns -> extract lessons -> generate action items |
Use Agent Swarm when you want a lightweight orchestrator embedded inside your product — not a platform. Use something else when you need 300+ integrations (LangGraph), visual no-code building (CrewAI Studio), or an enterprise distributed runtime (LangGraph Cloud).
```bash
pip install agent-swarm-core
```

No ChromaDB, no Neo4j, no torch, no LangChain. Just Python.

**OpenAI**

```python
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def llm(prompt, tools=None):
    r = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1000,
    )
    return r.choices[0].message.content
```

**Claude (Anthropic)**

```python
from anthropic import AsyncAnthropic

client = AsyncAnthropic()

async def llm(prompt, tools=None):
    r = await client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}],
    )
    return r.content[0].text
```

**Local models (Ollama, vLLM, llama.cpp)**

```python
async def llm(prompt, tools=None):
    return your_model.generate(prompt)
```

```python
from agent_swarm import Swarm, SubTask

result = await Swarm(llm=llm).run(
    "Research AI agent market and recommend strategy",
    tasks=[
        SubTask(id="research", description="Find top 5 AI agent frameworks",
                role="Researcher"),
        SubTask(id="compare", description="Compare strengths and weaknesses",
                role="Analyst", dependencies=["research"]),
        SubTask(id="recommend", description="Write startup recommendation",
                role="Writer", dependencies=["compare"]),
    ],
)
print(result["final_output"])
```

```python
async def slack_approval(task_id, description, role):
    return await ask_slack_channel(f"Approve '{description}' by {role}?")

result = await Swarm(llm=llm, approval_callback=slack_approval).run(
    "Produce quarterly report",
    tasks=[
        SubTask(id="draft", description="Write Q1 report draft", role="Writer"),
        SubTask(id="review", description="Review for accuracy", role="Reviewer",
                dependencies=["draft"], requires_approval=True),
        SubTask(id="publish", description="Finalize and publish", role="Publisher",
                dependencies=["review"], requires_approval=True),
    ],
)
```

```python
from agent_swarm import Swarm, SkillGenetics, SkillBank

bank = SkillBank()
genetics = SkillGenetics(bank)
swarm = Swarm(llm=llm, skill_bank=bank, genetics=genetics)

# Run 10 times -- the engine extracts skills from successes and failures
for i in range(10):
    result = await swarm.run("Research AI trends")

report = genetics.effectiveness_report()
print(f"Verdict: {report['verdict']}")  # "effective" / "emerging"
print(f"Fitness delta: {report['fitness']['delta']:+.3f}")
```

```python
from agent_swarm import ReviewPipeline, ReviewStage, ReviewRole

pipeline = ReviewPipeline(
    stages=[
        ReviewStage(name="spec", gates=[ReviewRole.SPEC_COMPLIANCE], pass_threshold=0.8),
        ReviewStage(name="security", gates=[ReviewRole.SECURITY], pass_threshold=0.9),
        ReviewStage(name="quality", gates=[ReviewRole.CODE_QUALITY, ReviewRole.DESIGN]),
    ],
    reviewers={...},  # map ReviewRole -> async reviewer function
)

result = await pipeline.run(run_id, proof)
print(f"Passed: {result.passed}, Score: {result.overall_score:.1%}")
```

```python
from agent_swarm import Swarm, CarefulGuard, FreezeGuard, GuardChain

guards = GuardChain([
    CarefulGuard(),                             # blocks rm -rf, DROP TABLE, force-push, etc.
    FreezeGuard(frozen_paths=["/production"]),  # locks critical directories
])
swarm = Swarm(llm=llm, safety_guards=guards)
```
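At its core, a guard of this kind pattern-matches proposed shell commands before they run. A minimal sketch of the idea — the patterns below are illustrative assumptions, not CarefulGuard's actual rule set:

```python
import re

# Illustrative destructive-command patterns (NOT CarefulGuard's real rules).
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-rf\b",              # recursive force delete
    r"\bDROP\s+TABLE\b",          # destructive SQL
    r"\bgit\s+push\s+.*--force",  # history-rewriting push
]

def is_destructive(command: str) -> bool:
    """Return True if the command matches a known destructive pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DESTRUCTIVE_PATTERNS)

blocked = is_destructive("rm -rf /tmp/build")   # True
allowed = is_destructive("git status")          # False
```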
- **Parallel DAG execution:** tasks run in topological waves with maximum concurrency; the Attention Residuals pattern enables selective context access across waves.
- **Human approval:** any task can require approval; rejected work stops cleanly, with a full audit trail.
- **Multi-stage review:** chain review stages with different roles (spec, security, quality, design, CEO), with conditional skip, auto-retry, and human escalation.
- **Skill genetics:** skills mutate, crossover, face adversarial testing, and compete in tournaments, with full lineage tracking across generations.
- **Ontology routing:** a SKOS-style vocabulary controls agent capabilities in 3 modes: SOFT (log), WARN (count), STRICT (block).
- **Output validation:** nested JSON schema validation.
- **Context filtering:** policy-based filtering limits what each agent sees: wave history, selective items, character budgets, role filters.
- **QA health scoring:** issue taxonomy (CRITICAL→INFO), weighted health score (100 - deductions), grade system (A-F), QA review gate integration.
- **MCP + CLI:** connect to Claude Desktop, Cursor, or any MCP client; full CLI for quick runs.
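The wave scheduling above can be sketched in plain Python. This is an illustration of the concept, not Agent Swarm's actual implementation:

```python
from collections import defaultdict

def topological_waves(deps):
    """Group tasks into waves: every task in a wave has all of its
    dependencies satisfied by earlier waves, so each wave can run
    concurrently. `deps` maps task id -> list of prerequisite task ids."""
    indegree = {t: len(d) for t, d in deps.items()}
    children = defaultdict(list)
    for task, prereqs in deps.items():
        for p in prereqs:
            children[p].append(task)
    ready = [t for t, n in indegree.items() if n == 0]
    waves = []
    while ready:
        waves.append(sorted(ready))
        next_ready = []
        for t in ready:
            for child in children[t]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_ready.append(child)
        ready = next_ready
    return waves

waves = topological_waves({
    "research": [], "compare": ["research"], "report": ["compare"],
})
# → [["research"], ["compare"], ["report"]]
```

Independent tasks land in the same wave: two research tasks with no dependencies would both run in wave 1, and a task depending on both would run in wave 2.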
Eleven new modules inspired by Superpowers and gstack:
| Module | What it does |
|---|---|
| ReviewPipeline | Multi-stage review with conditional skip, auto-retry, and human escalation |
| ContextFilter | Policy-based context isolation (wave history, character budgets, role filters) |
| SpecGate | HARD-GATE: validate specs before implementation begins |
| Safety Guards | Destructive command detection + directory freeze locks |
| QA System | Issue taxonomy, health scoring (0-100), QA review gate |
| Telemetry | Thread-safe JSONL event logging with aggregation |
| Retro | Retrospective reports from telemetry (success rate, failure patterns, suggestions) |
| Ship Pipeline | Checkpoint-based test→review→version→changelog→commit→push |
| Templates | Structured output rendering (QA Report, Design Review, Retro Report, TODO List) |
| SkillChain | Sequential skill chaining from playbooks |
| SkillSuggestor | Proactive skill/playbook recommendations from goal text |
Plus 8 new skills (brainstorm, review, qa, ship, retro, investigate, safety, deploy) bringing the total from 4 to 12 skills.
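The QA System's health score (100 minus weighted deductions, mapped to an A-F grade) can be sketched as follows; the severity weights and grade cutoffs here are illustrative assumptions, not the library's actual values:

```python
# Hypothetical per-severity deduction weights (illustrative only).
SEVERITY_WEIGHTS = {"CRITICAL": 15, "HIGH": 8, "MEDIUM": 4, "LOW": 2, "INFO": 0}

def health_score(issue_severities):
    """Compute a 0-100 health score: 100 minus per-issue deductions."""
    deductions = sum(SEVERITY_WEIGHTS.get(sev, 0) for sev in issue_severities)
    return max(0, 100 - deductions)

def grade(score):
    """Map a numeric score to a letter grade (assumed cutoffs)."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return letter
    return "F"

score = health_score(["CRITICAL", "MEDIUM", "LOW"])  # 100 - 15 - 4 - 2 = 79
letter = grade(score)                                # "C"
```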
Install domain-specific skills and ontology terms with one command.
```bash
python -m agent_swarm --packs                                 # List packs
python -m agent_swarm --add research-pack                     # Install
python -m agent_swarm --with-pack pm-pack "Define strategy"   # Run with pack
```

| Pack | Skills | What it adds |
|---|---|---|
| research-pack | 4 | Source verification, quantitative extraction, competitive landscape, synthesis |
| review-pack | 4 | Security scan, code quality, compliance check, fix suggestions |
| pm-pack | 5 | Market discovery, positioning, PRD writing, launch planning, north star |
Programmatic usage
```python
from agent_swarm import Swarm, PackManager, OntologyRegistry, CORE_ONTOLOGY, SkillGenetics

pm = PackManager()
pm.install("research-pack")
bank, bundles = pm.apply()

swarm = Swarm(
    llm=my_llm,
    skill_bank=bank,
    genetics=SkillGenetics(bank),
    ontology=OntologyRegistry([CORE_ONTOLOGY] + bundles),
)
result = await swarm.run("Research AI market trends")
```

Connect Agent Swarm to Claude Desktop, Claude Code, Cursor, and any MCP-compatible tool.

```bash
python -m agent_swarm.mcp_server           # Start MCP server
python -m agent_swarm.mcp_server --setup   # Show setup guide
```

**Claude Desktop config**
```json
{
  "mcpServers": {
    "agent-swarm": {
      "command": "python",
      "args": ["-m", "agent_swarm.mcp_server"],
      "env": {"OPENAI_API_KEY": "sk-your-key"}
    }
  }
}
```

5 MCP tools: `swarm_run` | `swarm_playbook` | `swarm_ontology` | `swarm_skills` | `swarm_status`
```
CLI (__main__.py)           MCP Server (mcp_server.py)
       |                               |
+------+-----------------------+------+
|          Swarm Orchestrator         |
|             (core.py)               |
+--+--------+--------+--------+-------+
   |        |        |        |
+--------+ +----+---+ +-+------+ +--------+
| Skills | |Ontology| |Genetics| |Run     |
| Bank   | |Registry| |Engine  | |Machine |
+--------+ +--------+ +--------+ +--------+
   |        |        |        |
+--------+ +----+---+ +-+------+ +--------+
|Review  | |Context | |Safety  | |Ship    |
|Pipeline| |Filter  | |Guards  | |Pipeline|
+--------+ +--------+ +--------+ +--------+
   |        |        |        |
+--+--------+--------+--------+-------+
|        Infrastructure Layer         |
|  Cache | Memory | Telemetry | Durable |
|  QA    | Retro  | Templates | Tracing |
+-------------------------------------+
```
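The Telemetry component in the infrastructure layer does thread-safe JSONL event logging with aggregation. A minimal sketch of that pattern — illustrative only, not Agent Swarm's actual Telemetry API:

```python
import collections
import json
import os
import tempfile
import threading

class JsonlTelemetry:
    """Append one JSON object per line under a lock; aggregate by event name.
    Hypothetical sketch, not the real Telemetry module."""

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()

    def log(self, event, **fields):
        line = json.dumps({"event": event, **fields})
        with self._lock, open(self.path, "a") as f:
            f.write(line + "\n")

    def aggregate(self):
        counts = collections.Counter()
        with open(self.path) as f:
            for line in f:
                counts[json.loads(line)["event"]] += 1
        return dict(counts)

# Hypothetical usage: log a few events, then count them per event name.
path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
telemetry = JsonlTelemetry(path)
telemetry.log("task_started", task="research")
telemetry.log("task_started", task="compare")
telemetry.log("task_done", task="research")
counts = telemetry.aggregate()  # {"task_started": 2, "task_done": 1}
```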
Full module breakdown (44 modules, 11.1K LOC)
| Module | Lines | Purpose |
|---|---|---|
| core.py | 1,179 | Swarm engine, DAG executor, Attention Residuals |
| run_machine.py | 650 | State machine runner, proof bundles, SpecGate |
| mcp_server.py | 439 | MCP protocol server (zero deps) |
| ontology.py | 416 | SKOS registry, 3-mode gate |
| skills.py | 500 | SkillBank, 6-gate promotion, SkillChain, SkillSuggestor |
| tools.py | 360 | 6 built-in tools, security hardened |
| genetics.py | 330 | Crossover, adversarial, tournament |
| tracing.py | 296 | Distributed execution tracing |
| tracker.py | 295 | Webhook triggers, HMAC auth |
| validation.py | 217 | Schema, $ref, cross-field rules |
| review.py | 170 | Multi-stage review pipeline |
| ship.py | 190 | Ship pipeline with checkpoints |
| safety.py | 180 | Destructive command guards, directory freeze |
| qa.py | 200 | Issue taxonomy, health scoring |
| context_filter.py | 150 | Policy-based context isolation |
| retro.py | 150 | Retrospective from telemetry |
| templates.py | 120 | Structured output templates |
| telemetry.py | 120 | JSONL event logging |
| playbooks.py | 200 | 14 SOP playbooks |
| ... | | 25 more modules |
```bash
pytest tests/ -q   # 491 tests, 0 failures
```

| Skill | Purpose |
|---|---|
| scout | Reconnaissance — define mission, decompose tasks |
| guard | Quality protection — tests, policy compliance |
| evolve | Continuous improvement — fixes, lessons, next cycle |
| swarm-cycle | Full Scout → Build → Guard → Evolve workflow |
| brainstorm | Multi-perspective idea exploration |
| review | Multi-role code/spec review with scoring |
| qa | Issue taxonomy + health score QA |
| ship | Test → review → version → commit → push |
| retro | Weekly retrospective with stats + lessons |
| investigate | Root cause analysis + evidence chain |
| safety | Destructive command warnings + directory locks |
| deploy | Deployment verification + rollback plan |
See examples/ for runnable scripts:
| Example | What it shows |
|---|---|
| 04_competitor_analysis.py | 5-agent parallel research with budget cap |
| 05_approval_workflow.py | Content pipeline with human approval |
| 06_code_review_pipeline.py | Code scan -> prioritize -> fix suggestions |
| 07_pack_pm_workflow.py | PM discovery -> PRD -> launch with pm-pack |
| with_openai.py | Real OpenAI GPT-4 connection |
| with_claude.py | Real Anthropic Claude connection |
For teams that need dashboards, persistent storage, approval UI, eval framework, and hosted execution.
The core engine is MIT-licensed and always will be.
See CONTRIBUTING.md.
MIT