namanhere23/DevMatrix

Multi-Agent Swarm Orchestration · Self-Correcting AI · Real-Time Observability

5 specialized agents. 4 AI providers. 1 unified pipeline. Zero compromises.

---

✦ What is NexusSentry?

NexusSentry is a production-grade multi-agent orchestration system that replaces single-model AI with a coordinated hive of specialized agents, each an expert in its domain and each talking to the best available LLM for its role.

Where most AI tools give you one brain solving everything, NexusSentry gives you an engineering team: a decomposer, a planner, an executor, a QA scorer, a quality gatekeeper, and an observability integrator, all wired together with a self-correcting feedback loop that iterates until the output clears the quality thresholds or the 3-attempt retry budget runs out.

Without NexusSentry          With NexusSentry
─────────────────            ─────────────────────────────────────────
User → GPT-4 → Result        User → Scout → Architect → Builder
                                         ↑                    ↓
                             Architect ← Critic ← Verifier ← QA
                                         ↓
                             Integrator → Tracer → Dashboard

"5 specialized agents. 4 AI providers. 12+ tool calls. Direct Integrator-to-Tracer observability pipeline. 1 human approval. Under 90 seconds."


⚡ v3.0: What's New

| Feature | Details |
|---|---|
| 🔗 Integrator Agent | New agent after Critic; maps approved output directly to Agent Tracer for structured, real-time observability |
| 🧠 Swarm Memory | Thread-safe shared context across all sub-tasks and agents |
| ⚡ Parallel Execution | Sub-tasks run concurrently via `asyncio.gather`, up to 4× faster |
| 📊 Enhanced Dashboard | Provider analytics, interactive Critic score trends, live agent graph |
| 🔄 Smarter Fallback | 4-provider chain with automatic degradation, never a dead end |
| 🎭 Full Mock Mode | Complete demo with zero API keys, no friction for evaluators |
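
As a rough illustration of the Swarm Memory and Parallel Execution rows above, the sketch below runs three sub-tasks concurrently with `asyncio.gather` and records each result in a lock-guarded shared dictionary. `SwarmMemory`, `run_subtask`, and `run_all` are hypothetical names chosen for this example, and the sleep stands in for real agent work; the project's actual classes may look different.

```python
import asyncio
import threading


class SwarmMemory:
    """Hypothetical shared context; a lock guards every read and write."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        with self._lock:
            self._data[key] = value

    def snapshot(self) -> dict[str, str]:
        # Return a copy so callers cannot mutate the shared state.
        with self._lock:
            return dict(self._data)


async def run_subtask(name: str, memory: SwarmMemory) -> str:
    # Stand-in for the real agent pipeline: sleeps instead of calling an LLM.
    await asyncio.sleep(0.01)
    result = f"{name}: done"
    memory.put(name, result)
    return result


async def run_all(subtasks: list[str], memory: SwarmMemory) -> list[str]:
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(run_subtask(t, memory) for t in subtasks))


memory = SwarmMemory()
results = asyncio.run(run_all(["parse", "plan", "build"], memory))
```

Because `gather` preserves input order, `results` lines up with the sub-task list even if the tasks finish in a different order.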

🤖 The Swarm

Five agents. Each a specialist. Each talking to the best model for its job.

┌─────────────────────────────────────────────────────────────────────┐
│                         NEXUSSENTRY SWARM                           │
│                                                                     │
│  🔍 Scout          🏗️ Architect      🔧 Builder                     │
│  Task Decomposer   Technical Planner  Code Executor                 │
│  └─ Gemini         └─ OpenRouter      └─ Auto                       │
│                                                                     │
│  ✅ Verifier        📋 Critic          🔗 Integrator                 │
│  QA Scorer         Quality Gate       Result Mapper                 │
│  └─ Grok           └─ Grok            └─ Auto → Tracer              │
└─────────────────────────────────────────────────────────────────────┘

| Agent | Role | Responsibility | Provider |
|---|---|---|---|
| 🔍 Scout | Task Decomposer | Breaks a goal into 3–5 precise, actionable sub-tasks | 💎 Gemini |
| 🏗️ Architect | Technical Planner | Designs the execution plan for each sub-task | 🌐 OpenRouter |
| 🔧 Builder | Executor | Generates and runs code to fulfill the plan | 🔄 Auto |
| ✅ Verifier | QA Scorer | Tests output against acceptance criteria with a numeric score | 🧠 Grok |
| 📋 Critic | Quality Gate | Approves or rejects; feeds the rejection reason back to Architect | 🧠 Grok |
| 🔗 Integrator | Result Mapper | Maps approved output and routes events directly to Agent Tracer | 🔄 Auto |

πŸ—οΈ Architecture

graph TB
    User["πŸ‘€ User<br/>(CLI / App)"]
 
    subgraph ProviderLayer["πŸ€– Multi-Provider AI Layer"]
        Gemini["πŸ’Ž Gemini"]
        Grok["🧠 Grok"]
        OpenRouter["🌐 OpenRouter"]
        Anthropic["πŸ€– Anthropic"]
    end
 
    subgraph AgentSwarm["🧠 Agent Swarm"]
        Scout["πŸ” Scout<br/>Task Decomposer"]
        Architect["πŸ—οΈ Architect<br/>Technical Planner"]
        Builder["πŸ”§ Builder<br/>Executor"]
        QAVerifier["βœ… QA Verifier<br/>Quality Scorer"]
        Critic["πŸ“‹ Critic<br/>Quality Gate"]
        Integrator["πŸ”— Integrator<br/>Result Mapper"]
    end
 
    subgraph Observability["πŸ“Š Observability"]
        Tracer["Agent Tracer<br/>JSONL Logs"]
        Dashboard["Web Dashboard<br/>Real-Time UI"]
    end
 
    User -->|"goal"| Scout
    Scout -->|"sub-tasks"| Architect
    Architect -->|"plan"| Builder
    Builder -->|"generated code"| QAVerifier
    QAVerifier -->|"score + issues"| Critic
    Critic -->|"approve βœ…"| User
    Critic -->|"reject + feedback"| Architect
    Critic -->|"approved output"| Integrator
    Integrator -->|"mapped result"| Tracer
 
    Scout -.->|"LLM call"| ProviderLayer
    Architect -.->|"LLM call"| ProviderLayer
    Critic -.->|"LLM call"| ProviderLayer
 
    Scout -.->|"events"| Tracer
    Architect -.->|"events"| Tracer
    Builder -.->|"events"| Tracer
    QAVerifier -.->|"events"| Tracer
    Critic -.->|"events"| Tracer
    Tracer -.->|"polls"| Dashboard
 
    style ProviderLayer fill:#1a1030,stroke:#a855f7,stroke-width:2px
    style AgentSwarm fill:#1a1040,stroke:#6366f1,stroke-width:2px
    style Observability fill:#101830,stroke:#06b6d4,stroke-width:2px
Loading
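
The diagram labels the Agent Tracer as a JSONL log that the dashboard polls. As a minimal sketch of that idea, assuming one JSON object per line, the snippet below appends events and reads them back. The field names (`ts`, `agent`, `event`) and the helpers `log_event` and `read_events` are illustrative assumptions, not the schema used in `tracer.py`.

```python
import json
import tempfile
import time
from pathlib import Path


def log_event(path: Path, agent: str, event: str, **fields) -> None:
    """Append one event as a single JSON line (hypothetical schema)."""
    record = {"ts": time.time(), "agent": agent, "event": event, **fields}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_events(path: Path) -> list[dict]:
    """Read the whole log back, as a polling dashboard might do."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Write two events to a temporary log and read them back.
log_path = Path(tempfile.mkdtemp()) / "trace.jsonl"
log_event(log_path, "scout", "decomposed", subtasks=3)
log_event(log_path, "critic", "approved", score=88)
events = read_events(log_path)
```

Appending one line per event keeps writes cheap and lets a reader consume the file incrementally without parsing a single large JSON document.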

🔄 Agent Flow (Per Sub-Task)

The self-correcting loop is what separates NexusSentry from basic LLM wrappers. Every sub-task goes through up to 3 full iterations before being accepted or passed through.

sequenceDiagram
    participant U as 👤 User
    participant S as 🔍 Scout
    participant P as 🤖 Provider
    participant A as 🏗️ Architect
    participant B as 🔧 Builder
    participant Q as ✅ Verifier
    participant C as 📋 Critic
    participant I as 🔗 Integrator

    U->>S: Submit goal
    S->>P: Decompose (via Gemini)
    P-->>S: Sub-tasks JSON
    S->>A: Sub-task 1

    loop Max 3 attempts (retry if rejected)
        A->>P: Plan (via OpenRouter)
        P-->>A: Execution plan
        A->>B: Send plan
        B->>B: Generate code (LLM)
        B->>Q: Submit for scoring
        Q->>Q: Deterministic QA checks
        Q-->>C: QA score + issues
        C->>P: Review execution (via Grok)
        P-->>C: Verdict

        alt QA ≥ 70 AND Critic ≥ 72
            C-->>U: ✅ Approved
            C->>I: Pass approved output
            I->>I: Map result to Tracer
        else Score < threshold
            C-->>A: ❌ Rejected + QA+Critic feedback
            Note over A: Next attempt with improvements
        else All 3 attempts exhausted
            C-->>U: ⏭️ Best attempt (pass-through)
            C->>I: Pass best-attempt output
            I->>I: Map result to Tracer
        end
    end
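
The retry logic in the sequence diagram can be sketched in a few lines. The thresholds (QA ≥ 70, Critic ≥ 72) and the 3-attempt cap come from the diagram; everything else is an assumption made for illustration, including the `Attempt` dataclass, the `run_with_retries` name, and the rule that "best attempt" means the highest combined score.

```python
from dataclasses import dataclass
from typing import Callable

QA_THRESHOLD = 70       # from the diagram: QA >= 70
CRITIC_THRESHOLD = 72   # from the diagram: Critic >= 72
MAX_ATTEMPTS = 3        # from the diagram: max 3 attempts


@dataclass
class Attempt:
    output: str
    qa_score: int
    critic_score: int


def run_with_retries(attempt_fn: Callable[[str], Attempt]) -> tuple[Attempt, str]:
    """Retry with feedback until both scores pass, else return the best attempt."""
    feedback = ""
    attempts: list[Attempt] = []
    for _ in range(MAX_ATTEMPTS):
        attempt = attempt_fn(feedback)
        attempts.append(attempt)
        if attempt.qa_score >= QA_THRESHOLD and attempt.critic_score >= CRITIC_THRESHOLD:
            return attempt, "approved"
        # Rejected: hand the scores back so the next plan can improve on them.
        feedback = f"QA {attempt.qa_score}, Critic {attempt.critic_score}: below threshold"
    # Assumption: "best attempt" is the one with the highest combined score.
    best = max(attempts, key=lambda a: a.qa_score + a.critic_score)
    return best, "pass-through"


# Fake pipeline: the first attempt fails both checks, the second passes.
scores = iter([(60, 65), (75, 80)])


def fake_attempt(feedback: str) -> Attempt:
    qa, critic = next(scores)
    return Attempt(f"draft after feedback: {feedback or 'none'}", qa, critic)


final, status = run_with_retries(fake_attempt)
```

Here the second attempt is approved, and its output shows that the first attempt's scores were fed back into it.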

🔀 Multi-Provider AI

No single provider is the best at everything. NexusSentry routes each agent to its preferred model and falls back automatically when a provider is down.

graph TB
    subgraph Agents["Agent Swarm"]
        Scout["🔍 Scout"]
        Architect["🏗️ Architect"]
        Critic["📋 Critic"]
        Builder["🔧 Builder"]
        Integrator["🔗 Integrator"]
    end

    Provider["🔀 LLM Provider<br/>Auto-Router"]

    Scout -->|"prefer: gemini"| Provider
    Architect -->|"prefer: openrouter"| Provider
    Critic -->|"prefer: grok"| Provider
    Builder -->|"prefer: auto"| Provider
    Integrator -->|"prefer: auto"| Provider

    subgraph Backends["Available Backends"]
        G["💎 Gemini<br/>gemini-2.0-flash"]
        K["🧠 Grok<br/>grok-3-mini-fast"]
        O["🌐 OpenRouter<br/>100+ models"]
        A["🤖 Anthropic<br/>claude-sonnet"]
        M["🎭 Mock<br/>Demo Mode"]
    end

    Provider -->|"priority 1"| G
    Provider -->|"priority 2"| K
    Provider -->|"priority 3"| O
    Provider -->|"priority 4"| A
    Provider -->|"no keys"| M

    style Agents fill:#1a1040,stroke:#6366f1,stroke-width:2px
    style Backends fill:#0d1a2d,stroke:#06b6d4,stroke-width:2px
    style Provider fill:#2d1030,stroke:#a855f7,stroke-width:2px

If all providers are unavailable, Mock Mode activates automatically: the full demo still runs, every agent fires, and the dashboard still populates. Zero dead demos.
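
One way to picture the routing and fallback behaviour described in this section is the sketch below. The priority order follows the diagram; the `route` function, the `Backend` type alias, and the stand-in callables are hypothetical and are not taken from `llm_provider.py`.

```python
from typing import Callable

# Priority order from the diagram; real backends would wrap provider SDK calls.
PRIORITY = ["gemini", "grok", "openrouter", "anthropic"]

Backend = Callable[[str], str]


def mock_backend(prompt: str) -> str:
    # Canned reply used when no provider is configured or all of them fail.
    return f"[mock] {prompt}"


def route(prompt: str, backends: dict[str, Backend], prefer: str = "auto") -> tuple[str, str]:
    """Try the preferred backend first, then the rest in priority order."""
    order = [name for name in PRIORITY if name in backends]
    if prefer in order:
        order.remove(prefer)
        order.insert(0, prefer)
    for name in order:
        try:
            return name, backends[name](prompt)
        except Exception:
            continue  # degrade to the next provider in the chain
    return "mock", mock_backend(prompt)


def down(prompt: str) -> str:
    raise ConnectionError("provider unavailable")


def grok(prompt: str) -> str:
    return f"grok says: {prompt}"


# Gemini is preferred but down, so the call degrades to Grok.
used, reply = route("plan the task", {"gemini": down, "grok": grok}, prefer="gemini")
# With no backends configured at all, the mock backend answers.
fallback_used, fallback_reply = route("plan the task", {})
```

Returning the provider name alongside the reply makes it easy to record which backend actually served each call.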


🚀 Quick Start

Prerequisites

  • Python 3.11+
  • At least one LLM API key for live runs (Gemini recommended; it has a free tier). Mock Mode needs none.

1. Install

git clone https://github.com/namanhere23/DevMatrix
cd DevMatrix
 
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
 
pip install -r requirements.txt
 
cp .env.example .env
# Open .env and drop in at least one API key

2. Get a Key (Pick One)

| Provider | Link | Cost |
|---|---|---|
| 💎 Gemini (recommended) | aistudio.google.com/apikey | Free tier |
| 🧠 Grok | console.x.ai | Free credits |
| 🌐 OpenRouter | openrouter.ai/keys | Pay-per-use |
| 🤖 Anthropic | console.anthropic.com | Pay-per-use |

3. Run

# Interactive (recommended for first run)
python demo/run_demo.py
 
# Fully automated, great for live demos
python demo/run_demo.py --auto
 
# Custom task
python demo/run_demo.py --auto --goal "Refactor the auth module to use JWT"
 
# Direct
python -m nexussentry.main "Your goal here"

Dashboard

The moment the swarm starts, a real-time dashboard opens at http://localhost:7777

┌──────────────────────────────────────────┐
│  NexusSentry Dashboard  ● LIVE           │
│                                          │
│  Agents    ████████████░░  4/5 active    │
│  Tasks     ██████░░░░░░░░  3/6 done      │
│  Score     ██████████████  94 / 100      │
│                                          │
│  Scout     ✓  Architect  ✓  Builder  ●   │
│  Verifier  ●  Critic     ○  Integrator ○ │
└──────────────────────────────────────────┘

Features: live agent feed · task progress · approval counters · provider analytics · Critic score trend · architecture diagram
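
For readers curious how a dashboard can work with no third-party packages, here is a minimal standard-library sketch: a pure function turns events into counters, and an `http.server` handler returns them as JSON for a browser to poll. The `/api/status` route, the event fields, and all names are assumptions for illustration; the real `dashboard.py` may be organized differently.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory event list; a real dashboard would read the Tracer log.
EVENTS: list[dict] = []


def build_status(events: list[dict]) -> dict:
    """Summarize events into the counters a polling UI could render."""
    approved = sum(1 for e in events if e.get("event") == "approved")
    rejected = sum(1 for e in events if e.get("event") == "rejected")
    agents = sorted({e["agent"] for e in events if "agent" in e})
    return {"approved": approved, "rejected": rejected, "agents": agents}


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/api/status":  # hypothetical route
            self.send_error(404)
            return
        body = json.dumps(build_status(EVENTS)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 7777) -> HTTPServer:
    # Binds the port but does not block; call serve_forever() to start serving.
    return HTTPServer(("127.0.0.1", port), StatusHandler)


sample = [
    {"agent": "scout", "event": "decomposed"},
    {"agent": "critic", "event": "rejected"},
    {"agent": "critic", "event": "approved"},
]
status = build_status(sample)
```

Calling `make_server().serve_forever()` would start serving on port 7777; keeping the summary logic in a plain function makes it testable without opening a socket.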


📂 Project Structure

DevMatrix/
├── nexussentry/
│   ├── main.py                  # 🎯 Swarm orchestrator, start here
│   ├── providers/
│   │   └── llm_provider.py      # 🔀 Gemini / Grok / OpenRouter / Anthropic router
│   ├── agents/
│   │   ├── scout.py             # 🔍 Task decomposition      → Gemini
│   │   ├── architect.py         # 🏗️  Technical planning      → OpenRouter
│   │   ├── fixer.py             # 🔧 Code execution          → Auto
│   │   ├── critic.py            # 📋 Quality review          → Grok
│   │   └── integrator.py        # 🔗 Result mapping          → Agent Tracer
│   ├── hitl/
│   │   └── user_permission.py   # 👤 Human-in-the-loop gate
│   ├── observability/
│   │   ├── tracer.py            # 📊 JSONL event log + provider tracking
│   │   ├── dashboard.py         # 🌐 Zero-dependency HTTP server
│   │   └── static/index.html    # ✨ Real-time dashboard UI
│   └── utils/
│       └── response_cache.py    # 💾 MD5-keyed LLM response cache
├── demo/
│   └── run_demo.py              # 🎬 One-command demo runner
├── .env.example                 # All provider keys, documented
├── requirements.txt
├── Containerfile                # Docker-ready
└── README.md

🐳 Docker

docker build -f Containerfile -t nexussentry .
docker run --env-file .env -p 7777:7777 nexussentry

🔑 Core Technical Features

**🔀 Multi-Provider AI Routing** 4 providers with agent-level preference and automatic fallback. No single point of failure.

**🔄 Self-Correcting Feedback Loop** Critic rejects → sends specific feedback → Architect replans → up to 3 iterations before pass-through.

**🔗 Integrator → Tracer Pipeline** Every approved result is immediately mapped and routed to Agent Tracer: structured observability with zero manual wiring.

**📊 Real-Time Dashboard** Zero-dependency HTTP server. Glassmorphism UI. No external services needed.

**💾 Response Caching** MD5-keyed disk cache. API outage during a demo? Cached responses keep the show running.

**✅ Deterministic QA** HTML/CSS selector validation and error detection run before the Critic, so structural failures are caught before any LLM review.

**🎭 Mock Mode** Full swarm runs with zero API keys. Every agent fires, the dashboard populates, the loop completes.

**⚙️ Graceful Degradation** Every component has a fallback path. Nothing crashes. The swarm always returns a result.

---

📊 Numbers

┌────────────────────────────────────────────┐
│                                            │
│   5    specialized agents                  │
│   4    LLM providers with auto-fallback    │
│  12+   tool calls per task                 │
│   3    max self-correction iterations      │
│   1    direct Integrator → Tracer hop      │
│  <90s  end-to-end execution                │
│   0    external services required          │
│                                            │
└────────────────────────────────────────────┘
