🧠 NeuralMind

Semantic code intelligence for AI coding agents — smart context retrieval + tool-output compression in one package.

NeuralMind turns a code repository into a queryable neural index. AI agents use it to answer code questions in ~800 tokens instead of loading 50,000+ tokens of raw source.

🆕 New in v0.9.0 — Enterprise-Ready. GHCR auto-built multi-platform container image (docker pull ghcr.io/dfrostar/neuralmind:latest), CycloneDX SBOM attached to every release, air-gapped install walkthrough, and a compliance one-pager consolidating NIST AI RMF + SOC 2 + GDPR claims. Release notes

v0.8.0 — Always-On. neuralmind watch + neuralmind serve run as first-class services with committed systemd + launchd templates + a Windows Task Scheduler walkthrough in the Scheduling Guide + a /healthz endpoint for Docker HEALTHCHECK and systemd ExecStartPost probes. Release notes

v0.7.0 — Install anywhere. Five install paths now in the README: pip, pipx, uv, Docker, and source. Same package every path; smoke-test verified. Release notes · Install matrix ↓

v0.6.0 — Obsidian-style graph view with a live activity feed. neuralmind serve streams synapse + file events to the canvas in real time, so you can watch the brain learning your codebase. Release notes · Graph view section ↓

🌐 Visit the landing page • 📖 Read the About page • ⚖️ Not affiliated with NeuralMind.ai

⚡ 30-second proof

Don't trust the headline number — reproduce it. One command on a freshly cloned checkout:

git clone https://github.com/dfrostar/neuralmind && cd neuralmind
bash scripts/demo.sh

The script creates an isolated venv, installs the deps, builds the index for the bundled fixture (tests/fixtures/sample_project/), and runs three real questions. Output looks like:

  Q: How does authentication work in this codebase?
     naive = 4,736 tok   neuralmind =  829 tok   reduction =   5.7×
  Q: What are the main API endpoints?
     naive = 4,736 tok   neuralmind =  923 tok   reduction =   5.1×
  Q: Explain the billing flow from a user perspective.
     naive = 4,736 tok   neuralmind =  826 tok   reduction =   5.7×

  Average reduction:   5.5×  across 3 queries
  Avg context size:    859 tokens  (vs 4,736 naive)
  Est. monthly saved:  ~$34.89  @ 100 queries/day on Claude 3.5 Sonnet
  Wall time:           0.85s
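
The summary lines can be re-derived from the three per-query rows. The sketch below does that arithmetic. The price of $3 per million input tokens and the 30-day month are assumptions chosen because they reproduce the printed estimate; they are not values confirmed from scripts/demo.sh.

```python
# Re-derive the demo summary from the per-query numbers printed above.
NAIVE_TOKENS = 4_736
NEURALMIND_TOKENS = [829, 923, 826]

# Assumptions (not taken from scripts/demo.sh): input price and month length.
PRICE_PER_MILLION_USD = 3.00
QUERIES_PER_DAY = 100
DAYS_PER_MONTH = 30

ratios = [NAIVE_TOKENS / t for t in NEURALMIND_TOKENS]
avg_reduction = sum(ratios) / len(ratios)
avg_context = sum(NEURALMIND_TOKENS) / len(NEURALMIND_TOKENS)

# Savings = tokens no longer sent per query, scaled to a month of queries.
tokens_saved_per_query = NAIVE_TOKENS - avg_context
monthly_saved_usd = (
    tokens_saved_per_query * QUERIES_PER_DAY * DAYS_PER_MONTH
    * PRICE_PER_MILLION_USD / 1_000_000
)

print(f"Average reduction:  {avg_reduction:.1f}x")
print(f"Avg context size:   {avg_context:.0f} tokens")
print(f"Est. monthly saved: ${monthly_saved_usd:.2f}")
```

Swapping in your own price and query volume gives a first-order estimate before you run the benchmark on your own repository.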

The fixture is intentionally small (~500 lines) so that it catches regressions in CI. Real repos have reported 40–70× on the same pipeline (benchmarks · community submissions). Once the demo convinces you, run it on your own code:

pip install neuralmind graphifyy
cd /path/to/your-repo
graphify update . && neuralmind build .
neuralmind benchmark . --contribute

📊 The fact-based case

Two docs you should read before forming an opinion. Both are linked from this section so you can pick what you need:

docs/BUSINESS-CASE.md — the compelling pitch, with provable numbers. Every claim is a single command away from being verified on your own code. ROI math with assumptions you can change. Three concrete scenarios. Read this if you're evaluating whether to bring NeuralMind to your team.
docs/HONEST-ASSESSMENT.md — the skeptic's companion. When NeuralMind isn't worth installing. What "40–70×" actually means (and doesn't). Where the community-benchmark sample is too small to extrapolate. Read this if you want to know what could go wrong before adopting.

The headline you can stand on: retrieval reduction is measured in CI on every commit (open any closed PR in the PR list; each one has a sticky benchmark comment with current numbers) and reproduces in 30 seconds on a fresh clone via bash scripts/demo.sh. Community submissions from real-world repos report 46–66×, but with n=2; your own number comes from running neuralmind benchmark . --contribute on your code.

🔒 Security & Compliance

For enterprises and regulated industries:

100% Local Processing – Your code never leaves your machine. All embeddings are generated and stored locally using ChromaDB.
No External APIs – NeuralMind runs completely offline. No cloud services, no telemetry, no data exfiltration.
Explainable AI – Every context decision is auditable. Know exactly which code was retrieved (Extracted) vs. inferred by the model.
Open-Source & MIT Licensed – Full transparency. No hidden clauses, no vendor lock-in. Audit the code yourself.
GDPR/HIPAA-Friendly – Process sensitive code without compliance concerns. All data stays under your control.

For CTOs & Security Teams:

✅ Zero external dependencies for code storage
✅ Runs behind your firewall or on-premise
✅ No collection of usage metrics or code patterns
✅ Predictable, reproducible results (not a black box)
✅ Works with proprietary and sensitive code

See full security policy

📌 For AI Coding Agents

If you are Claude Code, Codex, Cursor, Cline, or any AI agent with NeuralMind installed, start here.

Check if NeuralMind is active

neuralmind stats .

Expected output when ready:

Project: your-project
Built: True
Nodes: 241

If Built: False, run graphify update . && neuralmind build . first.

Startup workflow

Every new session, do this first:

neuralmind wakeup .

Or via MCP:

neuralmind_wakeup(project_path=".")

This returns ~365–600 tokens of structured project context:

Project name and description (from CLAUDE.md, mempalace.yaml, or README.md first line)
How many code entities and clusters are indexed
Architecture overview: top 10 code clusters with their entity types and sample names
Sections from graphify-out/GRAPH_REPORT.md if present

Use this output as your orientation before writing any code. It replaces reading the entire repository.

Decision tree — which tool to call

Need to understand the project?
  └─► neuralmind wakeup .               (MCP: neuralmind_wakeup)      ~400 tokens

Answering a specific code question?
  └─► neuralmind query . "question"     (MCP: neuralmind_query)       ~800–1100 tokens

About to open a source file?
  └─► neuralmind skeleton <file>        (MCP: neuralmind_skeleton)    ~5–15× cheaper than Read
      → Only fall back to Read when you need the actual implementation body
      → Use NEURALMIND_BYPASS=1 when you truly need raw source

Answering a complex, multi-part question?
  └─► neuralmind recursive-query . "q"  (MCP: neuralmind_recursive_query)  decomposes + synthesizes

Question about reference documents (PDFs, legal, clinical)?
  └─► neuralmind query-docs . "q"       (MCP: neuralmind_query_docs)       searches doc index only

Searching for a specific function/class/entity?
  └─► neuralmind search . "term"        (MCP: neuralmind_search)      ranked by semantic similarity

Made code changes and need to update the index?
  └─► neuralmind build .                (MCP: neuralmind_build)       incremental — only re-embeds changed nodes
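
For an agent wrapper that shells out to the CLI, the decision tree above can be sketched as a small lookup. The task labels (orient, question, and so on) are hypothetical names invented for this example; only the commands themselves come from the tree.

```python
import shlex

def command_for(task: str, text: str = "", project: str = ".") -> list[str]:
    """Return the argv list for a task kind, ready for subprocess.run().

    Task labels are hypothetical; the subcommands mirror the decision tree.
    """
    table = {
        "orient": ["wakeup", project],
        "question": ["query", project, text],
        "open_file": ["skeleton", text],            # text is the file path
        "complex_question": ["recursive-query", project, text],
        "reference_docs": ["query-docs", project, text],
        "find_entity": ["search", project, text],
        "reindex": ["build", project],
    }
    return ["neuralmind", *table[task]]  # KeyError on an unknown task is intentional

print(shlex.join(command_for("question", "How does authentication work?")))
print(shlex.join(command_for("open_file", "src/auth/handlers.py")))
```

Returning an argv list instead of a shell string avoids quoting problems when the question contains quotes or shell metacharacters.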

Understanding the output

`wakeup` / `query` output format

## Project: myapp

Full-stack web app for task management. Uses React 18, Node.js, and PostgreSQL.

Knowledge Graph: 241 entities, 23 clusters
Type: Code repository with semantic indexing

## Architecture Overview

### Code Clusters
- Cluster 5 (45 entities): function — authenticate_user, hash_password, verify_token
- Cluster 12 (23 entities): class — UserController, AuthMiddleware, SessionStore
- Cluster 3 (18 entities): function — createTask, updateTask, deleteTask
...

## Relevant Code Areas        ← query only; absent from wakeup
### Cluster 5 (relevance: 1.73)
Contains: function entities
- authenticate_user (code) — auth.py
- verify_token (code) — auth.py

## Search Results             ← query only
- AuthMiddleware (score: 0.91) — middleware.py
- jwt_handler (score: 0.85) — auth/jwt.py

---
Tokens: 847 | 59.0x reduction | Layers: L0, L1, L2, L3 | Communities: [5, 12]

Layer meanings:

Layer	Name	Always loaded	Content
L0	Identity	✅ yes	Project name, description, graph size
L1	Summary	✅ yes	Architecture, top clusters, GRAPH_REPORT sections
L2	On-demand	query only	Top 3 clusters most relevant to the query
L3	Search	query only	Semantic search hits (up to 10)
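
If you want to log token usage from the text output, the footer line shown in the example can be parsed with a regular expression. This is a sketch that assumes the footer keeps exactly the shape shown above; the --json flag is likely the more robust choice for anything long-lived.

```python
import re

# Pattern assumes the footer shape from the example output above.
FOOTER_RE = re.compile(
    r"Tokens:\s*(?P<tokens>\d+)\s*\|\s*"
    r"(?P<reduction>[\d.]+)x reduction\s*\|\s*"
    r"Layers:\s*(?P<layers>[^|]+?)\s*\|\s*"
    r"Communities:\s*\[(?P<communities>[^\]]*)\]"
)

def parse_footer(line: str) -> dict:
    """Extract token count, reduction ratio, layers, and community ids."""
    m = FOOTER_RE.search(line)
    if m is None:
        raise ValueError(f"not a NeuralMind footer line: {line!r}")
    return {
        "tokens": int(m["tokens"]),
        "reduction": float(m["reduction"]),
        "layers": [s.strip() for s in m["layers"].split(",")],
        "communities": [int(c) for c in m["communities"].split(",") if c.strip()],
    }

footer = parse_footer(
    "Tokens: 847 | 59.0x reduction | Layers: L0, L1, L2, L3 | Communities: [5, 12]"
)
print(footer)
```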

`skeleton` output format

# src/auth/handlers.py  (community 5, 8 functions)

## Functions
L12   authenticate_user   — Validates credentials and issues JWT
L45   verify_token        — Checks JWT signature and expiry
L78   refresh_token       — Issues new JWT from a valid refresh token
L102  logout              — Revokes refresh token in DB

## Call graph (within this file)
authenticate_user → verify_token, hash_password
refresh_token → verify_token

## Cross-file
verify_token imports_from → utils/jwt.py (high 0.95)
authenticate_user shares_data_with → models/user.py (high 0.91)

[Full source available: Read this file with NEURALMIND_BYPASS=1]

Use skeleton to understand what a file does, how its functions relate, and which other files it depends on — without consuming tokens on the full source body.
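
The "Call graph" lines in the skeleton are enough to answer questions like "what does this function end up calling?" without reading the file. A minimal sketch, assuming each line has the caller → callee, callee shape shown above (feed it only the call-graph section, since the cross-file lines also contain an arrow):

```python
def parse_call_graph(lines):
    """Map each caller to its callees from 'caller → a, b' lines."""
    graph = {}
    for line in lines:
        if "→" not in line:
            continue
        caller, callees = line.split("→", 1)
        graph[caller.strip()] = [c.strip() for c in callees.split(",") if c.strip()]
    return graph

def reachable(graph, start):
    """Return every function transitively called from start."""
    seen, stack = set(), [start]
    while stack:
        for callee in graph.get(stack.pop(), []):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen

call_graph = parse_call_graph([
    "authenticate_user → verify_token, hash_password",
    "refresh_token → verify_token",
])
print(sorted(reachable(call_graph, "authenticate_user")))
```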

`search` output format

1. authenticate_user (function) - score: 0.92
   File: auth/handlers.py  Community: 5

2. AuthMiddleware (class) - score: 0.87
   File: auth/middleware.py  Community: 5

3. hash_password (function) - score: 0.81
   File: utils/crypto.py  Community: 5

PostToolUse hooks — what happens automatically

If neuralmind install-hooks has been run for this project (check for .claude/settings.json), Claude Code automatically compresses tool outputs before you see them:

Tool	What happens	Typical savings
Read	Raw source → graph skeleton (functions, rationales, call graph)	~88%
Bash	Full output → error lines + warning lines + last 3 lines + summary	~91%
Grep	Unlimited matches → capped at 25 + "N more hidden" pointer	varies

This is fully automatic — you do not need to call any extra tools.
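
To make the Bash and Grep rows concrete, here is a rough sketch of the two rules as the table describes them. It is an illustration of the idea, not NeuralMind's hook code: the keyword match, the summary wording, and the function names are all assumptions.

```python
import re

def compress_bash(output: str, tail: int = 3) -> str:
    """Keep error lines, warning lines, and the last `tail` lines, plus a summary."""
    lines = output.splitlines()
    keep = set(range(max(0, len(lines) - tail), len(lines)))
    for i, line in enumerate(lines):
        # Assumed heuristic: a line matters if it mentions "error" or "warning".
        if re.search(r"\b(error|warning)\b", line, re.IGNORECASE):
            keep.add(i)
    kept = [lines[i] for i in sorted(keep)]
    hidden = len(lines) - len(kept)
    if hidden:
        kept.append(f"[{hidden} of {len(lines)} lines hidden]")
    return "\n".join(kept)

def cap_matches(matches: list[str], limit: int = 25) -> list[str]:
    """Return at most `limit` matches, then a pointer to the rest."""
    if len(matches) <= limit:
        return list(matches)
    return matches[:limit] + [f"[{len(matches) - limit} more hidden]"]

noisy = "\n".join(
    [f"step {i}" for i in range(10)] + ["ERROR: boom"] + [f"done {i}" for i in range(5)]
)
print(compress_bash(noisy))
```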

To bypass compression for a single command (e.g., when you need the full file body):

NEURALMIND_BYPASS=1 <your command>

After making code changes

The index does not auto-update unless a git post-commit hook was installed (neuralmind init-hook .). After significant code changes, rebuild manually:

neuralmind build .          # incremental — only re-embeds changed nodes
neuralmind build . --force  # full rebuild — re-embeds everything

MCP tool quick reference

Tool	When to call	Required params	Returns
`neuralmind_wakeup`	Session start	`project_path`	L0+L1 context string, token count
`neuralmind_query`	Code question	`project_path`, `question`	L0–L3 context string, token count, reduction ratio
`neuralmind_search`	Find entity	`project_path`, `query`	List of nodes with scores, file paths
`neuralmind_skeleton`	Explore file	`project_path`, `file_path`	Functions + rationales + call graph + cross-file edges
`neuralmind_recursive_query`	Complex question	`project_path`, `question`	Synthesized answer, sub-queries, gaps, sources
`neuralmind_query_docs`	Reference docs	`project_path`, `question`	Relevant doc chunks with source files and relevance scores
`neuralmind_stats`	Check status	`project_path`	Built status, node count, community count
`neuralmind_build`	Rebuild index	`project_path`	Build stats dict
`neuralmind_benchmark`	Measure savings	`project_path`	Per-query token counts and reduction ratios

⚡ Two-phase optimization

┌─────────────────────────────────────────────────────────────┐
│ Phase 1: Retrieval — what to fetch                          │
│   neuralmind wakeup .    →  ~365 tokens (vs 50K raw)        │
│   neuralmind query . "?" →  ~800 tokens (vs 2,700 raw)      │
│   neuralmind_skeleton    →  graph-backed file view          │
├─────────────────────────────────────────────────────────────┤
│ Phase 2: Consumption — what the agent actually sees         │
│   PostToolUse hooks compress Read/Bash/Grep output          │
│   File reads → graph skeleton (~88% reduction)              │
│   Bash output → errors + summary (~91% reduction)           │
│   Search results → capped at 25 matches                     │
└─────────────────────────────────────────────────────────────┘

Combined effect: 5–10× total reduction vs baseline Claude Code.

🎯 The Problem

You: "How does authentication work in my codebase?"

❌ Traditional: Load entire codebase → 50,000 tokens → $0.15–$3.75/query
✅ NeuralMind: Smart context → 766 tokens → $0.002–$0.06/query
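
The per-query dollar ranges follow from token count times input price. The sketch below assumes a price range of $3 to $75 per million input tokens, which is inferred from the figures above and is not stated in this README.

```python
def cost_per_query(tokens: int, usd_per_million: float) -> float:
    """Input cost in USD for a single query of the given size."""
    return tokens * usd_per_million / 1_000_000

# Assumed price range; these two values reproduce the ranges quoted above.
LOW_PRICE, HIGH_PRICE = 3.0, 75.0

for label, tokens in [("full codebase", 50_000), ("NeuralMind", 766)]:
    low = cost_per_query(tokens, LOW_PRICE)
    high = cost_per_query(tokens, HIGH_PRICE)
    print(f"{label}: {tokens} tokens -> ${low:.3f} to ${high:.3f} per query")
```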

💰 Realistic savings

The dollar figures depend on your workload. Run neuralmind benchmark . --contribute to get numbers for your codebase and query volume. Order-of-magnitude expectations:

You today	NeuralMind likely saves	Setup pays back in
<$50/mo on LLM, small repo	$5–15/mo	months — probably skip
$50–500/mo, 10K+ line repo	$20–200/mo	days
$500–5,000/mo team workload	hundreds–thousands/mo	hours
Already using prompt caching + long context	smaller marginal win	measure first

These are directional. The Honest Assessment explains why retrieval-token reduction (40–70×) ≠ end-to-end cost reduction (3–10× typical), and when NeuralMind is and isn't worth installing.
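
One way to see why a 40–70× retrieval reduction shrinks to single digits end to end: only the share of tokens spent on code context gets smaller, while prompts, conversation history, and model output stay the same size. The model below is an Amdahl-style simplification introduced here for illustration; it is not taken from the Honest Assessment, and the share values are hypothetical.

```python
def end_to_end_reduction(retrieval_share: float, retrieval_reduction: float) -> float:
    """Overall token reduction when only the retrieval share shrinks.

    retrieval_share: fraction of baseline tokens spent on code context (0..1).
    retrieval_reduction: factor by which that share shrinks (e.g. 60 for 60x).
    """
    if not 0.0 <= retrieval_share <= 1.0:
        raise ValueError("retrieval_share must be between 0 and 1")
    remaining = (1.0 - retrieval_share) + retrieval_share / retrieval_reduction
    return 1.0 / remaining

# Hypothetical workloads: 70%, 80%, and 90% of tokens are retrieved code context.
for share in (0.7, 0.8, 0.9):
    print(f"share={share:.0%}: {end_to_end_reduction(share, 60):.1f}x overall")
```

Under these assumptions the overall figure lands roughly between 3× and 9×, which is consistent with the 3–10× range quoted above.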

✅ Does it work on your code? Prove it in 5 minutes.

NeuralMind benchmarks itself in CI on every PR. But your codebase isn't our fixture. The only way to know what it does for you is to measure it on your code.

pip install neuralmind graphifyy
cd /path/to/your-project
graphify update . && neuralmind build .
neuralmind benchmark .

You'll get back your actual reduction ratio and per-query token count, typically 40–70× on real repos. No telemetry, nothing uploaded, nothing committed. If the numbers don't justify it, run pip uninstall neuralmind and move on with 5 minutes lost.

Want the dollar figure for your team?

neuralmind benchmark . --contribute

That flag produces a ready-to-share JSON blob with your project's numbers, the exact command that produced them, and an estimated monthly savings at your query volume. Paste it into Slack, a design doc, a PR — or optionally contribute it to the public leaderboard.

Full walkthrough: Does NeuralMind work on your codebase?

🚨 When do I reach for NeuralMind?

Two ways to decide: start with what's annoying you (symptoms), or start with what you're trying to achieve (goals).

Symptoms — "This is happening to me"

What you notice	Reach for	Why it fixes it
Claude Code hits context limits mid-task	`neuralmind install-hooks .`	Auto-compresses Read/Bash/Grep before the agent sees them (~88–91%)
My monthly LLM bill is climbing	`neuralmind query` + hooks	40–70× fewer tokens per code question
I start every session re-pasting project structure	`neuralmind wakeup .`	~400 tokens of orientation; pipe into any chat
Agent reads a 2,000-line file to answer about one function	`neuralmind skeleton <file>`	Functions + call graph, no body; ~88% cheaper than `Read`
`grep` floods the agent with hundreds of matches	`neuralmind install-hooks .`	Caps at 25 matches with "N more hidden" pointer
The agent is confidently wrong about what my code does	Start session with `wakeup`; ask with `query`	Grounds the model in real structure instead of guessing
I want to query my codebase from ChatGPT / Gemini	`neuralmind wakeup . | pbcopy`	Model-agnostic output; paste into any chat
Retrieval feels random across similar questions	`neuralmind learn .`	Cooccurrence-based reranking adapts to your patterns
Index feels out of date after a refactor	`neuralmind build .` (or `init-hook` once)	Incremental — only re-embeds changed nodes

Goals — "What am I trying to solve for?"

If your goal is…	Do this	Expected outcome
Cut LLM spend on code Q&A	`install-hooks` + use `query` for questions	5–10× total reduction vs baseline agent
Faster, more grounded agent responses	`wakeup` at session start → `query` / `skeleton` during	Fewer hallucinations; less re-exploration
Keep all code local (no SaaS, no telemetry)	Default install — no extra config	100% offline; nothing leaves the machine
Work across Claude + GPT + Gemini with one index	Build once, pipe output into any model	Same context quality, model-agnostic
Make retrieval adapt to how your team queries	Enable memory (TTY prompt) + `neuralmind learn .`	Relevance improves on repeat patterns
Measure savings for a manager or stakeholder	`neuralmind benchmark . --json`	Per-query tokens, reduction ratios, dollar estimate
Auto-refresh the index as code changes	`neuralmind init-hook .` (git post-commit)	Every commit rebuilds incrementally

Still not sure?

You probably don't need NeuralMind if:

Your codebase is under ~5K tokens total (just paste the whole thing in).
You don't use an AI coding agent.
You only want inline completions — use Copilot or Cursor directly.

You almost certainly want NeuralMind if any row above describes a recurring frustration, or if your LLM bill has crossed the point where a 40–70× reduction is worth 5 minutes of setup.

See the use-case walkthroughs for step-by-step guides matched to your situation.

🏢 For organizations evaluating NeuralMind

If you're building a pitch for your team — finance, healthcare, legal, government, internal-platform, or just a large engineering org with a climbing LLM bill — start with docs/BUSINESS-CASE.md for the fact-based ROI argument and docs/ENTERPRISE.md for the regulated/on-premise/multi-team scenarios.

Both docs ground every claim in something you can verify with one command on your own code.

⚖️ NeuralMind vs. Alternatives

Short answers to "why not just use X?". Each row links to a deeper page.

Compared against	Short verdict
Cursor `@codebase`	Works only in Cursor; NeuralMind works in any agent and adds tool-output compression
Aider repo-map	Aider is syntactic only; NeuralMind adds semantic retrieval and compression
Sourcegraph Cody	Cody is server-hosted and org-wide; NeuralMind is local and per-project
Continue / Cline	Those are agent runtimes; NeuralMind is the context/compression layer underneath
GitHub Copilot	Copilot is hosted completions; NeuralMind is local context for any agent
Windsurf / Codeium	Vertically integrated IDE; NeuralMind is editor- and model-agnostic
Claude Projects	Projects reload all files every turn; NeuralMind retrieves only what the query needs
Prompt caching	Caching amortizes a big prompt; NeuralMind makes the prompt small — combine both
LangChain / LlamaIndex for code	Frameworks you assemble; NeuralMind is the assembled default for code agents
Long context windows (1M/2M)	Possible ≠ cheap; NeuralMind sends ~60× fewer context tokens on the same model
Generic RAG over a codebase	Text chunking loses structure; NeuralMind keeps the call graph
Tree-sitter / ctags / grep	Deterministic but syntactic; use alongside NeuralMind, not instead of

Full comparison index: docs/comparisons/.

🚀 Quick Start (humans)

Install — pick your path

NeuralMind installs five ways. The CLI, semantic indexing, and the MCP server (for Claude Code, Cursor, Cline, Continue, and any MCP client) come in every path.

Method	Command	When to pick
pip	`pip install neuralmind graphifyy`	Default. Drops it in your active env.
pipx	`pipx install neuralmind && pipx inject neuralmind graphifyy`	Global CLI, no env pollution. Recommended if you want `neuralmind` available everywhere.
uv	`uv pip install neuralmind graphifyy`	Modern, fast Python tooling. ~10× faster install than pip.
Docker	`docker pull ghcr.io/dfrostar/neuralmind:latest && docker run --rm -v "$PWD:/project:ro" ghcr.io/dfrostar/neuralmind:latest neuralmind --help`	Containerized — no Python on the host. Multi-platform (`linux/amd64` + `linux/arm64`); auto-published to GHCR on every release since v0.9.0. To build locally instead: `docker build -t neuralmind:dev .` and substitute that tag.
From source	`git clone … && pip install -e .`	Hacking on NeuralMind itself.

Verify install:

neuralmind --help     # works for every install path

# For pip / uv / source (a Python env where neuralmind is importable):
python -c "import neuralmind; print(neuralmind.__version__)"

Skip the python -c line for pipx and Docker: pipx isolates the package in its own venv, and the Docker path doesn't expose the in-container Python on the host.

Walkthrough with pros/cons of each path: docs/use-cases/install-paths.md.

Index a project

# Install via any path above, then:

# Go to your project
cd your-project

# Generate knowledge graph (requires graphify)
graphify update .

# Build neural index
neuralmind build .

# (Optional) Install Claude Code PostToolUse compression hooks
neuralmind install-hooks .

# (Optional) Auto-rebuild on every git commit
neuralmind init-hook .

# Start using
neuralmind wakeup .
neuralmind query . "How does authentication work?"
neuralmind skeleton src/auth/handlers.py

# Or browse it: Obsidian-style graph view of your codebase + learned synapses
neuralmind serve .

🕸️ Graph view (`neuralmind serve`)

neuralmind serve opens a local web UI that makes the same index your AI agent queries inspectable by a human. Same ChromaDB index, same synapses.db, just made navigable.

Force-directed graph of code nodes coloured by community.
Structural edges (calls / imports) layered with the Hebbian synapse overlay — edges thicken as the brain learns which nodes co-activate.
Backlinks, outgoing links, and synaptic neighbours for any node you click, Obsidian-style.
Semantic quick-switcher — type a phrase, jump to the node.
Open in editor: click a node and it opens $EDITOR (or --editor code/cursor/vim/subl/idea) at the right file and line.
Local-first: stdlib HTTP server, vanilla-JS canvas, no CDN. The server binds to 127.0.0.1 by default and requires a per-session access token.

neuralmind serve .                       # opens http://127.0.0.1:8765/?token=…
neuralmind serve . --editor "code -n"    # override the editor
neuralmind serve . --no-auth             # skip the token (trusted hosts only)
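
The ?token=… in the URL is a per-session secret. As a rough illustration of how such a check can be done with the standard library (an assumed design, not NeuralMind's actual server code; the function names are hypothetical):

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlparse

def new_session_token() -> str:
    """Generate an unguessable, URL-safe token once per server start."""
    return secrets.token_urlsafe(32)

def is_authorized(request_path: str, expected_token: str) -> bool:
    """Check the ?token=... query parameter using a constant-time comparison."""
    query = parse_qs(urlparse(request_path).query)
    supplied = query.get("token", [""])[0]
    return hmac.compare_digest(supplied.encode(), expected_token.encode())

token = new_session_token()
print(f"http://127.0.0.1:8765/?token={token}")
```

A constant-time comparison avoids leaking how many leading characters of a guess were correct, which matters even on a loopback-only server if other local processes can reach the port.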

Why it matters: the agent-facing brain has always been a black box — you couldn't see what NeuralMind retrieved, whether the graph was reasonable, or what the synapse layer had actually learned. The graph view exposes all three.

Coming next (graph-view Phase B): a replay-last-query overlay that highlights the L3 hits the agent received, edge tooltips + a min-weight synapse slider answering "why are these two nodes related?", pin UX, and a Cmd/Ctrl-K quick-switch. Then Phase C: a live activity feed of synapse co-activations. Full plan in ROADMAP.md.

🔧 How It Works

NeuralMind wraps a graphify knowledge graph (graphify-out/graph.json) in a ChromaDB vector store. When you query it, a 4-layer progressive disclosure system loads only the context relevant to your question.

┌─────────────────────────────────────────────────────────────┐
│ Layer 0: Project Identity (~100 tokens) — ALWAYS LOADED     │
│   Source: CLAUDE.md / mempalace.yaml / README first line    │
├─────────────────────────────────────────────────────────────┤
│ Layer 1: Architecture Summary (~500 tokens) — ALWAYS LOADED │
│   Source: Community distribution + GRAPH_REPORT.md          │
├─────────────────────────────────────────────────────────────┤
│ Layer 2: Relevant Modules (~300–500 tokens) — QUERY-AWARE   │
│   Source: Top 3 clusters semantically matching the query    │
├─────────────────────────────────────────────────────────────┤
│ Layer 3: Semantic Search (~300–500 tokens) — QUERY-AWARE    │
│   Source: ChromaDB similarity search over all graph nodes   │
└─────────────────────────────────────────────────────────────┘
Total: ~800–1,100 tokens vs 50,000+ for the full codebase
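
The layering above can be sketched in a few lines. This is a toy model under stated assumptions: word overlap stands in for ChromaDB embedding similarity, and assemble_context, the cluster descriptions, and the node names are made up for the example.

```python
def word_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words found in text."""
    q = set(query.lower().split())
    t = set(text.lower().replace("_", " ").split())
    return len(q & t) / len(q) if q else 0.0

def assemble_context(identity, summary, clusters, nodes, query=None,
                     top_clusters=3, max_hits=10):
    """Return layers in load order: L0 and L1 always, L2 and L3 only per query."""
    layers = [("L0", identity), ("L1", summary)]
    if query is None:  # wakeup: stop after the always-loaded layers
        return layers
    ranked = sorted(clusters, key=lambda c: word_overlap(query, clusters[c]), reverse=True)
    layers.append(("L2", ranked[:top_clusters]))
    hits = sorted(nodes, key=lambda n: word_overlap(query, n), reverse=True)
    layers.append(("L3", hits[:max_hits]))
    return layers

clusters = {
    5: "authenticate user hash password verify token",
    12: "user controller auth middleware session store",
    3: "create task update task delete task",
    7: "render chart export csv",
}
nodes = ["authenticate_user", "verify_token", "createTask", "export_csv"]

wakeup = assemble_context("myapp", "23 clusters", clusters, nodes)
full = assemble_context("myapp", "23 clusters", clusters, nodes,
                        query="how does the app verify a user token")
print([name for name, _ in wakeup])
print(full[2], full[3])
```

The point of the structure is that wakeup and query share the first two layers, so a query only pays for the query-aware layers on top.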

Prerequisites: NeuralMind requires graphify update . to have been run first. This produces:

graphify-out/graph.json — the knowledge graph (required)
graphify-out/GRAPH_REPORT.md — architecture summary (enriches L1, optional)
graphify-out/neuralmind_db/ — ChromaDB vector store (created by neuralmind build)

🖥️ Complete CLI Reference

`neuralmind build`

Build or incrementally update the neural index from graphify-out/graph.json.

neuralmind build [project_path] [--force]

Argument/Option	Default	Description
`project_path`	`.`	Project root containing `graphify-out/graph.json`
`--force`, `-f`	off	Re-embed every node even if unchanged

neuralmind build .
neuralmind build /path/to/project --force

Output: nodes processed, added, updated, skipped, communities indexed, build duration.
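
One common way to get "only re-embed changed nodes" behaviour is to store a content hash per node and compare on the next build. The sketch below shows that technique; whether NeuralMind detects changes this way is an assumption, and the node fields (id, label, source) are hypothetical.

```python
import hashlib

def node_fingerprint(node: dict) -> str:
    """Stable hash of the fields that would feed the embedding (assumed fields)."""
    payload = "\x1f".join(str(node.get(k, "")) for k in ("id", "label", "source"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def plan_build(nodes: list[dict], stored: dict[str, str], force: bool = False) -> dict:
    """Classify nodes as added, updated, or skipped against stored fingerprints."""
    stats = {"processed": 0, "added": 0, "updated": 0, "skipped": 0}
    for node in nodes:
        stats["processed"] += 1
        fp, old = node_fingerprint(node), stored.get(node["id"])
        if old is None:
            stats["added"] += 1
        elif old != fp or force:
            stats["updated"] += 1
        else:
            stats["skipped"] += 1
            continue
        stored[node["id"]] = fp  # a real build would re-embed the node here
    return stats

store: dict[str, str] = {}
graph_nodes = [
    {"id": "a", "label": "authenticate_user", "source": "auth.py"},
    {"id": "b", "label": "verify_token", "source": "auth.py"},
]
print(plan_build(graph_nodes, store))   # first build: everything is new
print(plan_build(graph_nodes, store))   # second build: nothing changed
```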

`neuralmind wakeup`

Get minimal project context for starting a session (~400–600 tokens, L0 + L1 only).

neuralmind wakeup <project_path> [--json]

neuralmind wakeup .
neuralmind wakeup . --json
neuralmind wakeup . > CONTEXT.md

`neuralmind query`

Query the codebase with natural language (~800–1,100 tokens, all 4 layers).

neuralmind query <project_path> "<question>" [--json]

neuralmind query . "How does authentication work?"
neuralmind query . "What are the main API endpoints?" --json
neuralmind query /path/to/project "Explain the database schema"

On first run from a TTY, you will be prompted once to enable local query memory logging. Disable with NEURALMIND_MEMORY=0.

`neuralmind search`

Direct semantic search — returns code entities ranked by similarity to the query.

neuralmind search <project_path> "<query>" [--n N] [--json]

Option	Default	Description
`--n`	10	Maximum number of results
`--json`, `-j`	off	Machine-readable JSON output

neuralmind search . "authentication"
neuralmind search . "database connection" --n 5
neuralmind search . "PaymentController" --json

`neuralmind skeleton`

Print a compact graph-backed view of a file without loading full source (~88% cheaper than Read).

neuralmind skeleton <file_path> [--project-path .] [--json]

Option	Default	Description
`--project-path`	`.`	Project root (where the index lives)
`--json`, `-j`	off	Machine-readable JSON output

neuralmind skeleton src/auth/handlers.py
neuralmind skeleton src/auth/handlers.py --project-path /my/project
neuralmind skeleton src/auth/handlers.py --json

Output: function list with line numbers and rationales, internal call graph, cross-file edges (imports, data sharing), and a pointer to the full source for when you need it.

`neuralmind benchmark`

Measure token reduction using a set of sample queries.

neuralmind benchmark <project_path> [--json]

neuralmind benchmark .
neuralmind benchmark . --json

`neuralmind stats`

Show index status and statistics.

neuralmind stats <project_path> [--json]

neuralmind stats .
neuralmind stats . --json   # {"built": true, "total_nodes": 241, "communities": 23, ...}
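
A script can use the JSON form to decide whether a rebuild is needed before querying. This sketch relies only on the built and total_nodes keys shown in the comment above; the sample payload is a stand-in for real CLI output.

```python
import json

def index_is_ready(stats_json: str) -> bool:
    """True when the stats payload reports a built, non-empty index."""
    stats = json.loads(stats_json)
    return bool(stats.get("built")) and stats.get("total_nodes", 0) > 0

# Stand-in for the output of: neuralmind stats . --json
sample = '{"built": true, "total_nodes": 241, "communities": 23}'
print(index_is_ready(sample))
```

In a real wrapper the payload would come from subprocess.run(["neuralmind", "stats", ".", "--json"], capture_output=True, text=True).stdout, with a build triggered when the check fails.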

`neuralmind learn`

Analyze logged query history to discover module cooccurrence patterns. Improves future query relevance automatically.

neuralmind learn <project_path>

neuralmind learn .

Reads .neuralmind/memory/query_events.jsonl, writes .neuralmind/learned_patterns.json. The next neuralmind query applies boosted reranking automatically.
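
To give a feel for cooccurrence-based reranking, here is a minimal sketch of the general technique. The event schema (a modules list per JSONL line), the boost weight, and both function names are hypothetical; the real query_events.jsonl format is not documented in this README.

```python
from collections import Counter
from itertools import combinations
import json

def cooccurrence_counts(jsonl_lines):
    """Count how often two modules appear in the same logged query event."""
    pairs = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        # "modules" is a hypothetical field name for this sketch.
        modules = sorted(set(json.loads(line).get("modules", [])))
        pairs.update(combinations(modules, 2))
    return pairs

def boost(candidates, anchor, pairs, weight=0.1):
    """Rerank (name, score) candidates, nudging those that co-occur with anchor."""
    def adjusted(item):
        name, score = item
        return score + weight * pairs[tuple(sorted((name, anchor)))]
    return sorted(candidates, key=adjusted, reverse=True)

events = [
    '{"modules": ["auth", "session"]}',
    '{"modules": ["auth", "session", "billing"]}',
    '{"modules": ["billing", "invoice"]}',
]
pairs = cooccurrence_counts(events)
print(boost([("billing", 0.80), ("session", 0.75)], "auth", pairs))
```

Here session overtakes billing because it has appeared alongside auth more often, even though its raw similarity score is lower.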

`neuralmind install-hooks`

Install or remove Claude Code PostToolUse compression hooks.

neuralmind install-hooks [project_path] [--global] [--uninstall]

Option	Description
`--global`	Install in `~/.claude/settings.json` (affects all projects)
`--uninstall`	Remove NeuralMind hooks only; preserves other tools' hooks

neuralmind install-hooks .                       # project-scoped
neuralmind install-hooks --global                # all projects
neuralmind install-hooks --uninstall             # remove project hooks
neuralmind install-hooks --uninstall --global    # remove global hooks

`neuralmind init-hook`

Install a Git post-commit hook that auto-rebuilds the index after every commit. Safe and idempotent — coexists with other tools' hook contributions.

neuralmind init-hook [project_path]

neuralmind init-hook .
neuralmind init-hook /path/to/project

🔌 MCP Server

NeuralMind ships a Model Context Protocol server (neuralmind-mcp) that exposes all tools to MCP-compatible agents.

Starting the server

neuralmind-mcp
# or
python -m neuralmind.mcp_server

Claude Desktop configuration

{
  "mcpServers": {
    "neuralmind": {
      "command": "neuralmind-mcp",
      "args": ["/absolute/path/to/project"]
    }
  }
}

Config file locations:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

Claude Code / Cursor project-scoped auto-registration

Drop a .mcp.json at your project root:

{
  "mcpServers": {
    "neuralmind": {
      "command": "neuralmind-mcp",
      "args": ["."]
    }
  }
}
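
If the project already has a .mcp.json with other servers, overwriting it by hand is easy to get wrong. A small helper along these lines merges the entry instead; register_neuralmind is a hypothetical name, and the entry it writes is the one shown above.

```python
import json
from pathlib import Path

def register_neuralmind(project_root: str, project_arg: str = ".") -> dict:
    """Add the neuralmind entry to <project_root>/.mcp.json, keeping other servers."""
    path = Path(project_root) / ".mcp.json"
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["neuralmind"] = {"command": "neuralmind-mcp", "args": [project_arg]}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling register_neuralmind(".") from the project root is idempotent: running it twice leaves the file unchanged after the first call.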

Hermes-Agent (Nous Research)

Hermes-Agent is a self-improving agent framework that supports MCP servers. NeuralMind has been verified end-to-end against Hermes-Agent v0.12.0 (build 2026.4.30) — the agent discovered all 11 NeuralMind tools (4-second handshake) when registered as shown below.

Prerequisite: install NeuralMind. The MCP server (neuralmind-mcp) ships with the default install:

pip install neuralmind

Older pip install "neuralmind[mcp]" commands still work — the mcp extra is preserved as a no-op for backwards compatibility.

Two ways to register the server. Both end up in ~/.hermes/config.yaml:

Option A — CLI (recommended for first-time setup):

hermes mcp add

Option B — edit the config directly (~/.hermes/config.yaml, add under the mcp_servers top-level key):

mcp_servers:
  neuralmind:
    command: "neuralmind-mcp"
    args: ["/absolute/path/to/project"]

Verify the server is registered and reachable:

hermes mcp list                     # neuralmind should appear, status ✓
hermes mcp test neuralmind          # ✓ Connected, ✓ Tools discovered: 11

If you haven't installed Hermes-Agent yet, the upstream installer is:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc

After editing the YAML directly, run /reload-mcp from the running hermes CLI to pick up the change without restarting (the hermes mcp add flow does this automatically). Both stdio (shown above) and HTTP transports are supported — see the upstream MCP integration docs for the full schema (command, args, env, url, headers, enabled, per-server tools filtering, timeout, connect_timeout).

v0.6.0 graph view works identically here. Run neuralmind serve in the same project and any tool call from Hermes-Agent will pulse the corresponding nodes on the canvas. The synapse store is shared with Claude Code, Cursor, OpenClaw, and any other agent pointed at this project — see docs/use-cases/multi-agent.md.

OpenClaw

OpenClaw is a personal AI assistant that registers MCP servers via its CLI. Verified against OpenClaw 2026.5.2 — mcp set / mcp list / mcp show round-trip the documented JSON schema into ~/.openclaw/openclaw.json exactly as expected.

Prerequisite: install NeuralMind (the MCP server ships with the default install):

pip install neuralmind

Register NeuralMind:

openclaw mcp set neuralmind '{"command":"neuralmind-mcp","args":["/absolute/path/to/project"]}'

Verify it landed:

openclaw mcp list                  # neuralmind should appear
openclaw mcp show neuralmind       # echoes the JSON you stored

Remove with openclaw mcp unset neuralmind. Definitions are stored under the mcp.servers key in ~/.openclaw/openclaw.json.

v0.6.0 graph view works identically here. Run neuralmind serve in the same project and any tool call from OpenClaw will pulse the corresponding nodes on the canvas. OpenClaw and Claude Code talking to the same project reinforce the same synapse store — see docs/use-cases/multi-agent.md.

If you haven't installed OpenClaw yet:

npm install -g openclaw@latest   # or: pnpm add -g openclaw@latest
openclaw onboard --install-daemon

OpenClaw's MCP support covers stdio (shown above), SSE, HTTP, and streamable-http transports — see the upstream MCP CLI reference for details on url/transport config and the inverse direction (openclaw mcp serve, which exposes OpenClaw's own channels as an MCP server to other clients).

Agent Zero

Agent Zero is a self-organising AI agent framework with first-class MCP support — both as a client (it consumes MCP servers) and as a server (it exposes its own tools to other MCP clients). NeuralMind plugs in via the standard MCP client path.

Prerequisite: install NeuralMind (the MCP server ships with the default install):

pip install neuralmind

Register NeuralMind via Agent Zero's Web UI:

Open Agent Zero → Settings → MCP/A2A → External MCP Servers → Open
Paste this into the JSON editor:

{
  "mcpServers": {
    "neuralmind": {
      "command": "neuralmind-mcp",
      "args": ["/absolute/path/to/your-project"]
    }
  }
}

Click Apply now. Agent Zero discovers NeuralMind's tools at handshake and registers them into the normal tool registry.
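If you manage several projects, you may prefer to generate the JSON rather than edit it by hand. A minimal sketch, assuming only the `mcpServers` shape shown above; the function name `neuralmind_server_config` is hypothetical, and the absolute-path check is a convenience added here rather than something Agent Zero is known to enforce.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def neuralmind_server_config(project_path: str) -> str:
    """Build the mcpServers JSON shown above for one project path."""
    # Accept either POSIX-style or Windows-style absolute paths.
    is_absolute = (
        PurePosixPath(project_path).is_absolute()
        or PureWindowsPath(project_path).is_absolute()
    )
    if not is_absolute:
        raise ValueError(f"expected an absolute path, got {project_path!r}")
    config = {
        "mcpServers": {
            "neuralmind": {
                "command": "neuralmind-mcp",
                "args": [project_path],
            }
        }
    }
    return json.dumps(config, indent=2)


# Usage: print(neuralmind_server_config("/absolute/path/to/your-project"))
```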

The schema is the standard MCP command / args / env shape — see the upstream MCP setup guide for HTTP/SSE transports, OAuth, and per-server tool filtering.

If you haven't installed Agent Zero yet, the upstream README has the Docker and Python install paths.

v0.6.0 graph view works identically here. Run neuralmind serve in the same project and any tool call from Agent Zero will pulse the corresponding nodes on the canvas. The synapse store is shared with Claude Code, Cursor, Cline, Continue, OpenClaw, Hermes-Agent, and any other agent pointed at this project — see docs/use-cases/multi-agent.md.

Coming soon — one-click install. NeuralMind is being submitted to the agent0ai/a0-plugins registry so users can discover and install it from inside Agent Zero's Plugin Hub. The manual JSON path above continues to work either way.

Skill (OpenClaw, Agent Zero, Hermes, any SKILL.md host)

The MCP server gives an agent the actions. The skill at skills/neuralmind/SKILL.md gives it the playbook — when to call neuralmind_query vs. neuralmind_skeleton vs. neuralmind_search, what the outputs look like, and which env-var escape hatches exist. It is a portable Anthropic-style SKILL.md (frontmatter + markdown body) so the same file works in any host that implements the spec.

OpenClaw. Drop the directory into your ClawHub local skills path, or ship it as part of an OpenClaw plugin by listing skills/ in openclaw.plugin.json:

cp -r skills/neuralmind ~/.openclaw/skills/
openclaw skills list   # neuralmind should appear

The skill is description-matched on triggers like "how does X work" or "find function Y", so you don't need to load it explicitly.

Agent Zero. Drop the same directory into the Agent Zero skills folder:

cp -r skills/neuralmind /path/to/agent-zero/skills/

Agent Zero auto-discovers SKILL.md files by description and tag, then uses its code_execution_tool to call the MCP tools the skill names in its allowed_tools frontmatter.

Hermes-Agent. Hermes has a first-class skills system that reads the same SKILL.md spec. Drop the directory into the category-organised tree:

mkdir -p ~/.hermes/skills/code-intelligence
cp -r skills/neuralmind ~/.hermes/skills/code-intelligence/

Hermes loads skills on demand based on the frontmatter description, so no further wiring is needed. You can also publish the directory as a Hermes tap (a GitHub repo of skill directories) for one-command install across machines. This layers on top of the MCP integration documented in the Hermes-Agent section above — the MCP server still does the work; the skill teaches Hermes when to call it.

Claude Code, Cursor. These already have richer integrations (lifecycle hooks for Claude Code, MCP wiring for Cursor), so the skill is optional. It still works as a portable "agent operating manual" if you want a single file that travels with the project.

The skill duplicates none of NeuralMind's logic — it points the agent at MCP tools that already exist. Edit it like documentation.

Troubleshooting

"Connection closed" / "Connection failed" right after register. Almost always means an old NeuralMind install (≤ 0.4.x) where the MCP server was gated behind the [mcp] extra. From 0.5.0 onward the MCP SDK is bundled. Fix:

pip install --upgrade neuralmind

Then re-run the host's verify step (hermes mcp test neuralmind or openclaw mcp list).

neuralmind-mcp: command not found. The package installed but the console script wasn't put on PATH — usually because pip installed into a user site-packages dir that isn't on PATH. Add ~/.local/bin to PATH or reinstall in a venv where the entry point is on PATH.

The host shows neuralmind in mcp list but no tools when you query. Run neuralmind build /path/to/project first — the index has to exist before the MCP tools can answer queries. The hooks (SessionStart, UserPromptSubmit, PreCompact from neuralmind install-hooks) need a built index too.
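The "command not found" case can be checked from Python before you touch the host configuration. This is a rough sketch using only the standard library; `diagnose_entry_point` is a hypothetical helper, and the `which` parameter exists only so the check can be exercised without a real install.

```python
import shutil
from typing import Callable, Optional


def diagnose_entry_point(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Report whether the neuralmind-mcp console script is on PATH."""
    path = which("neuralmind-mcp")
    if path is None:
        return (
            "neuralmind-mcp not on PATH: add ~/.local/bin to PATH "
            "or reinstall in a venv"
        )
    return f"neuralmind-mcp found at {path}"


# Usage: print(diagnose_entry_point())
```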

MCP tool schemas

`neuralmind_wakeup`

{
  "project_path": "string (required) — absolute path to project root"
}

Returns:

{
  "context": "string",
  "tokens": 412,
  "reduction_ratio": 121.4,
  "layers": ["L0", "L1"]
}

`neuralmind_query`

{
  "project_path": "string (required)",
  "question":     "string (required) — natural language question"
}

Returns:

{
  "context": "string",
  "tokens": 847,
  "reduction_ratio": 59.0,
  "layers": ["L0", "L1", "L2", "L3"],
  "communities_loaded": [5, 12],
  "search_hits": 8
}
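For readers wiring up their own client: MCP hosts invoke tools with a JSON-RPC 2.0 `tools/call` request whose `arguments` object follows the schemas in this section. The sketch below only builds that request body; `tools_call_request` is a hypothetical helper, and transport framing (stdio, HTTP) is left to whichever MCP client library you use.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for the given tool."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# Example: the neuralmind_query schema above, filled in.
request = tools_call_request(
    1,
    "neuralmind_query",
    {
        "project_path": "/absolute/path/to/project",
        "question": "How does authentication work?",
    },
)
```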

`neuralmind_search`

{
  "project_path": "string (required)",
  "query":        "string (required)",
  "n":            10
}

Returns array of:

{ "id": "node_id", "label": "authenticate_user", "file_type": "code",
  "source_file": "auth/handlers.py", "score": 0.92 }

`neuralmind_skeleton`

{
  "project_path": "string (required)",
  "file_path":    "string (required) — absolute or project-relative path"
}

Returns:

{ "file": "src/auth/handlers.py", "skeleton": "# src/auth/handlers.py ...", "chars": 620, "indexed": true }

`neuralmind_recursive_query`

Recursively decompose and explore complex questions. Breaks multi-part questions into focused sub-queries, executes them, identifies gaps, and synthesizes results. Searches both code and document indexes.

{
  "project_path": "string (required)",
  "question":     "string (required) — compound question to decompose",
  "max_depth":    3,
  "include_docs": true
}

Returns:

{
  "question": "string",
  "answer": "string — synthesized answer",
  "sub_queries": [{"query": "string", "results": [...], "source": "string"}],
  "depth_reached": 2,
  "gaps_identified": ["string"],
  "total_queries": 6,
  "token_estimate": 4156,
  "sources": ["file1.ts", "file2.ts", "doc.md"]
}

When to use: Multi-faceted questions spanning multiple files or concepts, like "How does auth work and what security measures are in place?" or "What is the deployment architecture and how do Cloudflare and Render interact?"

Benchmark: uses about 6× the tokens of a standard query, but decomposes compound questions and achieves full term coverage on 3 of 5 test questions. See graphify-out/RECURSIVE_QUERY_BENCHMARK.md after running benchmark_report.py.

`neuralmind_query_docs`

Search reference documents (legal, clinical, strategic PDFs/DOCX converted to markdown). NOT for code — use neuralmind_query for code questions.

{
  "project_path": "string (required)",
  "question":     "string (required) — question about reference documents",
  "n":            5
}

Returns:

{
  "results": [
    {
      "content": "string — relevant text chunk",
      "source_file": "docs/reference/filename.md",
      "file_name": "filename.md",
      "chunk": "3/12",
      "relevance": 0.719
    }
  ],
  "total_doc_chunks": 241,
  "query": "string"
}
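A caller will usually want to filter and order these results before handing them to a model. The sketch below works on the response shape shown above. It assumes `chunk` is always an "index/total" string, which is inferred from the "3/12" example rather than documented; both helper names and the 0.5 relevance cutoff are hypothetical.

```python
def parse_chunk(chunk: str) -> tuple[int, int]:
    """Split a chunk label such as "3/12" into (index, total)."""
    index, total = chunk.split("/")
    return int(index), int(total)


def top_results(response: dict, min_relevance: float = 0.5) -> list[dict]:
    """Keep results at or above the cutoff, most relevant first."""
    hits = [r for r in response.get("results", []) if r["relevance"] >= min_relevance]
    return sorted(hits, key=lambda r: r["relevance"], reverse=True)
```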

Setup: Documents must be converted to markdown and indexed first:

# Convert documents (PDF, DOCX, TXT, HTML → .md)
pip install pypdf mammoth
python doc_indexer.py build /path/to/project

# Or use the doc-ingest skill for batch conversion

Auto-rebuild: A git post-commit hook can rebuild the doc index when files in docs/reference/ change.

Search reference docs via CLI:

python doc_indexer.py query /path/to/project "HIPAA compliance"
python doc_indexer.py stats /path/to/project

`neuralmind_build`

{
  "project_path": "string (required)",
  "force":        false
}

Returns:

{
  "success": true,
  "nodes_total": 241,
  "nodes_added": 5,
  "nodes_updated": 2,
  "nodes_skipped": 234,
  "communities": 23,
  "duration_seconds": 3.1
}
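In the example response the added, updated, and skipped counts sum to the total (5 + 2 + 234 = 241). A script that consumes build results could use that as a sanity check. This is a sketch under the assumption that the three counts always partition `nodes_total`, which the example suggests but the schema does not state; `summarize_build` is a hypothetical helper.

```python
def summarize_build(stats: dict) -> tuple[int, bool]:
    """Return (changed_nodes, counts_are_consistent) for a build response."""
    changed = stats["nodes_added"] + stats["nodes_updated"]
    # Assumption: added + updated + skipped should equal nodes_total.
    consistent = changed + stats["nodes_skipped"] == stats["nodes_total"]
    return changed, consistent
```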

`neuralmind_stats`

{ "project_path": "string (required)" }

Returns:

{ "built": true, "total_nodes": 241, "communities": 23, "db_path": "..." }

`neuralmind_benchmark`

{ "project_path": "string (required)" }

Returns:

{
  "project": "myapp",
  "wakeup_tokens": 341,
  "avg_query_tokens": 739,
  "avg_reduction_ratio": 65.6,
  "results": [...]
}

🪝 PostToolUse Compression

When neuralmind install-hooks has been run, Claude Code automatically applies these transforms to every tool output before the agent sees it.

Read → skeleton

Raw source files are replaced with the graph skeleton (functions + rationales + call graph + cross-file edges). This is ~88% smaller and contains the structural information agents need most.

To get the full source anyway:

NEURALMIND_BYPASS=1 <command>

Bash → filtered output

Long bash output is reduced to:

All error/ERROR/FAIL/traceback/warning lines
All summary lines (=====, passed, failed, Finished, Done in, etc.)
Last 3 lines verbatim
Header: [neuralmind: bash compressed, exit=N]

All errors and failures are always preserved. Routine pip/npm/build chatter is dropped.
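The filtering rules above can be approximated in a few lines. This is a sketch of the described behaviour, not NeuralMind's actual implementation: the regular expressions are guesses at what counts as an error or summary line, and `compress_bash` is a hypothetical name. The defaults mirror NEURALMIND_BASH_TAIL and NEURALMIND_BASH_MAX_CHARS.

```python
import re

# Guessed patterns for the line classes listed above.
ERROR_RE = re.compile(r"error|fail|traceback|warning", re.IGNORECASE)
SUMMARY_RE = re.compile(r"=====|passed|failed|Finished|Done in")


def compress_bash(output: str, exit_code: int, tail: int = 3, max_chars: int = 3000) -> str:
    """Keep error lines, summary lines, and the last `tail` lines."""
    if len(output) < max_chars:
        return output  # small outputs pass through untouched
    lines = output.splitlines()
    tail_start = max(len(lines) - tail, 0)
    kept = [
        line
        for i, line in enumerate(lines)
        if i >= tail_start or ERROR_RE.search(line) or SUMMARY_RE.search(line)
    ]
    header = f"[neuralmind: bash compressed, exit={exit_code}]"
    return "\n".join([header, *kept])
```

Lines are kept in their original order, so an error still appears before the summary that follows it.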

Grep → capped results

Search results are capped at 25 matches with a [N more hidden] note appended. Prevents context flooding from repository-wide searches.
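As a rough illustration of the cap (not the shipped code), the behaviour amounts to truncating the match list and appending a count of what was dropped. `cap_matches` is a hypothetical name; the default of 25 mirrors NEURALMIND_SEARCH_MAX.

```python
def cap_matches(matches: list[str], limit: int = 25) -> list[str]:
    """Return at most `limit` matches plus a note about the hidden rest."""
    if len(matches) <= limit:
        return matches
    hidden = len(matches) - limit
    return matches[:limit] + [f"[{hidden} more hidden]"]
```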

Tunable thresholds

Variable	Default	Description
`NEURALMIND_BYPASS`	unset	Set to `1` to disable all compression
`NEURALMIND_BASH_TAIL`	`3`	Lines to keep verbatim from end of bash output
`NEURALMIND_BASH_MAX_CHARS`	`3000`	Below this size, bash output is not compressed
`NEURALMIND_SEARCH_MAX`	`25`	Max grep/search matches before capping
`NEURALMIND_OFFLOAD_THRESHOLD`	`15000`	Chars above which content is written to a temp file
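If you wrap NeuralMind in your own tooling, the same variables can be read with their documented defaults. A minimal sketch: the `Thresholds` dataclass and `load_thresholds` function are hypothetical and only mirror the table above; they are not NeuralMind's internal configuration code.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Thresholds:
    bypass: bool
    bash_tail: int
    bash_max_chars: int
    search_max: int
    offload_threshold: int


def load_thresholds(env: Mapping[str, str] = os.environ) -> Thresholds:
    """Read the tunables from `env`, falling back to the table's defaults."""

    def as_int(name: str, default: int) -> int:
        return int(env.get(name, default))

    return Thresholds(
        bypass=env.get("NEURALMIND_BYPASS") == "1",
        bash_tail=as_int("NEURALMIND_BASH_TAIL", 3),
        bash_max_chars=as_int("NEURALMIND_BASH_MAX_CHARS", 3000),
        search_max=as_int("NEURALMIND_SEARCH_MAX", 25),
        offload_threshold=as_int("NEURALMIND_OFFLOAD_THRESHOLD", 15000),
    )
```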

🧠 Continual Learning

NeuralMind optionally learns from your query patterns to improve future relevance.

How it works

Collect — Each neuralmind query logs which modules appeared in the result to .neuralmind/memory/query_events.jsonl (opt-in, local only, zero overhead)
Learn — neuralmind learn . analyzes cooccurrence: which clusters appear together across queries
Improve — The next neuralmind query applies a +0.3 reranking boost to modules that co-occur with the current query's top matches
Repeat — The system gets smarter as you use it
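The collect, learn, improve loop above can be sketched as pair counting followed by a fixed score boost. This is an illustration under stated assumptions, not the real learner: the event format (a list of module names per query), the `min_count` cutoff, the use of a single top match as the anchor, and both function names are hypothetical. Only the +0.3 boost comes from the description above.

```python
from collections import Counter
from itertools import combinations

BOOST = 0.3  # reranking boost named in the steps above


def learn_cooccurrence(events: list[list[str]]) -> Counter:
    """Count how often each unordered pair of modules shares a query result."""
    pairs: Counter = Counter()
    for modules in events:
        for a, b in combinations(sorted(set(modules)), 2):
            pairs[(a, b)] += 1
    return pairs


def rerank(scores: dict[str, float], pairs: Counter,
           top_n: int = 1, min_count: int = 2) -> list[tuple[str, float]]:
    """Boost modules that co-occurred with the query's current top matches."""
    anchors = set(sorted(scores, key=scores.get, reverse=True)[:top_n])
    boosted = {}
    for module, score in scores.items():
        linked = any(
            pairs[tuple(sorted((module, anchor)))] >= min_count
            for anchor in anchors
            if anchor != module
        )
        boosted[module] = score + BOOST if linked else score
    return sorted(boosted.items(), key=lambda item: item[1], reverse=True)
```

With this shape, a module that scored below another on similarity alone can move ahead of it once it has repeatedly appeared alongside the top match.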

Opt-in / consent

On first TTY query:

NeuralMind can keep local query memory (project + global JSONL) to improve future retrieval.
Enable? [y/N]:

Consent saved to ~/.neuralmind/memory_consent.json. Disable at any time:

export NEURALMIND_MEMORY=0     # disable query logging
export NEURALMIND_LEARNING=0   # disable pattern application

File locations

~/.neuralmind/
├── memory_consent.json             # consent flag
└── memory/
    └── query_events.jsonl          # global event log

<project>/.neuralmind/
├── memory/
│   └── query_events.jsonl          # project-specific events
└── learned_patterns.json           # created by: neuralmind learn .

Privacy

100% local — nothing is sent to any server. Delete ~/.neuralmind/ and <project>/.neuralmind/ at any time to remove all learning data.

⏰ Keeping the Index Fresh

Automatic — Git post-commit hook (recommended)

neuralmind init-hook .

After every commit, the hook runs:

neuralmind build . 2>/dev/null && echo "[neuralmind] OK"

Manual

graphify update .
neuralmind build .

Scheduled — cron

0 6 * * * cd /path/to/project && graphify update . && neuralmind build .

CI/CD — GitHub Actions

- run: pip install neuralmind graphifyy
- run: graphify update . && neuralmind build .
- run: neuralmind wakeup . > AI_CONTEXT.md

🔌 Compatibility

Component	Works With	Notes
CLI	Any environment	Pure Python, no daemon required
MCP Server	Claude Code, Claude Desktop, Cursor, Cline, Continue, any MCP client	Bundled with `pip install neuralmind`
SKILL.md	OpenClaw (ClawHub), Agent Zero, Hermes-Agent, any SKILL.md host	Portable agent playbook at `skills/neuralmind/SKILL.md` — pairs with the MCP server
PostToolUse Hooks	Claude Code only	Uses Claude Code's `PostToolUse` hook system
Git hook	Any git workflow	Appends to existing `post-commit`, idempotent
Copy-paste	ChatGPT, Gemini, any LLM	`neuralmind wakeup . \| pbcopy`

Quick-start by tool

Claude Code — full two-phase optimization

pip install neuralmind graphifyy
cd your-project
graphify update .
neuralmind build .
neuralmind install-hooks .    # PostToolUse compression
neuralmind init-hook .        # auto-rebuild on commit (optional)

Then use MCP tools in sessions: neuralmind_wakeup, neuralmind_query, neuralmind_skeleton.

Cursor / Cline / Continue — MCP server

pip install neuralmind graphifyy
graphify update .
neuralmind build .

Add to your MCP config:

{ "mcpServers": { "neuralmind": { "command": "neuralmind-mcp" } } }

ChatGPT / Gemini / any LLM — CLI + copy-paste

neuralmind wakeup . | pbcopy      # macOS — paste into chat
neuralmind query . "question"     # get context for a specific question

✨ What's New in v0.5.4 — Graph view

neuralmind serve ships in v0.5.4 — see the Graph view section above. The next patch release (v0.5.5) lands graph-view Phase B: the replay-last-query overlay (#105), edge tooltips + min-weight synapse slider (#106), pin UX, and a Cmd/Ctrl-K quick-switch. Phase C after that: a live activity feed of synapse co-activations. Full plan in ROADMAP.md.

✨ What's New in v0.4.0 — Brain-like Synapse Layer

NeuralMind now runs as a second brain alongside the LLM: a persistent associative memory that learns continuously from how the agent and the codebase actually interact. See the release notes for the full story.

Feature	Details
Synapse store	SQLite-backed weighted graph; Hebbian reinforce, decay, long-term potentiation
Spreading activation	`mind.synaptic_neighbors(query)` — usage-based recall complementing vector search
`neuralmind watch` daemon	File edits become co-activation signals; brain learns even when no query runs
Three new Claude Code hooks	SessionStart (decay+export), UserPromptSubmit (recall injection), PreCompact (hub normalization)
Auto-memory export	Writes `SYNAPSE_MEMORY.md` to Claude Code's auto-memory dir so associations surface natively
Four new MCP tools	`synaptic_neighbors`, `synapse_stats`, `synapse_decay`, `export_synapse_memory`
3× fewer embedder calls per query	Selector caches one search per query and slices for L2/L3/synapses

Earlier (v0.3.x)

Feature	Version	Details
Memory Collection	v0.3.0	Local JSONL storage for query events
Opt-in Consent	v0.3.0	One-time TTY prompt, env var overrides
EmbeddingBackend abstraction	v0.3.1	Pluggable vector backend (Pinecone/Weaviate ready)
Pattern Learning	v0.3.2	`neuralmind learn .` analyzes cooccurrence
Smart Reranking	v0.3.2	L3 results boosted by learned patterns
Accurate Build Stats	v0.3.3	Correctly distinguishes added vs updated nodes
Documentation polish	v0.3.4	CLI flags sync, Setup Guide, agent guidance in README

📊 Benchmarks

NeuralMind benchmarks itself on every pull request. A hermetic fixture (tests/fixtures/sample_project/) plus a committed query set (tests/fixtures/benchmark_queries.json) runs through the full retrieval pipeline, and CI fails if aggregate reduction drops below a conservative floor. The floor is currently 4× because the fixture is intentionally tiny; real repos consistently hit 40–70×, as shown below.

What CI measures on every PR

Phase 1 — Reduction. Naive baseline (every .py file in the fixture concatenated) vs NeuralMind.query() output, per query. All tokens counted with tiktoken.
Phase 2 — Learning uplift. Same queries run cold, then after seeding memory and running neuralmind learn. Reports the delta in reduction ratio and top-k retrieval hit rate. On a 500-line fixture the numerical uplift is modest by design — the test proves the mechanism persists, not that it's magic.
Per-model breakdown. GPT-4o and GPT-4/3.5 counts are measured via real tiktoken encodings. Claude uses the Anthropic SDK tokenizer when available; otherwise it uses a clearly labeled estimate derived from published vocab ratios. Llama is always estimated. No fabricated numbers anywhere.
Memory persistence. tests/test_memory_persistence.py asserts events are logged, neuralmind learn produces a patterns file, and subsequent queries load it without error.
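The Phase 1 number is a plain ratio of token counts. The sketch below reproduces two of the ratios from the 30-second proof output at the top of this README. It assumes the aggregate compared against the CI floor is an arithmetic mean, which is not stated here, and both function names are hypothetical.

```python
def reduction_ratio(naive_tokens: int, neuralmind_tokens: int) -> float:
    """Naive baseline tokens divided by NeuralMind output tokens."""
    if neuralmind_tokens <= 0:
        raise ValueError("neuralmind_tokens must be positive")
    return naive_tokens / neuralmind_tokens


def passes_floor(ratios: list[float], floor: float = 4.0) -> bool:
    """Assumed aggregate: the mean ratio must meet the floor."""
    return sum(ratios) / len(ratios) >= floor


# Token counts taken from the demo output earlier in this README.
ratios = [reduction_ratio(4736, 829), reduction_ratio(4736, 923)]
```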

Community benchmarks

Real-world numbers submitted by users. Your code never leaves your machine — you submit a PR (or an issue, which a maintainer converts to a PR) with only the numbers. CI validates every entry against the schema and re-renders this table automatically.

Project	Lang	Nodes	Wakeup	Avg Query	Reduction	Model	Submitted
cmmc20	JavaScript	241	341	739	65.6×	Claude 3.5 Sonnet	@dfrostar · 2025-10-01
mempalace	Python	1,626	412	891	46.0×	Claude 3.5 Sonnet	@dfrostar · 2025-10-01

2 submission(s). See the JSON data for notes and verification commands.

Submit yours:

Easy path: open a benchmark submission issue — fill out a form, a maintainer converts it to a PR.
PR directly: add an entry to docs/community-benchmarks.json and run python scripts/render_community_table.py --inject README.md to regenerate the table. Schema: community-benchmarks.schema.json.

All entries include the exact neuralmind command that produced them, so reviewers (and any reader) can audit the numbers.

Reproduce locally (on our fixture)

pip install . tiktoken matplotlib graphifyy
graphify update tests/fixtures/sample_project
neuralmind build tests/fixtures/sample_project --force
python -m tests.benchmark.run          # phase 1 + phase 2
python -m tests.benchmark.multi_model  # per-model breakdown
python scripts/generate_chart.py       # refreshes the PNG above

Full machine-readable results land in tests/benchmark/results.json, human-readable report in tests/benchmark/report.md.

Reproduce on your code

Don't just trust numbers from our fixture — run it on your repo:

pip install neuralmind graphifyy
graphify update . && neuralmind build .
neuralmind benchmark . --contribute

Output shows your reduction ratio, tokens per query, and estimated monthly savings at Claude 3.5 Sonnet pricing. Full walkthrough: Does NeuralMind work on your codebase?

Retrieval quality baseline

Heuristic-only baseline (community-reported): 70–80% top-5 retrieval accuracy
NeuralMind target on the same query set: exceed that baseline via semantic retrieval + learned cooccurrence reranking

The pytest regression gate (tests/test_benchmark_regression.py) currently enforces ≥50% top-k hit rate on the fixture plus ≥4× reduction. The reduction floor is low because the fixture is tiny; real repos measure roughly 10× higher (40–70×).

❓ FAQ

How much does NeuralMind reduce Claude / GPT token costs?

Measured on real repos: 40–70× reduction per query (see Benchmarks). For a team running 100 queries/day on Claude Sonnet, that is roughly $450/month → $7/month. Exact savings depend on codebase size and model pricing.
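The arithmetic behind those two figures is easy to rerun with your own numbers. This sketch assumes an input price of $3 per million tokens, a 30-day month, about 50,000 tokens per naive query, and about 800 tokens per NeuralMind query. The price and month length are assumptions introduced here, so substitute your model's current rate.

```python
def monthly_cost(
    queries_per_day: int,
    tokens_per_query: int,
    usd_per_million_tokens: float = 3.0,  # assumed input price
    days: int = 30,                       # assumed month length
) -> float:
    """Input-token spend per month in US dollars."""
    total_tokens = queries_per_day * days * tokens_per_query
    return total_tokens * usd_per_million_tokens / 1_000_000


naive = monthly_cost(100, 50_000)   # raw source loaded on every query
compressed = monthly_cost(100, 800)  # NeuralMind context on every query
```

Under these assumptions `naive` comes to 450 dollars and `compressed` to about 7 dollars, matching the figures above.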

Does NeuralMind work outside Claude Code?

Yes. The CLI works anywhere Python runs; the MCP server works with Cursor, Cline, Continue, Claude Desktop, and any MCP-compatible agent. For non-MCP tools like ChatGPT or Gemini, neuralmind wakeup . | pbcopy pipes context into a regular chat window. Only the PostToolUse compression hooks are Claude-Code-specific.

Does my code leave my machine?

No. NeuralMind is fully offline — no API calls, no cloud services. Embeddings run locally via ChromaDB, and the knowledge graph is stored in graphify-out/ in your project. Query memory (optional, opt-in) is written to .neuralmind/ on disk.

Is this RAG? How is it different from LangChain or LlamaIndex?

It is a form of RAG, but specialized for code. Instead of chunking text, NeuralMind retrieves over a knowledge graph of code entities (functions, classes, clusters) with a fixed 4-layer structure. That keeps the call graph intact and produces a token-budgeted output instead of a flat list of chunks. See vs. LangChain/LlamaIndex.

I have a 1M context window now — do I still need this?

Long context makes it possible to stuff a whole repo in; it does not make it cheap. You still pay per input token, so a 50K-token repo at Claude Sonnet rates costs ~$0.15 every turn. NeuralMind drops that to ~$0.002. See vs. long context.

Does it support my language?

Any language graphify supports (Python, JavaScript/TypeScript, and others via tree-sitter). NeuralMind consumes graphify-out/graph.json — if graphify can index it, NeuralMind can query it.

What is the difference between `wakeup`, `query`, and `skeleton`?

wakeup — ~400 tokens of project orientation (L0 + L1). Run it at session start.
query — ~800–1,100 tokens for a specific natural-language question (L0–L3).
skeleton — compact view of a single file (functions + call graph + cross-file edges). Use before Read.

How does the PostToolUse compression work?

When neuralmind install-hooks . has been run, Claude Code invokes NeuralMind after every Read/Bash/Grep tool call but before the agent sees the output. Read becomes a skeleton (~88% smaller), Bash keeps errors + last 3 lines (~91% smaller), Grep caps at 25 matches. Set NEURALMIND_BYPASS=1 on any command to opt out.

Can I use NeuralMind without a knowledge graph?

No — the knowledge graph (graphify-out/graph.json) is the source of truth. Run graphify update . first, then neuralmind build ..

Does it auto-update when I change code?

Only if you install the git post-commit hook with neuralmind init-hook .. Otherwise run neuralmind build . manually; it is incremental and only re-embeds changed nodes.

What do I do if retrieval quality is poor on my repo?

Check that neuralmind stats . reports all your nodes indexed.
Run neuralmind benchmark . to see reduction ratios.
Enable query memory (it prompts on first TTY run) and periodically run neuralmind learn . — cooccurrence-based reranking improves relevance on your actual queries.
Open an issue with the query and expected result — retrieval quality is the thing we most want to improve.

📚 Documentation

Resource	Contents
Business Case	Fact-based ROI argument with provable claims, math you can plug your numbers into, and three concrete scenarios
Honest Assessment	Skeptic's companion — when NeuralMind isn't worth installing, what the headline numbers don't measure
Enterprise Use Cases	Regulated industries, on-premise, multi-team — what to know before pitching internally
Setup Guide	First-time setup for Claude Code, Claude Desktop, Cursor, any LLM
CLI Reference	All commands and options
Scheduling Guide	Automate audits with Windows Task Scheduler, GitHub Actions, or cron
Version Strategy	Versioning policy, breaking changes, support timeline, upgrade path
Compatibility Matrix	Version compatibility, Python/platform support, known issues, migration guides
Learning Guide	Continual learning details
API Reference	Python API (`NeuralMind`, `ContextResult`, `TokenBudget`)
Architecture	4-layer progressive disclosure design
Integration Guide	MCP, CI/CD, VS Code, JetBrains
Troubleshooting	Common issues and fixes
Roadmap	What's shipping next, where we want help, what's out of scope
Future-Proofing Plan	8-initiative engineering plan for sustainability and scale
Brain-like Learning	Design rationale for the learning system
Use Cases	Step-by-step walkthroughs: Claude Code, cost optimization, any-LLM, offline/regulated, growing monorepo, multi-agent (new in v0.6.0)
Release Notes v0.9.0	Enterprise-Ready — GHCR auto-build, CycloneDX SBOM, air-gapped install walkthrough, compliance one-pager
Release Notes v0.8.0	Always-On — systemd + launchd templates, Windows Task Scheduler walkthrough, `/healthz` endpoint
Release Notes v0.7.0	Install anywhere — `pip` / `pipx` / `uv` / Docker / source, Dockerfile, event-log rotation fix
Release Notes v0.6.0	Live activity feed, cross-process JSONL bridge, pin UX, depth slider, replay overlay
Comparisons	NeuralMind vs. Cursor, Copilot, Cody, Aider, Claude Projects, LangChain, long context, prompt caching, RAG, tree-sitter
USAGE.md	Extended usage examples

🤝 Contributing

See CONTRIBUTING.md for guidelines and ROADMAP.md for what we're working on next and where help is most welcome.

📄 License

MIT License — see LICENSE for details.

⭐ Star this repo if NeuralMind saves you money!
