Wire the live synapse graph into retrieval (budget-neutral) by dfrostar · Pull Request #142 · dfrostar/neuralmind

dfrostar · 2026-05-20T15:26:58Z

Summary

The Hebbian synapse store was written to on every query (_reinforce_from_query) but never read during retrieval — selection ranked on vector similarity plus a separate static learned_patterns.json, and synaptic recall only reached the UserPromptSubmit hook (never the MCP/programmatic query() path). The advertised "brain that learns your codebase" had no effect on what the agent actually received.

This branch closes that gap, then proves it helps:

b6aeb06 — L3 reranking. Seed spreading activation from the top L3 hits and boost any other result the synapse graph co-activates, so learned association — not just vector similarity — shapes ranking. Reaches every retrieval consumer, not just the hook.
6d4a105 — L2/L3 selection, budget-neutral. Surfaces context vector search missed without spending extra tokens:
- L3: swap the weakest vector hits for the strongest absent co-activated neighbors (displacement, result count fixed) — not append.
- L2: a co-activated community can win a slot by outscoring a vector one, but cannot grow how many communities load past what vector search alone surfaced.
- Adds GraphEmbedder.get_nodes_by_ids.
20d4976 — quality proof. A synapse-recall A/B phase in the self-benchmark (below).

All behind the existing NEURALMIND_SYNAPSE_INJECT kill switch and a no-op on a cold graph, so cold-start output is byte-identical to a build without a synapse store.
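The L3 displacement rule above can be sketched as follows. This is a minimal illustration, not the code in `neuralmind/context_selector.py`: the function name `displace_weakest`, the `max_swap` parameter, and the shape of the result dicts are assumptions made for the example.

```python
from typing import Dict, List


def displace_weakest(
    results: List[dict],
    recalled: Dict[str, dict],
    energy: Dict[str, float],
    max_swap: int = 2,  # hypothetical cap on how many hits may be swapped
) -> List[dict]:
    """Swap the weakest vector hits for the strongest absent recalled nodes."""
    present = {r["id"] for r in results}
    # Strongest co-activated nodes that vector search did not return.
    candidates = sorted(
        (nid for nid in recalled if nid not in present),
        key=lambda nid: energy.get(nid, 0.0),
        reverse=True,
    )
    # Keep at least one vector hit, so never swap out the whole list.
    num_swap = min(max_swap, len(candidates), max(len(results) - 1, 0))
    if num_swap <= 0:
        return results  # cold graph or nothing to recall: no-op
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    kept = ranked[: len(ranked) - num_swap]
    pulled = [recalled[nid] for nid in candidates[:num_swap]]
    return kept + pulled  # same length as the input: displacement, not append


hits = [
    {"id": "auth/jwt.py", "score": 0.9},
    {"id": "auth/login.py", "score": 0.5},
    {"id": "README.md", "score": 0.2},
]
neighbors = {"users/crud.py": {"id": "users/crud.py", "score": 0.0}}
swapped = displace_weakest(hits, neighbors, {"users/crud.py": 0.8})
print([r["id"] for r in swapped])
```

Because the output length always equals the input length, the token budget is bounded by the same result count as plain vector search.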

Does it actually help? (Phase 3 — synapse A/B)

Same warm graph, same query set, only NEURALMIND_SYNAPSE_INJECT differs:

metric	recall off	recall on	Δ
top-k hit rate	72%	83%	+12 pts
reduction ratio	6.1×	6.1×	~0 (budget-neutral)

Associative recall surfaces co-edited modules (e.g. users/crud.py on an auth query) that a purely textual search ranks lower — and does it at no token cost, because recalled nodes displace the weakest hits rather than adding to them.

Why budget-neutral

An earlier additive draft (append recalled nodes) improved recall but dropped reduction from 6.0× to 4.8× — against the headline metric the product is sold on. Displacement keeps the budget fixed: same tokens, better picks.

build	reduction	avg context
baseline (pre-change)	6.0×	783 tok
additive draft (rejected)	4.8×	979 tok
this branch (cold)	6.0×	783 tok
this branch (warm)	5.9×	804 tok
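The reduction figures in the table can be reproduced with simple arithmetic. The sketch below assumes the naive baseline is the 4,736 tokens per query shown in the self-benchmark table; the PR description does not state which baseline it used, so treat that as an assumption.

```python
# Assumed per-query naive baseline, taken from the self-benchmark table.
NAIVE_TOKENS_PER_QUERY = 4736

# Average context sizes (tokens) from the table above.
avg_context = {
    "baseline (pre-change)": 783,
    "additive draft (rejected)": 979,
    "this branch (cold)": 783,
    "this branch (warm)": 804,
}

# Reduction ratio = naive tokens / tokens actually sent.
reduction = {
    build: round(NAIVE_TOKENS_PER_QUERY / tokens, 1)
    for build, tokens in avg_context.items()
}

for build, ratio in reduction.items():
    print(f"{build}: {ratio}x")

# Extra tokens the additive draft spent per query relative to the baseline.
extra_tokens = avg_context["additive draft (rejected)"] - avg_context["baseline (pre-change)"]
print(f"additive draft overhead: {extra_tokens} tokens per query")
```

Under that assumption the ratios come out to 6.0×, 4.8×, 6.0× and 5.9×, matching the table, and the additive draft costs 196 extra tokens per query.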

Test plan

tests/test_context_selector.py — 13 synapse tests: L3 reorder/displacement, L2 community displacement + count cap, safety no-ops (no recall / cold graph / kill switch / recall raising).
tests/test_benchmark_regression.py — 5 gates, incl. recall must never lower hit rate (catches displacement dropping a relevant hit) and reduction stays budget-neutral.
Full suite green (only the firewall-blocked ONNX S3 integration test skips).
bash scripts/demo.sh holds at ~6× cold and warm.
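The two synapse gates described in the test plan could look roughly like this. It is a sketch under stated assumptions: the function name `check_synapse_gates` and the `tolerance` value are hypothetical, and the real thresholds in `tests/test_benchmark_regression.py` are not shown in this PR page.

```python
from typing import List


def check_synapse_gates(
    hit_off: float,
    hit_on: float,
    reduction_off: float,
    reduction_on: float,
    tolerance: float = 0.25,  # hypothetical allowed drop in reduction ratio
) -> List[str]:
    """Return a list of gate failures; an empty list means both gates pass."""
    failures = []
    # Gate 1: recall must never lower hit rate (displacement dropped a relevant hit).
    if hit_on < hit_off:
        failures.append(f"hit rate fell from {hit_off} to {hit_on}")
    # Gate 2: reduction must stay budget-neutral within the tolerance.
    if reduction_off - reduction_on > tolerance:
        failures.append(f"reduction fell from {reduction_off} to {reduction_on}")
    return failures


# Figures from the Phase 3 benchmark comment on this PR.
print(check_synapse_gates(hit_off=71.7, hit_on=83.3, reduction_off=5.9, reduction_on=5.8))
```

With the Phase 3 numbers both gates pass; the rejected additive draft (6.0× down to 4.8×) would trip the second gate under this tolerance.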

Note: this wires the synapse store in alongside the existing learned_patterns.json reranker. Collapsing that now-partly-redundant dual source is a deliberate, riskier follow-up — not in this PR.

https://claude.ai/code/session_01DRbKLVDX9PNyNdXwuNqTDp

L3 search ranked results via vector similarity plus a separate static learned_patterns.json reranker; the Hebbian synapse store was written to on every query but never consulted during retrieval, and synaptic recall only reached the UserPromptSubmit hook (never the MCP/programmatic query path). Seed spreading activation from the top L3 hits and boost any other result the synapse graph co-activates, so learned association shapes ranking for every retrieval consumer. Behind the existing NEURALMIND_SYNAPSE_INJECT switch and a no-op on a cold graph, so cold-start output is byte-identical. https://claude.ai/code/session_01DRbKLVDX9PNyNdXwuNqTDp
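For readers unfamiliar with spreading activation, here is a generic sketch of the idea. The PR page does not show the synapse store's actual algorithm, so the `decay` and `hops` parameters, the edge format, and the function name are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable


def spread_activation(
    edges: Dict[str, Dict[str, float]],
    seeds: Iterable[str],
    decay: float = 0.5,  # hypothetical per-hop attenuation
    hops: int = 2,       # hypothetical propagation depth
) -> Dict[str, float]:
    """Propagate energy from seed nodes along weighted edges."""
    seeds = list(seeds)
    energy: Dict[str, float] = defaultdict(float)
    frontier = {s: 1.0 for s in seeds}  # each seed starts with unit energy
    for _ in range(hops):
        next_frontier: Dict[str, float] = defaultdict(float)
        for node, e in frontier.items():
            for neighbor, weight in edges.get(node, {}).items():
                next_frontier[neighbor] += e * weight * decay
        for node, e in next_frontier.items():
            energy[node] += e
        frontier = next_frontier
    # Seeds are the query's own top hits; only their neighbors are of interest.
    for s in seeds:
        energy.pop(s, None)
    return dict(energy)


graph = {
    "auth.py": {"users/crud.py": 0.8},
    "users/crud.py": {"db.py": 0.5},
}
print(spread_activation(graph, ["auth.py"]))
```

On an empty (cold) graph the function returns an empty dict, which is the condition under which the reranker described above would have nothing to boost.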

PR1 wired the synapse graph into L3 reranking. This extends it so learned co-activation also surfaces context the agent missed, without spending extra tokens:
- L3: swap the weakest vector hits for the strongest absent neighbors (displacement, result count fixed) instead of appending them.
- L2: a co-activated community can win a slot by outscoring a vector one, but cannot grow how many communities load past what vector search alone surfaced.

Adds GraphEmbedder.get_nodes_by_ids to fetch recalled neighbors. All behind NEURALMIND_SYNAPSE_INJECT and a no-op on a cold graph. Fixture demo holds at ~6x reduction warm and cold (was 4.8x with an additive draft). https://claude.ai/code/session_01DRbKLVDX9PNyNdXwuNqTDp
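The L2 rule (a recalled community may win a slot but may not add one) can be sketched like this. How the real selector combines vector and synapse scores is not shown here, so taking the maximum of the two is an assumption, as are the function and parameter names.

```python
from typing import Dict, List


def select_communities(
    vector_scores: Dict[str, float],
    synapse_scores: Dict[str, float],
) -> List[str]:
    """Rank vector and recalled communities together, capped at the vector count."""
    # The slot count is fixed by what vector search alone surfaced.
    slots = len(vector_scores)
    combined = dict(vector_scores)
    for community_id, score in synapse_scores.items():
        # Assumed combination rule: keep the stronger of the two signals.
        combined[community_id] = max(combined.get(community_id, 0.0), score)
    ranked = sorted(combined, key=combined.get, reverse=True)
    return ranked[:slots]


vector = {"community_1": 0.9, "community_2": 0.3}
recalled = {"community_7": 0.6}
print(select_communities(vector, recalled))
```

Here `community_7` outscores `community_2` and takes its slot, yet the number of loaded communities stays at two. With no recalled communities the result is the plain vector ranking.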

github-actions · 2026-05-20T15:28:29Z

NeuralMind self-benchmark

Status: PASS — floor 4×, measured 5.9×.

Phase 1 — Reduction on committed fixture

Average reduction: 5.9×
Top-k retrieval hit rate: 71.7%
Naive baseline: 47,360 tokens (all fixture files concatenated)
NeuralMind total: 8,185 tokens across 10 queries
Estimated monthly savings @ 100 queries/day on Claude 3.5 Sonnet: ~$35.26

#	Query	Shape	Naive	NeuralMind	Ratio	Hit
1	`auth-flow`	cross-file	4,736	815	5.8×	33.3%
2	`api-endpoints`	focused	4,736	809	5.9×	100.0%
3	`billing-flow`	cross-file	4,736	846	5.6×	33.3%
4	`user-storage`	cross-file	4,736	672	7.0×	50.0%
5	`jwt-verify`	focused	4,736	681	7.0×	100.0%
6	`stripe-webhook`	focused	4,736	838	5.7×	100.0%
7	`create-user`	cross-file	4,736	822	5.8×	50.0%
8	`refund`	focused	4,736	827	5.7×	100.0%
9	`db-choice`	identity	4,736	899	5.3×	100.0%
10	`invoice-send`	cross-file	4,736	976	4.9×	50.0%
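The Phase 1 summary lines can be recomputed from the per-query rows above. The sketch below only uses numbers from the table; the observation that the reported 5.9× matches the mean of per-query ratios rather than the ratio of totals is an inference from the arithmetic, not something the benchmark states.

```python
# (query, naive tokens, NeuralMind tokens, hit rate %) copied from the table above.
rows = [
    ("auth-flow", 4736, 815, 33.3),
    ("api-endpoints", 4736, 809, 100.0),
    ("billing-flow", 4736, 846, 33.3),
    ("user-storage", 4736, 672, 50.0),
    ("jwt-verify", 4736, 681, 100.0),
    ("stripe-webhook", 4736, 838, 100.0),
    ("create-user", 4736, 822, 50.0),
    ("refund", 4736, 827, 100.0),
    ("db-choice", 4736, 899, 100.0),
    ("invoice-send", 4736, 976, 50.0),
]

total_naive = sum(naive for _, naive, _, _ in rows)
total_neuralmind = sum(nm for _, _, nm, _ in rows)

# Mean of the per-query ratios (appears to be what the summary reports).
mean_ratio = sum(naive / nm for _, naive, nm, _ in rows) / len(rows)
# Ratio of the totals, shown for comparison.
ratio_of_totals = total_naive / total_neuralmind
mean_hit_rate = sum(hit for _, _, _, hit in rows) / len(rows)

print(f"naive total: {total_naive}, NeuralMind total: {total_neuralmind}")
print(f"mean per-query ratio: {mean_ratio:.1f}x, ratio of totals: {ratio_of_totals:.1f}x")
print(f"mean hit rate: {mean_hit_rate:.1f}%")
```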

Phase 2 — Learning uplift

Memory events logged: 20
Learned patterns: 20
Reduction ratio after neuralmind learn: 5.8× (Δ -0.07× vs. cold)
Top-k hit rate after learning: 75.0% (Δ +3.3 points vs. cold)

Note: uplift numbers on a 500-line fixture are intentionally modest — the point is to
verify the learning mechanism persists and applies. On real production repos the lift
is larger; this test only catches regressions in persistence.

Phase 3 — Synapse recall A/B (same warm graph, recall off vs on)

Synapse edges after seeding co-editing sessions: 2793
Top-k hit rate: 71.7% off → 83.3% on (Δ +11.7 points)
Reduction ratio: 5.9× off → 5.8× on (Δ -0.06× — budget-neutral by design)

This isolates the Hebbian synapse layer from the learned_patterns reranker in
Phase 2. The hit-rate delta shows associative recall surfacing co-edited modules a
purely textual search ranks lower; the near-zero reduction delta confirms it does so
without spending extra tokens (recalled nodes displace the weakest hits, not add to them).

Assumptions

Baseline: every .py file in tests/fixtures/sample_project/ concatenated.
Tokenizer: tiktoken GPT-4o encoding (per-model breakdown in multi_model.json if generated).
Pricing: Claude 3.5 Sonnet input @ $3.0/MTok.
Regression floor: 4× — well below NeuralMind's typical 40–70× on real repos.
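The monthly savings estimate from Phase 1 follows from these assumptions. The sketch below additionally assumes a 30-day month, which the benchmark output does not state; with that assumption the arithmetic lands on the reported figure.

```python
# Totals across the 10 benchmark queries (from Phase 1).
NAIVE_TOTAL_TOKENS = 47_360
NEURALMIND_TOTAL_TOKENS = 8_185
NUM_QUERIES = 10

QUERIES_PER_DAY = 100
DAYS_PER_MONTH = 30            # assumption: not stated in the benchmark output
PRICE_PER_MILLION_TOKENS = 3.0  # Claude 3.5 Sonnet input price listed above

# Average input tokens saved on each query.
saved_per_query = (NAIVE_TOTAL_TOKENS - NEURALMIND_TOTAL_TOKENS) / NUM_QUERIES
saved_per_month = saved_per_query * QUERIES_PER_DAY * DAYS_PER_MONTH
monthly_savings = saved_per_month / 1_000_000 * PRICE_PER_MILLION_TOKENS

print(f"tokens saved per query: {saved_per_query}")
print(f"estimated monthly savings: ${monthly_savings:.2f}")
```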

Per-model token reduction

Model	Tokenizer	Naive	NeuralMind	Ratio	Source
GPT-4o / GPT-4o-mini	`tiktoken o200k_base`	4,739	927	5.1×	measured
GPT-4 / GPT-3.5-turbo	`tiktoken cl100k_base`	4,710	918	5.1×	measured
Claude 3.5 Sonnet	`estimated: GPT-4o × 1.08; install anthropic for an exact count`	5,118	1,001	5.1×	estimated
Llama 3 (70B)	`estimated: GPT-4o × 1.22 — Llama tokenizer requires model weights; estimate based on published vocab ratios`	5,781	1,130	5.1×	estimated

Rows marked measured use the provider's real tokenizer. Rows marked
estimated apply a published vocab-size correction to the GPT-4o count —
honest approximations, not hardcoded claims.
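The estimated rows can be reproduced from the measured GPT-4o counts. In this sketch the helper name `estimate_tokens` is hypothetical, and the use of truncation rather than rounding is inferred from the table (4,739 × 1.22 is about 5,781.6, and the table shows 5,781).

```python
# Measured GPT-4o (o200k_base) counts from the table above.
GPT4O_NAIVE = 4739
GPT4O_NEURALMIND = 927

# Correction factors quoted in the Tokenizer column.
FACTORS = {
    "Claude 3.5 Sonnet": 1.08,
    "Llama 3 (70B)": 1.22,
}


def estimate_tokens(gpt4o_tokens: int, factor: float) -> int:
    """Scale a GPT-4o token count by a correction factor, truncating to an int."""
    return int(gpt4o_tokens * factor)


estimates = {
    model: (estimate_tokens(GPT4O_NAIVE, f), estimate_tokens(GPT4O_NEURALMIND, f))
    for model, f in FACTORS.items()
}

for model, (naive, neuralmind) in estimates.items():
    print(f"{model}: naive={naive}, neuralmind={neuralmind}, ratio={naive / neuralmind:.1f}x")
```

Since both columns are scaled by the same factor, the ratio stays at about 5.1× for every estimated row; only the absolute token counts change.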

Automated by .github/workflows/ci-benchmark.yml — regenerate locally with python -m tests.benchmark.run.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d4a1056e0


chatgpt-codex-connector · 2026-05-20T15:30:48Z

+        if num_swap <= 0:
+            return results
+        energy_by_id = dict(candidates[:num_swap])
+        fetched = self.embedder.get_nodes_by_ids(list(energy_by_id))


Guard synapse pull-in for backends without id lookup

This call assumes every embedding backend implements get_nodes_by_ids, but only GraphEmbedder gained that method in this change. NeuralMind can run with InMemoryEmbeddingBackend (via backend switching/config), and once the synapse graph is warm enough to recall an ID not already in results, this path raises AttributeError and breaks query() instead of degrading gracefully. Add a backend capability check/fallback here (or extend the backend interface and all implementations) before attempting pull-in.


Fixed in eb22466. Pull-in now guards get_nodes_by_ids with a callable(getattr(...)) check and degrades to boost-only when the embedder doesn't implement it, instead of raising. Added a regression test (test_pull_in_degrades_without_id_lookup).

Generated by Claude Code
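The capability check described in the fix can be sketched as below. The two backend classes are simplified stand-ins written for this example (they are not the project's `GraphEmbedder` or `InMemoryEmbeddingBackend`), and `fetch_recalled` is a hypothetical helper name; only the `getattr` plus `callable` pattern comes from the comment above.

```python
from typing import Dict, List, Optional


class BackendWithoutLookup:
    """Stand-in for a backend that has no id lookup."""


class BackendWithLookup:
    """Stand-in for a backend that can fetch nodes by id."""

    def __init__(self, nodes: Dict[str, dict]) -> None:
        self.nodes = nodes

    def get_nodes_by_ids(self, ids: List[str]) -> List[dict]:
        return [self.nodes[i] for i in ids if i in self.nodes]


def fetch_recalled(embedder: object, ids: List[str]) -> Optional[List[dict]]:
    """Return recalled nodes, or None when the backend cannot look them up."""
    get_nodes_by_ids = getattr(embedder, "get_nodes_by_ids", None)
    if not callable(get_nodes_by_ids):
        return None  # caller degrades to boost-only instead of raising
    return get_nodes_by_ids(ids)


store = BackendWithLookup({"users/crud.py": {"id": "users/crud.py"}})
print(fetch_recalled(store, ["users/crud.py"]))
print(fetch_recalled(BackendWithoutLookup(), ["users/crud.py"]))
```

Returning a sentinel lets the caller skip pull-in while still applying the boost, so a query on a backend without id lookup keeps working.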

Copilot

Pull request overview

This PR wires the live Hebbian synapse graph into retrieval so learned co-activation can influence both L3 reranking and (budget-neutral) L2/L3 selection, rather than only being written during reinforcement.

Changes:

Add synapse-driven boosting + displacement to L3 search results (with kill switch / cold-graph no-ops).
Allow L2 community selection to be influenced by synapse-recalled community_<id> pseudo-nodes without increasing the community budget.
Add GraphEmbedder.get_nodes_by_ids() so L3 can pull in recalled neighbors not returned by vector search.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
`neuralmind/context_selector.py`	Implements synapse recall integration for L2 community scoring and L3 reranking/displacement; adds output labels for synapse boosts and recalled nodes.
`neuralmind/core.py`	Wires `ContextSelector.synapse_recall` to a synapse-store spreading activation method during build.
`neuralmind/embedder.py`	Adds `get_nodes_by_ids()` to fetch recalled nodes by id from the vector store for L3 displacement.
`tests/test_context_selector.py`	Adds targeted tests covering L3 boost/reorder/displacement and L2 community displacement/budget caps, including cold-graph and kill-switch behavior.


dfrostar · 2026-05-20T15:48:01Z

+        for r in results:
+            nid = r.get("id")
+            if nid in seed_set or nid not in energy:
+                continue
+            boost = self.SYNAPSE_BOOST_WEIGHT * energy[nid]
+            r["score"] = r.get("score", 0.0) + boost
+            r["_synapse_boost"] = boost
+            boosted = True


Fixed in eb22466. _apply_synapse_boost now operates on shallow copies of the result dicts, so it never mutates the objects _fetch_search caches — the boost is idempotent and the cached vector scores stay clean. Added a regression test (test_boost_does_not_mutate_cached_results) asserting a repeated call is identical and the cached dict keeps its original score.

Generated by Claude Code
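A minimal sketch of the shallow-copy fix, modeled on the diff hunk above. The weight value, the standalone function form, and the final re-sort are assumptions for illustration; the real method lives on the selector class and may differ.

```python
from typing import Dict, List, Set

SYNAPSE_BOOST_WEIGHT = 0.5  # hypothetical value; the real constant is not shown


def apply_synapse_boost(
    results: List[dict], energy: Dict[str, float], seed_set: Set[str]
) -> List[dict]:
    """Boost co-activated results on copies so cached dicts are never mutated."""
    boosted = []
    for original in results:
        r = dict(original)  # shallow copy: the cached dict keeps its score
        nid = r.get("id")
        if nid not in seed_set and nid in energy:
            boost = SYNAPSE_BOOST_WEIGHT * energy[nid]
            r["score"] = r.get("score", 0.0) + boost
            r["_synapse_boost"] = boost
        boosted.append(r)
    return sorted(boosted, key=lambda r: r["score"], reverse=True)


cached = [
    {"id": "a", "score": 0.9},
    {"id": "b", "score": 0.4},
    {"id": "c", "score": 0.5},
]
first = apply_synapse_boost(cached, {"b": 0.4}, {"a"})
second = apply_synapse_boost(cached, {"b": 0.4}, {"a"})
print([r["id"] for r in first], first == second, cached[1]["score"])
```

Calling the function twice on the same cached list gives identical output, and the cached entry for `b` still holds its original vector score.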

Phase 2 measures the learned_patterns reranker; nothing isolated the Hebbian synapse layer's effect on retrieval quality. Add Phase 3: reinforce realistic co-editing sessions, then measure the same query set with synapse recall off vs on (same warm graph, only NEURALMIND_SYNAPSE_INJECT differs). On the fixture, recall lifts top-k hit rate 72% -> 83% (+12 points) while the reduction ratio holds at 6.1x -> 6.1x; associative recall surfaces co-edited modules a textual search ranks lower, at no token cost. Two regression gates: recall must never lower hit rate (catches budget-neutral displacement dropping a relevant hit) and reduction must stay budget-neutral. https://claude.ai/code/session_01DRbKLVDX9PNyNdXwuNqTDp

Address two review findings on the synapse retrieval path:
- The boost incremented score on result dicts that _fetch_search caches and reuses, so repeated calls compounded the boost and corrupted cached vector scores. Operate on shallow copies, so the boost is now idempotent and leaves the cache clean.
- Pull-in called get_nodes_by_ids unconditionally, which only GraphEmbedder implements; an embedder without it would raise AttributeError mid-query. Guard with a capability check and degrade to boost-only.

https://claude.ai/code/session_01DRbKLVDX9PNyNdXwuNqTDp

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

dfrostar · 2026-05-20T15:49:28Z

+                if nid not in present
+                and not nid.startswith("community_")
+                and e >= self.SYNAPSE_PULL_IN_MIN_ENERGY
+            ),
+            key=lambda x: x[1],
+            reverse=True,
+        )[: self.SYNAPSE_PULL_IN_MAX]
+        if not candidates:
+            return results
+
+        # Keep at least one vector hit; only displace as many as we can fetch.


Already addressed in eb22466 (this review was generated against the earlier commit). The pull-in now does exactly your option (2): get_nodes_by_ids = getattr(self.embedder, "get_nodes_by_ids", None) and returns boost-only if it isn't callable, so a backend without id lookup degrades gracefully instead of raising. Covered by test_pull_in_degrades_without_id_lookup.

Generated by Claude Code

claude added 2 commits May 20, 2026 14:24

Copilot AI review requested due to automatic review settings May 20, 2026 15:26

github-actions Bot added the enhancement and question labels May 20, 2026

Copilot started reviewing on behalf of dfrostar May 20, 2026 15:27

chatgpt-codex-connector Bot reviewed May 20, 2026


Copilot AI reviewed May 20, 2026


dfrostar requested a review from Copilot May 20, 2026 15:44

Copilot started reviewing on behalf of dfrostar May 20, 2026 15:44

Copilot AI reviewed May 20, 2026


dfrostar mentioned this pull request May 20, 2026

Deprecate the learned_patterns reranker + neuralmind learn (superseded by the synapse layer) #143

Open

dfrostar merged commit 98cfc51 into main May 20, 2026
14 checks passed

dfrostar merged 4 commits into main from claude/analyze-openclaw-tokens-ypkup
