
feat: observable retrieval traces + batch dedup #319

Open
AliceLJY wants to merge 2 commits into CortexReach:master from AliceLJY:feat/observable-retrieval


@AliceLJY
Collaborator

Summary

  • Observable Retrieval Traces (Feature 3): Adds TraceCollector and RetrievalStatsCollector to make the multi-stage retrieval pipeline fully debuggable. Each stage (vector search, BM25, RRF fusion, min score filter, rerank, recency boost, importance weight, length normalization, time/decay boost, hard cutoff, noise filter, MMR diversity) is instrumented with entry ID tracking, drop computation, score ranges, and timing. When tracing is disabled, every trace call is skipped via optional chaining, so the instrumentation adds effectively zero overhead.
  • memory_debug tool: New management tool that returns retrieval results plus a full per-stage pipeline trace showing exactly which entries were dropped at each stage and why.
  • memory_stats extension: Now includes retrieval quality metrics (zero-result rate, latency percentiles, top drop stages) when the stats collector is active.
  • Batch-Internal Cosine Dedup (Feature 2): Adds batchDedup(), which runs an O(n^2) pairwise cosine similarity check on extraction candidates (n <= 5) and skips near-duplicate candidates before the expensive LLM dedup calls. Includes ExtractionCostStats tracking.
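As an illustration of the tracing pattern described in the first bullet, here is a minimal sketch of a collector that diffs entry IDs per stage and is invoked through optional chaining. The names `SketchTraceCollector`, `StageTrace`, `Scored`, `recordStage`, and `minScoreFilter` are hypothetical and are not taken from the PR's actual src/retrieval-trace.ts.

```typescript
// Hypothetical shapes: the PR's real TraceCollector API is not shown in this description.
interface Scored {
  id: string;
  score: number;
}

interface StageTrace {
  name: string;
  inputCount: number;
  outputCount: number;
  droppedIds: string[];
  scoreRange: [number, number] | null;
  durationMs: number;
}

class SketchTraceCollector {
  readonly stages: StageTrace[] = [];

  // Record one stage by diffing the entry IDs that went in against those that came out.
  recordStage(name: string, input: Scored[], output: Scored[], startedAt: number): void {
    const kept = new Set(output.map((e) => e.id));
    const droppedIds = input.filter((e) => !kept.has(e.id)).map((e) => e.id);
    const scores = output.map((e) => e.score);
    this.stages.push({
      name,
      inputCount: input.length,
      outputCount: output.length,
      droppedIds,
      scoreRange: scores.length > 0 ? [Math.min(...scores), Math.max(...scores)] : null,
      durationMs: Date.now() - startedAt,
    });
  }
}

// Stand-in pipeline stage: filter by minimum score, tracing only if a collector is passed.
function minScoreFilter(entries: Scored[], minScore: number, trace?: SketchTraceCollector): Scored[] {
  const startedAt = Date.now();
  const kept = entries.filter((e) => e.score >= minScore);
  // When `trace` is undefined, optional chaining short-circuits and recordStage never runs.
  trace?.recordStage("min_score_filter", entries, kept, startedAt);
  return kept;
}
```

In this sketch, calling `minScoreFilter(entries, 0.5)` without a collector skips the ID diffing and score-range work entirely; only the single `Date.now()` call remains.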

Test plan

  • 16 new tests for TraceCollector and RetrievalStatsCollector (stage tracking, drops, score ranges, zero-overhead, p95 latency, capacity eviction)
  • 10 new tests for batchDedup and ExtractionCostStats (similar/dissimilar detection, threshold sensitivity, edge cases)
  • All existing tests pass (embedder, migration, CLI smoke, reflection, self-improvement)
  • Manual verification with live LanceDB instance
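The p95 latency and capacity eviction behaviour exercised by the tests above could be sketched roughly as follows. This assumes a fixed-size sliding window and a nearest-rank percentile; the class name `SketchRetrievalStats`, its methods, and the default capacity are hypothetical, and the PR's RetrievalStatsCollector may well use a different definition.

```typescript
// Hypothetical sketch: window size, method names, and the nearest-rank
// percentile definition are assumptions, not the PR's implementation.
interface QueryRecord {
  latencyMs: number;
  resultCount: number;
  drops: Record<string, number>; // stage name -> entries dropped in that stage
}

class SketchRetrievalStats {
  private readonly records: QueryRecord[] = [];
  private readonly capacity: number;

  constructor(capacity: number = 1000) {
    this.capacity = capacity;
  }

  record(r: QueryRecord): void {
    this.records.push(r);
    // Capacity eviction: discard the oldest record once the window is full.
    if (this.records.length > this.capacity) this.records.shift();
  }

  zeroResultRate(): number {
    if (this.records.length === 0) return 0;
    const zero = this.records.filter((r) => r.resultCount === 0).length;
    return zero / this.records.length;
  }

  // Nearest-rank percentile over the current window (p in 0..100).
  latencyPercentile(p: number): number {
    if (this.records.length === 0) return 0;
    const sorted = this.records.map((r) => r.latencyMs).sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    const index = Math.min(sorted.length, Math.max(1, rank)) - 1;
    return sorted[index] ?? 0;
  }

  // Stages ranked by total dropped entries, largest first.
  topDropStages(k: number): Array<[string, number]> {
    const totals = new Map<string, number>();
    for (const r of this.records) {
      for (const [stage, n] of Object.entries(r.drops)) {
        totals.set(stage, (totals.get(stage) ?? 0) + n);
      }
    }
    return [...totals.entries()].sort((a, b) => b[1] - a[1]).slice(0, k);
  }
}
```

Computing every metric from the same window keeps the zero-result rate and the latency percentiles describing the same set of queries after eviction.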

🤖 Generated with Claude Code

AliceLJY and others added 2 commits March 23, 2026 18:39
Observable Retrieval (Feature 3):
- Add TraceCollector (src/retrieval-trace.ts) that tracks entry IDs
  through each pipeline stage, computing drops, score ranges, and timing
- Add RetrievalStatsCollector (src/retrieval-stats.ts) for aggregate
  query metrics: latency percentiles, zero-result rate, top drop stages
- Instrument all retrieval stages in MemoryRetriever.retrieve() with
  optional trace (zero overhead when disabled via optional chaining)
- Add retrieveWithTrace() for always-on debug tracing
- Add memory_debug tool (requires enableManagementTools) returning full
  per-stage pipeline trace with drop info
- Extend memory_stats tool to include retrieval quality metrics

Batch Dedup (Feature 2):
- Add batchDedup() (src/batch-dedup.ts) for cosine similarity dedup
  within extraction batches before expensive LLM dedup calls
- Add ExtractionCostStats tracking (batchDeduped, durationMs, llmCalls)

Tests: 26 new tests (16 trace + 10 batch-dedup), all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
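To make the batch-internal dedup from the commit above concrete, here is a minimal sketch of a greedy pairwise cosine check. The `Candidate` shape, the function name `sketchBatchDedup`, and the 0.85 default threshold are assumptions for illustration; the actual src/batch-dedup.ts signature and threshold are not given in this description.

```typescript
// Hypothetical sketch: the candidate shape and the 0.85 threshold are assumptions.
interface Candidate {
  text: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // Treat zero vectors as dissimilar to everything rather than dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Greedy O(n^2) pass: keep a candidate only if it is not too similar to one already kept.
function sketchBatchDedup(
  candidates: Candidate[],
  threshold: number = 0.85,
): { kept: Candidate[]; dropped: number } {
  const kept: Candidate[] = [];
  for (const candidate of candidates) {
    const isDuplicate = kept.some((k) => cosineSimilarity(k.vector, candidate.vector) >= threshold);
    if (!isDuplicate) kept.push(candidate);
  }
  return { kept, dropped: candidates.length - kept.length };
}
```

With batches capped at five candidates, the quadratic loop performs at most ten comparisons, which is negligible next to a single LLM dedup call.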
1. [High] Fix hybrid trace timing: replace sequential vector_search/bm25_search
   stages with single parallel_search stage that correctly represents concurrent execution
2. [High] Fix negative drop display: search stages with input=0 now show
   "found N" instead of "dropped -N"
3. [Medium] Fix rerankUsed overcount: only emit rerank trace stage when rerank
   is actually enabled (config.rerank !== "none"), not on every query

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
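The negative-drop fix in item 2 comes down to treating stages that start from nothing as producers rather than filters. A minimal sketch, assuming a hypothetical `describeStage` formatter (the PR's actual output format is not shown here):

```typescript
// Hypothetical formatter: the function name and output wording are assumptions.
function describeStage(name: string, inputCount: number, outputCount: number): string {
  // Search stages have no input set, so input - output would be negative.
  if (inputCount === 0) {
    return `${name}: found ${outputCount}`;
  }
  return `${name}: ${inputCount} -> ${outputCount} (dropped ${inputCount - outputCount})`;
}
```

Under these assumptions, `describeStage("parallel_search", 0, 12)` yields `parallel_search: found 12`, while `describeStage("min_score_filter", 12, 9)` yields `min_score_filter: 12 -> 9 (dropped 3)`.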