Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -6,6 +6,7 @@ All notable changes to RecallForge will be documented in this file.

- Added staged background reindex promotion so document, video, audio, and conversation replacements stay hidden until their parent/child memory batches are complete.
- Added index-version-aware query caching for repeated text/media embeddings and generated expansion branches.
- Added MCP progress notifications for long-running search, ingest, batch, memory write, and FTS rebuild tool calls when clients provide a progress token.
- Added deterministic memory graph enrichment with entity/relation side tables and new `memory_graph_entities` / `memory_graph_related` MCP tools.
- Replaced the tiny UAT video clips with compact episodic-memory fixtures, richer transcript sidecars, related artifact metadata, and regression coverage for the video corpus.
- Added `memory_add_conversation` so conversation threads ingest as canonical parent memories with turn-level child memories and standard memory rollups.
2 changes: 1 addition & 1 deletion README.md
@@ -146,7 +146,7 @@ Run over HTTP/SSE:
recallforge serve --http --host 127.0.0.1 --port 7433 --mode embed
```

RecallForge now exposes **26 MCP tools** across search, ingest, memory graph navigation, collection admin, and runtime config. HTTP/SSE mode also exposes `/health`, `/sse`, and `/messages/`.
RecallForge now exposes **26 MCP tools** across search, ingest, memory graph navigation, collection admin, and runtime config. HTTP/SSE mode also exposes `/health`, `/sse`, and `/messages/`. Long-running tools emit MCP `notifications/progress` when the client supplies a request `_meta.progressToken`, so compatible HTTP/SSE clients can show live progress for ingest, search, batch, memory writes, and FTS rebuilds.

See [docs/mcp-tools.md](docs/mcp-tools.md) for the full tool reference.

2 changes: 2 additions & 0 deletions docs/ARCHITECTURE.md
@@ -216,13 +216,15 @@ Tools: 26 MCP tools across search, ingest, memory, memory graph, collection admi
Transport: stdio (default) or HTTP/SSE (`/health`, `/sse`, `/messages/`)
Startup: backend.warm_up() for predictable latency
Signals: SIGTERM/SIGINT graceful shutdown
Progress: request `_meta.progressToken` enables `notifications/progress` during long-running tool calls
```

Key runtime details:

- Blocking tool work is routed through a bounded async semaphore to avoid overloading local model/runtime resources
- HTTP mode requires the optional `server` extra (`starlette` + `uvicorn`)
- Runtime-safe config changes (`mode`, `collection`, `rerank_top_k`, `caption_media`, model IDs) are exposed through `get_config` / `set_config`
- Progress notifications are best-effort and preserve stable final response JSON. `search_batch` reports per-query completion before returning the final merged results; `batch` reports per-operation completion.
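
The bounded-semaphore routing described above can be sketched as follows. This is a minimal illustration, not the server's actual code: `MAX_CONCURRENT_TOOLS`, `run_blocking_tool`, and `slow_embed` are hypothetical names, and the real concurrency limit is not stated in this document.

```python
import asyncio
import time

# Hypothetical bound; the real server's limit is not given in the docs.
MAX_CONCURRENT_TOOLS = 2


async def run_blocking_tool(semaphore, func, *args):
    """Run a blocking callable in a worker thread, at most N at a time."""
    async with semaphore:
        # asyncio.to_thread keeps the event loop free while func blocks.
        return await asyncio.to_thread(func, *args)


def slow_embed(text):
    """Stand-in for blocking model/runtime work."""
    time.sleep(0.05)
    return len(text)


async def main():
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_TOOLS)
    calls = [
        run_blocking_tool(semaphore, slow_embed, text)
        for text in ("a", "bb", "ccc", "dddd")
    ]
    # gather preserves input order even though only two calls run at once.
    return await asyncio.gather(*calls)


results = asyncio.run(main())
```

Holding the semaphore around the thread hand-off means extra tool calls queue in the event loop rather than piling up inside the model runtime.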

## Storage Layout

11 changes: 11 additions & 0 deletions docs/mcp-tools.md
@@ -21,6 +21,17 @@ HTTP mode also exposes:
- `/sse`
- `/messages/`

## Progress Notifications

RecallForge supports MCP progress notifications for long-running tool calls. When a client includes `_meta.progressToken` in a request, compatible transports receive `notifications/progress` events with numeric progress, optional total, and a human-readable status message.

Progress is best-effort and does not change the final tool response shape. It currently covers:

- search and explain phases
- vector and full-text search phases
- `search_batch` per-query completion updates before the final merged result
- `ingest`, individual index/memory writes, `batch`, and `rebuild_fts`
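
As a rough illustration of the message shapes involved, the sketch below builds a `tools/call` request carrying `_meta.progressToken` and formats a `notifications/progress` message for display. The envelope follows the MCP progress convention; the helper names, the `queries` argument shape, the token value, and the status text are assumptions made for the example, not RecallForge's documented payloads.

```python
def build_tool_call(request_id, tool_name, arguments, progress_token=None):
    """Build a JSON-RPC tools/call request, optionally opting in to progress."""
    params = {"name": tool_name, "arguments": arguments}
    if progress_token is not None:
        # The token is echoed back in every progress notification.
        params["_meta"] = {"progressToken": progress_token}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": params,
    }


def describe_progress(message):
    """Format a notifications/progress message, or return None for others."""
    if message.get("method") != "notifications/progress":
        return None
    params = message["params"]
    total = params.get("total")  # optional
    counter = (
        f"{params['progress']}/{total}"
        if total is not None
        else str(params["progress"])
    )
    text = params.get("message", "")  # optional human-readable status
    return f"[{params['progressToken']}] {counter} {text}".rstrip()


# Hypothetical request: the argument shape for search_batch is assumed.
request = build_tool_call(
    1, "search_batch", {"queries": ["alpha", "beta", "gamma"]},
    progress_token="tok-1",
)

# Illustrative notification; the status text is made up.
example_notification = {
    "jsonrpc": "2.0",
    "method": "notifications/progress",
    "params": {
        "progressToken": "tok-1",
        "progress": 1,
        "total": 3,
        "message": "query 1 of 3 complete",
    },
}

print(describe_progress(example_notification))
```

Clients that omit the token simply receive the final tool response, which matches the best-effort behavior described above.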

Example MCP client config (Claude Desktop):

```json
8 changes: 5 additions & 3 deletions docs/research/recallforge-memory-mcp-roadmap.md
@@ -135,16 +135,18 @@ Goal:
- Prove RecallForge as a memory MCP, not just a benchmark pipeline.

Current Linear fit:
- `REC-160`
- `REC-153`
- `REC-33`

Shipped Linear work:
- `REC-153`
- `REC-61`

What this phase delivers:
- memory-level evaluation
- explanation quality checks
- latency and RSS budget enforcement
- real episodic corpora coverage
- MCP progress notifications for long-running search, ingest, batch, and rebuild workflows
- alpha and beta validation with real workflows

Why this comes last:
@@ -156,7 +158,7 @@ Why this comes last:
- Keep `Retrieval and Ranking` for cheap broad retrieval work like `REC-169`, `REC-148`, `REC-72`, `REC-71`, `REC-146`
- Add a milestone such as `Memory Policy and Enrichment` for `REC-84`, `REC-83`, `REC-75`, `REC-76`, `REC-78`
- Keep `Research Queue` for gated expensive-stage work like `REC-130`, `REC-115`, `REC-147`, `REC-168`
- Keep `Benchmark Integrity` and `Launch and Distribution` for `REC-160`, `REC-153`, `REC-33`, `REC-61`
- Keep `Benchmark Integrity` and `Launch and Distribution` for `REC-33` and any future public validation work

## Architecture Principle

12 changes: 11 additions & 1 deletion src/recallforge/search.py
@@ -23,7 +23,7 @@
import time
from dataclasses import dataclass, field, replace
from hashlib import sha256
from typing import List, Dict, Any, Optional, Union
from typing import Any, Callable, Dict, List, Optional, Union

from .backends.base import ModelBackend
from .cache import EmbeddingCache
@@ -1769,6 +1769,7 @@ def search_batch(
    profile: Optional[str] = None,
    max_workers: int = 4,
    rrf_k: int = 60,
    progress_callback: Optional[Callable[[int, int, int], None]] = None,
) -> List[BatchSearchResult]:
    """
    Run multiple search queries in parallel and merge results using RRF.
@@ -1789,6 +1790,8 @@ def search_batch(
        profile: Optional profile namespace filter
        max_workers: Maximum parallel threads
        rrf_k: RRF fusion constant
        progress_callback: Optional callback invoked as each query branch
            completes with (completed_count, total_count, branch_result_count)

    Returns:
        List of BatchSearchResult objects, sorted by best merged score
@@ -1845,6 +1848,7 @@ def run_single_query(q: BatchQuery) -> List[tuple]:

    # Run all queries in parallel
    all_results: List[List[tuple]] = [[] for _ in batch_queries]
    completed_queries = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_idx = {
            executor.submit(run_single_query, q): i
@@ -1857,6 +1861,12 @@ def run_single_query(q: BatchQuery) -> List[tuple]:
            except Exception as e:
                logger.error("Batch query %d failed: %s", idx, e)
                all_results[idx] = []
            completed_queries += 1
            if progress_callback is not None:
                try:
                    progress_callback(completed_queries, len(batch_queries), len(all_results[idx]))
                except Exception as exc:
                    logger.debug("search_batch progress callback failed: %s", exc)

    # Merge results using RRF with best-score-wins
    merged: Dict[str, Dict[str, Any]] = {}
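
For reviewers, the callback contract added in this hunk can be exercised in isolation with a sketch like the one below. It mirrors the pattern in the diff (count completions in the `as_completed` loop, report `(completed, total, branch_result_count)`, and swallow callback errors), but `run_branches` and `fake_search` are hypothetical stand-ins, and the loop lines not shown in the diff are assumed rather than copied.

```python
import concurrent.futures
import logging
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)

ProgressCallback = Callable[[int, int, int], None]


def run_branches(
    queries: List[str],
    run_one: Callable[[str], list],
    max_workers: int = 4,
    progress_callback: Optional[ProgressCallback] = None,
) -> List[list]:
    """Run each query in a thread pool and report per-branch completion."""
    all_results: List[list] = [[] for _ in queries]
    completed = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_idx = {executor.submit(run_one, q): i for i, q in enumerate(queries)}
        for future in concurrent.futures.as_completed(future_to_idx):
            idx = future_to_idx[future]
            try:
                all_results[idx] = future.result()
            except Exception as exc:
                # A failed branch contributes no results but still counts.
                logger.error("Query %d failed: %s", idx, exc)
                all_results[idx] = []
            completed += 1
            if progress_callback is not None:
                try:
                    progress_callback(completed, len(queries), len(all_results[idx]))
                except Exception as exc:
                    # Progress is best-effort: never let it break the search.
                    logger.debug("progress callback failed: %s", exc)
    return all_results


def fake_search(query: str) -> list:
    """Hypothetical branch: returns len(query) hits, or fails on 'boom'."""
    if query == "boom":
        raise RuntimeError("simulated failure")
    return [query] * len(query)


events = []
results = run_branches(
    ["a", "boom", "ccc"],
    fake_search,
    progress_callback=lambda done, total, hits: events.append((done, total, hits)),
)
```

Because the callback fires from the thread that drains `as_completed`, completion counts arrive in order even though branches may finish in any order.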