A small research scaffold for multimodal long-term memory in LLM agents.
As agent systems move from single-turn interaction to persistent workflows, memory quality becomes critical. This project explores a practical memory stack with:
- deduplication
- recency-aware retrieval
- lightweight salience signals

Features:
- unified memory item schema (`text` + optional `image_caption`)
- near-duplicate filtering
- recency decay + semantic similarity scoring
- timeline-friendly retrieval traces
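
As a rough illustration of the schema and the near-duplicate filter listed above, here is a minimal sketch. It is not the project's actual implementation: the `MemoryItem` class, its `created_at` field, the `jaccard` helper, and the `0.8` threshold are all assumptions made for this example, and token-set Jaccard overlap is only one of several ways to detect near-duplicates.

```python
from __future__ import annotations

import re
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MemoryItem:
    """Hypothetical unified memory item: text plus an optional image caption."""

    id: str
    text: str
    image_caption: Optional[str] = None
    tags: list[str] = field(default_factory=list)
    salience: float = 0.5  # user provided; 0.5 mirrors the documented default
    created_at: float = field(default_factory=time.time)  # assumed field name

    def content(self) -> str:
        """Return the text used for comparison (text plus caption, if any)."""
        if self.image_caption:
            return f"{self.text} {self.image_caption}"
        return self.text


def _tokens(s: str) -> set[str]:
    """Lowercase alphanumeric tokens; punctuation and hyphens act as separators."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity in [0, 1]."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_near_duplicate(
    candidate: MemoryItem,
    existing: list[MemoryItem],
    threshold: float = 0.8,  # assumed cutoff, tune per dataset
) -> bool:
    """True if the candidate overlaps any stored item at or above the threshold."""
    return any(
        jaccard(candidate.content(), item.content()) >= threshold
        for item in existing
    )
```

A caller would typically run `is_near_duplicate(new_item, stored_items)` before appending to the store and skip the write when it returns `True`.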

Install:

    python -m venv .venv
    source .venv/bin/activate
    pip install -e .

Add a memory item:

    python -m agent_memory_weaver.cli add \
      --store examples/store.json \
      --id m1 \
      --text "Compared two VLM checkpoints on OCR-heavy samples" \
      --tags eval,ocr

Search the store:

    python -m agent_memory_weaver.cli search \
      --store examples/store.json \
      --query "OCR benchmark insights" \
      --top-k 3

Retrieval score is a weighted sum of:
- semantic similarity (hashing vector cosine)
- salience (user provided, default 0.5)
- recency decay (exponential, configurable half-life)
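
The three components above could be combined roughly as follows. This is a sketch under stated assumptions, not the package's own code: the function names, the 256-dimensional hashing vector, the SHA-256 token hash, the one-week default half-life, and the weights `0.6 / 0.2 / 0.2` are all placeholders chosen for illustration.

```python
import hashlib
import math
import re

DIM = 256  # assumed hashing-vector size


def hash_vector(text: str, dim: int = DIM) -> list[float]:
    """Bag-of-words hashing vector: each token increments one hashed bucket."""
    vec = [0.0] * dim
    for tok in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(tok.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:8], "big") % dim] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def recency_decay(age_seconds: float, half_life_seconds: float) -> float:
    """Exponential decay: 1.0 at age 0, 0.5 after one half-life."""
    return 0.5 ** (max(age_seconds, 0.0) / half_life_seconds)


def retrieval_score(
    query: str,
    text: str,
    age_seconds: float,
    salience: float = 0.5,                     # documented default
    half_life_seconds: float = 7 * 24 * 3600,  # assumed: one week
    w_sim: float = 0.6,                        # assumed weights
    w_sal: float = 0.2,
    w_rec: float = 0.2,
) -> float:
    """Weighted sum of semantic similarity, salience, and recency decay."""
    sim = cosine(hash_vector(query), hash_vector(text))
    rec = recency_decay(age_seconds, half_life_seconds)
    return w_sim * sim + w_sal * salience + w_rec * rec
```

With these placeholder weights, two items with identical text and salience differ only in the recency term, so the more recent one ranks higher.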

Roadmap:
- Add cross-session memory graph links
- Add modality-specific salience heads
- Add memory compaction and snapshotting

Best used as a prototype before integrating production vector DBs.