[codex] Add index-versioned query cache#35

Merged
brianmeyer merged 1 commit into master from codex/rec-115-index-version-cache
May 17, 2026

Conversation

@brianmeyer
Owner

Summary

  • Add model- and index-version-aware query cache keys for repeated text, image, video, and generated expansion inputs.
  • Persist a durable storage index version in LanceDB cache metadata and bump it only for visible index mutations or batch promotion.
  • Document the cache invalidation behavior and add regression coverage for version changes and hidden staged batches.
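The keying scheme summarized in the bullets above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `make_cache_key`, `VersionedQueryCache`, and `record_mutation` are hypothetical names, and the version counter lives in memory here, whereas the PR persists it in LanceDB cache metadata.

```python
import hashlib
from typing import Dict, List, Optional


def make_cache_key(model_id: str, index_version: int, modality: str, payload: bytes) -> str:
    """Build a key that changes when the model, index version, modality, or input changes."""
    digest = hashlib.sha256()
    for part in (model_id.encode(), str(index_version).encode(), modality.encode(), payload):
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


class VersionedQueryCache:
    """Hypothetical in-memory stand-in for a model- and index-version-aware cache."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id
        self.index_version = 0
        self._entries: Dict[str, List[float]] = {}

    def _key(self, modality: str, payload: bytes) -> str:
        return make_cache_key(self.model_id, self.index_version, modality, payload)

    def get(self, modality: str, payload: bytes) -> Optional[List[float]]:
        return self._entries.get(self._key(modality, payload))

    def put(self, modality: str, payload: bytes, vector: List[float]) -> None:
        self._entries[self._key(modality, payload)] = vector

    def record_mutation(self, visible: bool) -> None:
        """Bump the version only for mutations that searches can observe."""
        if visible:
            self.index_version += 1
```

In this sketch a hidden staged batch would call `record_mutation(visible=False)` and leave cached entries reachable, while promoting the batch would call `record_mutation(visible=True)` so later lookups miss. Stale entries are simply never matched again; eviction is out of scope here.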

Validation

  • python3 -m pytest -q
  • python3 -m compileall -q src tests
  • git diff --check
  • bash tests/uat/test_mcp_server.sh
  • .venv/bin/python -m pip wheel . -w /tmp/recallforge-wheel-rec115
  • PYTHONPATH=/tmp/recallforge-twine .venv/bin/python -m twine check /tmp/recallforge-wheel-rec115/recallforge-0.3.0-py3-none-any.whl

@brianmeyer brianmeyer merged commit f9ace1c into master May 17, 2026
4 checks passed
@brianmeyer brianmeyer deleted the codex/rec-115-index-version-cache branch May 17, 2026 20:14

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42bc919878

Comment thread src/recallforge/search.py
Comment on lines +517 to +521
for attr in ("model_name", "model_id", "model", "_model_name"):
    value = getattr(self.backend, attr, None)
    if isinstance(value, str) and value:
        return value
return type(self.backend).__name__

P1: Derive cache model ID from active embedder model

_cache_model_id() only inspects model_name/model_id/model/_model_name, then falls back to the backend class name, but the shipped backends track the active embedding model in fields like EMBEDDER_MODEL (and MLX can change it at runtime via set_model_ids). In that case the cache key does not change when the embedder model changes, so repeated queries can reuse vectors generated by the previous model, producing retrieval in the wrong embedding space after a model switch.
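One way to address this suggestion might look like the sketch below. It is an illustration only, not the fix that was applied: `resolve_cache_model_id` and `FakeBackend` are hypothetical, the single-argument `set_model_ids` signature is an assumption, and the sketch assumes `EMBEDDER_MODEL` is a plain string readable from the backend instance.

```python
from typing import Any

# Check the field the review names first, then the attributes the original code inspected.
_MODEL_ATTRS = ("EMBEDDER_MODEL", "model_name", "model_id", "model", "_model_name")


def resolve_cache_model_id(backend: Any) -> str:
    """Return the active embedding model ID, falling back to the backend class name."""
    for attr in _MODEL_ATTRS:
        value = getattr(backend, attr, None)
        if isinstance(value, str) and value:
            return value
    return type(backend).__name__


class FakeBackend:
    """Hypothetical backend that tracks its active model in EMBEDDER_MODEL."""

    EMBEDDER_MODEL = "embedder-v1"

    def set_model_ids(self, embedder: str) -> None:
        # Assumed signature: swaps the active embedding model at runtime.
        self.EMBEDDER_MODEL = embedder
```

The ID is resolved on every call rather than stored once at construction, so a runtime model switch produces a different cache key on the next lookup.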
