
docs: RAG tuning guide — jina-embeddings-v5-text-small + qwen3-vl-8b SmartExtraction#304

Open
ordiy wants to merge 1 commit into CortexReach:master from ordiy:docs/rag-tuning-jina-v5-openclaw

Conversation


@ordiy ordiy commented Mar 22, 2026

Summary

This PR adds a practical tuning guide to docs/ based on real production deployments of memory-lancedb-pro v1.1.0-beta.9 in OpenClaw 2026.3.8/2026.3.13 environments.

What's included

New file: docs/rag-tuning-jina-v5-openclaw.md

Topics covered

  1. Embedding model comparison — jina-embeddings-v3 vs jina-embeddings-v5-text-small on a Chinese-language domain corpus

    • Entity recall: +3.1 points (74.7% → 77.8%)
    • Technical keyword-phrase recall: +14.8 points (11.8% → 26.6%)
    • Architecture details from the model card (Qwen3-0.6B-Base, 677M params, 32K context, MTEB 71.7)
  2. SmartExtraction LLM comparison — qwen-2.5-7b vs qwen3-14b vs qwen3-vl-8b-instruct

    • qwen3-vl-8b-instruct: 35 completion tokens at $0.0000211/call, vs 456 tokens for qwen3-14b
    • Why newer "thinking" models are a poor fit for extraction tasks
  3. Embedding migration procedure — step-by-step including the reembed --force double-insert pitfall

  4. Two bugs documented with patches:

    • AUTO_RECALL_TIMEOUT_MS = 3_000 hardcoded — too tight for cloud embedding APIs (sed patch included)
    • Default llm.model = "openai/gpt-oss-120b" silently fails against non-OpenAI base URLs
  5. Cross-version compatibility table — 2026.3.8 vs 2026.3.13 CLI command availability
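The two patches in item 4 are simple enough to apply with sed. A minimal sketch — the temp file below is a stand-in for the plugin source, and the `15_000` replacement value is an illustrative choice, not one taken from the guide; only the constant name, its `3_000` default, and the `gpt-oss-120b` model id come from this PR:

```shell
# Stand-in for the plugin source file (real path differs per install).
src=$(mktemp)
cat > "$src" <<'EOF'
const AUTO_RECALL_TIMEOUT_MS = 3_000;
const DEFAULT_MODEL = "openai/gpt-oss-120b";
EOF

# 1) Relax the hardcoded 3s auto-recall timeout for slower cloud embedding APIs
#    (15_000 is an illustrative value, not a recommendation from the guide).
sed -i.bak 's/AUTO_RECALL_TIMEOUT_MS = 3_000/AUTO_RECALL_TIMEOUT_MS = 15_000/' "$src"

# 2) Swap out the default model, which silently fails against non-OpenAI base URLs.
sed -i.bak 's#openai/gpt-oss-120b#qwen3-vl-8b-instruct#' "$src"

cat "$src"
```

`sed -i.bak` (suffix attached to the flag) works on both GNU and BSD sed, which matters since the guide covers both Linux and macOS deployments.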

Testing

Benchmarks were run against live Jina API responses using real memory data. LLM comparisons were measured via OpenRouter with identical prompts for each model.
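For context on how figures like these are produced: recall is plain hit counting over a fixed query set. The query counts below are hypothetical; only the resulting percentages match the numbers quoted above:

```shell
# Recall = retrieved-hit queries / total queries. Counts are hypothetical
# stand-ins chosen to reproduce the percentages in the benchmark summary.
awk 'BEGIN {
  total   = 1000
  v3_hits = 747    # queries whose expected entity was retrieved (jina-embeddings-v3)
  v5_hits = 778    # same query set, jina-embeddings-v5-text-small
  v3 = 100 * v3_hits / total
  v5 = 100 * v5_hits / total
  printf "v3: %.1f%%  v5: %.1f%%  delta: +%.1f pts\n", v3, v5, v5 - v3
}'
# → v3: 74.7%  v5: 77.8%  delta: +3.1 pts
```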

Style

Follows the existing docs/openclaw-integration-playbook.md format: numbered sections, code blocks, practical rules over theory.

…3-vl-8b

- Benchmark comparison: jina-embeddings-v3 vs v5-text-small on Chinese domain corpus
  (entity recall +3.1%, keyword phrase recall +14.8%)
- SmartExtraction LLM comparison: qwen-2.5-7b vs qwen3-14b vs qwen3-vl-8b-instruct
  (qwen3-vl-8b: 35 completion tokens, 5x cheaper, Chinese fidelity preserved)
- Documents the autoRecallTimeoutMs patch for hardcoded 3s timeout bug
- Documents the default llm.model silent-failure bug (gpt-oss-120b on Jina baseURL)
- Full model migration procedure including the reembed double-insert pitfall
- Cross-version compatibility table (2026.3.8 vs 2026.3.13)
- Covers both macOS (workspace/plugins) and Linux (extensions/) deployment paths
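The reembed double-insert pitfall called out above is easy to detect with a row-count guard. This sketch simulates the failure mode on a plain file — the storage format and the append behavior are stand-ins, not memory-lancedb-pro internals:

```shell
# Stand-in "table": one line per stored memory.
db=$(mktemp)
printf 'row\nrow\nrow\n' > "$db"          # 3 existing memories
before=$(wc -l < "$db")

# Simulate a reembed --force that APPENDS re-embedded rows instead of
# replacing them in place — the double-insert pitfall.
cat "$db" "$db" > "$db.tmp" && mv "$db.tmp" "$db"
after=$(wc -l < "$db")

# Guard: if the row count doubled, old vectors were not replaced.
if [ "$after" -eq $((2 * before)) ]; then
  echo "double-insert detected: $before -> $after rows"
fi
```

The same before/after count comparison works against the real table regardless of how it is stored; a doubled count after migration means the old embeddings must be purged before reembedding.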

Tested on: OpenClaw 2026.3.8 (Linux) and 2026.3.13 (macOS),
memory-lancedb-pro v1.1.0-beta.9, Jina AI API, OpenRouter

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
