
docs: RAG tuning guide — jina-embeddings-v5-text-small + qwen3-vl-8b SmartExtraction#304

Open
ordiy wants to merge 1 commit into CortexReach:master from ordiy:docs/rag-tuning-jina-v5-openclaw

Conversation


@ordiy ordiy commented Mar 22, 2026

Summary

This PR adds a practical tuning guide to docs/ based on real production deployments of memory-lancedb-pro v1.1.0-beta.9 in OpenClaw 2026.3.8/2026.3.13 environments.

What's included

New file: docs/rag-tuning-jina-v5-openclaw.md

Topics covered

  1. Embedding model comparison — jina-embeddings-v3 vs jina-embeddings-v5-text-small on a Chinese-language domain corpus

    • Entity recall: +3.1 points (74.7% → 77.8%)
    • Technical keyword-phrase recall: +14.8 points (11.8% → 26.6%)
    • Architecture details from the model card (Qwen3-0.6B-Base, 677M params, 32K context, MTEB 71.7)
  2. SmartExtraction LLM comparison — qwen-2.5-7b vs qwen3-14b vs qwen3-vl-8b-instruct

    • qwen3-vl-8b-instruct: 35 completion tokens at $0.0000211/call, vs 456 tokens for qwen3-14b
    • Why newer "thinking" models are a poor fit for extraction tasks
  3. Embedding migration procedure — step-by-step including the reembed --force double-insert pitfall

  4. Two bugs documented with patches:

    • AUTO_RECALL_TIMEOUT_MS = 3_000 hardcoded — too tight for cloud embedding APIs (sed patch included)
    • Default llm.model = "openai/gpt-oss-120b" silently fails against non-OpenAI base URLs
  5. Cross-version compatibility table — 2026.3.8 vs 2026.3.13 CLI command availability
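The two patches in item 4 are simple enough to apply with sed. A minimal sketch — the temp file below is a stand-in for the plugin source, and the `15_000` replacement value is an illustrative choice, not one taken from the guide; only the constant name, its `3_000` default, and the `gpt-oss-120b` model id come from this PR:

```shell
# Stand-in for the plugin source file (real path differs per install).
src=$(mktemp)
cat > "$src" <<'EOF'
const AUTO_RECALL_TIMEOUT_MS = 3_000;
const DEFAULT_MODEL = "openai/gpt-oss-120b";
EOF

# 1) Relax the hardcoded 3s auto-recall timeout for slower cloud embedding APIs
#    (15_000 is an illustrative value, not a recommendation from the guide).
sed -i.bak 's/AUTO_RECALL_TIMEOUT_MS = 3_000/AUTO_RECALL_TIMEOUT_MS = 15_000/' "$src"

# 2) Swap out the default model, which silently fails against non-OpenAI base URLs.
sed -i.bak 's#openai/gpt-oss-120b#qwen3-vl-8b-instruct#' "$src"

cat "$src"
```

`sed -i.bak` (suffix attached to the flag) works on both GNU and BSD sed, which matters since the guide covers both Linux and macOS deployments.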

Testing

Benchmarks were run against live Jina API responses using real memory data. LLM comparisons were measured via OpenRouter with identical prompts for each model.
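For context on how figures like these are produced: recall is plain hit counting over a fixed query set. The query counts below are hypothetical; only the resulting percentages match the numbers quoted above:

```shell
# Recall = retrieved-hit queries / total queries. Counts are hypothetical
# stand-ins chosen to reproduce the percentages in the benchmark summary.
awk 'BEGIN {
  total   = 1000
  v3_hits = 747    # queries whose expected entity was retrieved (jina-embeddings-v3)
  v5_hits = 778    # same query set, jina-embeddings-v5-text-small
  v3 = 100 * v3_hits / total
  v5 = 100 * v5_hits / total
  printf "v3: %.1f%%  v5: %.1f%%  delta: +%.1f pts\n", v3, v5, v5 - v3
}'
# → v3: 74.7%  v5: 77.8%  delta: +3.1 pts
```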

Style

Follows the existing docs/openclaw-integration-playbook.md format: numbered sections, code blocks, practical rules over theory.

…3-vl-8b

- Benchmark comparison: jina-embeddings-v3 vs v5-text-small on Chinese domain corpus
  (entity recall +3.1%, keyword phrase recall +14.8%)
- SmartExtraction LLM comparison: qwen-2.5-7b vs qwen3-14b vs qwen3-vl-8b-instruct
  (qwen3-vl-8b: 35 completion tokens, 5x cheaper, Chinese fidelity preserved)
- Documents the autoRecallTimeoutMs patch for hardcoded 3s timeout bug
- Documents the default llm.model silent-failure bug (gpt-oss-120b on Jina baseURL)
- Full model migration procedure including the reembed double-insert pitfall
- Cross-version compatibility table (2026.3.8 vs 2026.3.13)
- Covers both macOS (workspace/plugins) and Linux (extensions/) deployment paths
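The reembed double-insert pitfall called out above is easy to detect with a row-count guard. This sketch simulates the failure mode on a plain file — the storage format and the append behavior are stand-ins, not memory-lancedb-pro internals:

```shell
# Stand-in "table": one line per stored memory.
db=$(mktemp)
printf 'row\nrow\nrow\n' > "$db"          # 3 existing memories
before=$(wc -l < "$db")

# Simulate a reembed --force that APPENDS re-embedded rows instead of
# replacing them in place — the double-insert pitfall.
cat "$db" "$db" > "$db.tmp" && mv "$db.tmp" "$db"
after=$(wc -l < "$db")

# Guard: if the row count doubled, old vectors were not replaced.
if [ "$after" -eq $((2 * before)) ]; then
  echo "double-insert detected: $before -> $after rows"
fi
```

The same before/after count comparison works against the real table regardless of how it is stored; a doubled count after migration means the old embeddings must be purged before reembedding.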

Tested on: OpenClaw 2026.3.8 (Linux) and 2026.3.13 (macOS),
memory-lancedb-pro v1.1.0-beta.9, Jina AI API, OpenRouter

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
