Local-first website RAG with Next.js, Crawlee, Ollama, AI SDK, and zvec.
Vectorfetch lets you submit a website root URL, crawl that site locally, turn the readable content into chunks and embeddings, store them in a temporary local vector index, and chat against the indexed site with local models.
- Crawl a same-origin website recursively from a single root URL
- Extract readable HTML content and skip obvious layout noise
- Chunk and embed content locally through Ollama
- Store vectors in a local in-process zvec collection
- Chat against the active indexed site with retrieval-backed answers
- Show crawl and indexing progress in the UI while the local index builds
- Submit a root URL in the app.
- Vectorfetch crawls same-origin links recursively, up to the current crawl limit.
- It extracts readable content from `main`, `article`, or `body` HTML.
- The content is chunked into retrieval-friendly text windows.
- Chunks are embedded locally with Ollama.
- Embeddings and chunk metadata are written into a local zvec collection.
- Chat requests can retrieve relevant chunks from the active site index before answering.
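The crawl and chunking steps above can be sketched roughly like this. This is an illustrative sketch, not Vectorfetch's actual source: the function names, window size, and overlap are assumptions chosen to mirror the described behavior (same-origin filtering, overlapping text windows).

```typescript
// Illustrative sketch of two pipeline steps: same-origin link filtering
// and sliding-window chunking. Names and parameters are hypothetical.

/** Keep only links that share the root URL's origin. */
export function isSameOrigin(rootUrl: string, candidate: string): boolean {
  try {
    // Relative links are resolved against the root before comparing origins.
    return new URL(candidate, rootUrl).origin === new URL(rootUrl).origin;
  } catch {
    return false; // malformed links are skipped
  }
}

/** Split text into overlapping windows so retrieval keeps local context. */
export function chunkText(
  text: string,
  windowSize = 1000,
  overlap = 200,
): string[] {
  const chunks: string[] = [];
  const step = windowSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + windowSize));
    if (start + windowSize >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

The overlap keeps sentences that straddle a window boundary retrievable from at least one chunk.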
Default models:
- Chat: `lfm2:24b`
- Embeddings: `qwen3-embedding:0.6b`
Example:

```sh
ollama pull lfm2:24b
ollama pull qwen3-embedding:0.6b
```

```sh
bun install
cp .env.example .env
bun dev
```

Then open http://localhost:3000.
Model selection is optional. If you do nothing, Vectorfetch uses the built-in defaults from the app.
Environment variables:
- `VECTORFETCH_CHAT_MODEL`
- `VECTORFETCH_EMBEDDING_MODEL`
- `VECTORFETCH_CRAWL_USER_AGENT`
- `VECTORFETCH_CRAWL_MAX_CONCURRENCY`
- `VECTORFETCH_CRAWL_DELAY_MS`
- `VECTORFETCH_ZVEC_INSERT_BATCH_SIZE`
Current defaults:

```
VECTORFETCH_CHAT_MODEL=lfm2:24b
VECTORFETCH_EMBEDDING_MODEL=qwen3-embedding:0.6b
VECTORFETCH_CRAWL_USER_AGENT=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36
VECTORFETCH_CRAWL_MAX_CONCURRENCY=4
VECTORFETCH_CRAWL_DELAY_MS=750
VECTORFETCH_ZVEC_INSERT_BATCH_SIZE=200
```

If you want different local Ollama models, copy `.env.example` to `.env` and replace those values.
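A sketch of how these settings might be read with their documented fallbacks. This is not the actual Vectorfetch source; `envInt` and `crawlConfig` are hypothetical names, and the defaults mirror the table above.

```typescript
// Hypothetical config loader: read a numeric env var, falling back to the
// documented default when the variable is unset or not a number.
function envInt(name: string, fallback: number): number {
  const raw = process.env[name];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

export const crawlConfig = {
  chatModel: process.env.VECTORFETCH_CHAT_MODEL ?? "lfm2:24b",
  embeddingModel:
    process.env.VECTORFETCH_EMBEDDING_MODEL ?? "qwen3-embedding:0.6b",
  maxConcurrency: envInt("VECTORFETCH_CRAWL_MAX_CONCURRENCY", 4),
  delayMs: envInt("VECTORFETCH_CRAWL_DELAY_MS", 750),
  insertBatchSize: envInt("VECTORFETCH_ZVEC_INSERT_BATCH_SIZE", 200),
};
```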
Vectorfetch uses CheerioCrawler by default. It sends browser-like request
headers, uses a conservative crawl concurrency, and now applies request
backoff adaptively when it starts seeing blocked or rate-limited responses.
Normal sites stay fast; sites that begin returning 403 or 429 responses
are slowed down automatically to reduce avoidable blocks.
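The adaptive backoff described above could look roughly like this. It is an illustrative sketch, not the real crawler code: the doubling factor, decay rate, and cap are assumed values, with only the 750 ms base matching the `VECTORFETCH_CRAWL_DELAY_MS` default.

```typescript
// Sketch of adaptive request backoff: blocked or rate-limited responses
// (403/429) grow the per-request delay exponentially; healthy responses
// decay it back toward the configured base delay.
const BASE_DELAY_MS = 750; // matches the VECTORFETCH_CRAWL_DELAY_MS default
const MAX_DELAY_MS = 30_000; // hypothetical cap on the backoff

export function nextDelayMs(currentDelay: number, statusCode: number): number {
  if (statusCode === 403 || statusCode === 429) {
    // Blocked or rate limited: double the delay, up to the cap.
    return Math.min(currentDelay * 2, MAX_DELAY_MS);
  }
  // Healthy response: decay back toward the base delay.
  return Math.max(Math.floor(currentDelay * 0.9), BASE_DELAY_MS);
}
```

Keeping the decay slower than the growth means a run of blocks slows the crawl quickly, while recovery back to full speed is gradual.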
Some sites still block non-browser crawlers, especially JS-heavy or strongly protected properties. When that happens, blocked pages are surfaced in the UI activity feed, but the current implementation does not yet fall back to a real browser crawler automatically.
This project is licensed under the MIT License.