---
title: AdaptiveRAG
sdk: docker
pinned: true
license: mit
short_description: Agentic + Self-RAG + Modular RAG with visual pipeline UI
---
Every stage of the pipeline is visualized live — from raw text to grounded answer with citations.
Shows: question encoding → Self-RAG routing → hybrid retrieval → 2D vector space → self-critique
Most RAG demos do: embed query → cosine search → stuff into prompt. This does:
```
User question
 ↓ embed (MiniLM-L6 → 384-dim vector)
 ↓ Self-RAG router → RETRIEVE / ANSWER_DIRECTLY / CLARIFY
 ↓ Planner → break into focused sub-queries
 ↓ Dense retrieval → ChromaDB cosine similarity (k=12)
 ↓ Sparse retrieval → BM25 keyword matching (k=12)
 ↓ RRF fusion → Reciprocal Rank Fusion merge
 ↓ Cross-encoder → BGE reranker deep relevance scoring (top 5)
 ↓ LLM answer → Qwen3-VL (local) / LLaMA 3.1 via Groq (hosted)
 ↓ Self-critique → grounded? complete? confidence score
 ↓ Refine & retry → if confidence < 0.85
 → Answer + citations + trace
```
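A minimal sketch of the sparse-retrieval and RRF-fusion steps above, using `rank_bm25` and plain Python. The toy corpus, query, and `rrf_fuse` helper are illustrative, not the repository's actual code:

```python
from collections import defaultdict
from rank_bm25 import BM25Okapi

# Toy corpus; the real index holds 1,934 chunks from 14 papers.
corpus = [
    "self-rag trains reflection tokens to decide when to retrieve",
    "ddpm learns to reverse a gradual gaussian noising process",
    "react interleaves reasoning traces with tool actions",
]
bm25 = BM25Okapi([doc.split() for doc in corpus])

query = "when does self-rag decide to retrieve"
# Sparse retrieval: rank chunk ids by BM25 keyword score (k=12 in the app)
sparse_ranked = bm25.get_top_n(query.split(), list(range(len(corpus))), n=12)

def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_ranked = [0, 2, 1]  # stand-in for ChromaDB's cosine top-k over the same chunks
fused = rrf_fuse([dense_ranked, sparse_ranked])
print(fused)  # chunks found by both retrievers rise to the top
```

RRF merges the two ranked lists by rank position alone, which is why no score normalization is needed between cosine similarities and BM25 scores.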
Every step renders its inputs and outputs as it runs:
| Step | What you see |
|---|---|
| 1 · Question encoding | Embedding model · 384 dimensions · L2 norm · latency · first-32-dim bar chart · raw vector[0:8] values |
| 2 · Self-RAG router | Color-coded decision pill (RETRIEVE / ANSWER_DIRECTLY / CLARIFY) + LLM reasoning |
| 3 · Planner | Sub-query cards with rationale for each step |
| 4 · Dense retrieval | Cosine similarity bar chart + chunk cards with scores |
| 4 · Sparse retrieval | BM25 normalized score chart + chunk cards |
| 4 · RRF fusion | Merged ranking chart showing how both lists combine |
| 4 · Cross-encoder rerank | BGE relevance score chart (final top-5) |
| 4 · Vector space | 2D PCA scatter — query vs all hits, colored by source (dense / sparse / both) |
| 5 · Context assembly | Exact passages handed to the LLM, with metadata |
| 6 · Self-critique | Grounded ✅ · Complete ✅ · Confidence score vs threshold |
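The step-4 vector-space view boils down to projecting the query and hit embeddings from 384 dimensions to 2 with PCA. A minimal sketch, with illustrative chunk texts:

```python
from sklearn.decomposition import PCA
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
query = "How does Self-RAG decide when to retrieve?"
chunks = [
    "Self-RAG trains reflection tokens to trigger retrieval on demand.",
    "DDPM reverses a fixed Gaussian noising process.",
    "ReAct interleaves reasoning and acting.",
]

vecs = embedder.encode([query] + chunks, normalize_embeddings=True)  # shape (4, 384)
xy = PCA(n_components=2).fit_transform(vecs)                         # shape (4, 2)

print("query at", xy[0])    # the app plots this point against every hit,
print("chunks at", xy[1:])  # colored by whether it came from dense, sparse, or both
```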
14 foundational AI papers pre-indexed as 1,934 semantic chunks:
| Category | Papers |
|---|---|
| Transformers | Attention Is All You Need · BERT · GPT-3 |
| Diffusion | DDPM · DDIM |
| RAG | RAG Original · RAG Survey · Self-RAG · HyDE |
| Vision | ViT · CLIP |
| Agents | ReAct · Chain-of-Thought |
| LLMs | LLM Survey |
| Component | Tool | Why |
|---|---|---|
| Vector DB | ChromaDB (local) | No API cost, persistent |
| Dense embeddings | all-MiniLM-L6-v2 | Fast, 384-dim, normalized |
| Sparse retrieval | rank-bm25 (BM25Okapi) | Keyword precision |
| Fusion | Reciprocal Rank Fusion | Combines rankings without score normalization |
| Reranker | BAAI/bge-reranker-base | Cross-encoder, deep relevance scoring |
| LLM (local) | Qwen3-VL 8B via Ollama | Vision-language, runs on Apple Silicon |
| LLM (hosted) | LLaMA 3.1 8B via Groq | Free API, fast inference |
| UI | Streamlit | Fast to build, easy to demo |
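To show how these pieces fit together, here is a minimal sketch of the dense-retrieval and rerank path using the libraries in the table above. The collection name, storage path, and query are assumptions, not the repo's actual identifiers:

```python
import chromadb
from sentence_transformers import SentenceTransformer, CrossEncoder

embedder = SentenceTransformer("all-MiniLM-L6-v2")    # 384-dim, normalized embeddings
reranker = CrossEncoder("BAAI/bge-reranker-base")     # cross-encoder relevance scoring

client = chromadb.PersistentClient(path="./chroma")   # local, persistent store
collection = client.get_or_create_collection(
    "papers", metadata={"hnsw:space": "cosine"}       # cosine similarity search
)

query = "How does Self-RAG decide when to retrieve?"
q_vec = embedder.encode(query, normalize_embeddings=True).tolist()

# Dense retrieval: cosine similarity over the pre-built index (k=12)
hits = collection.query(query_embeddings=[q_vec], n_results=12)
chunks = hits["documents"][0]

# Cross-encoder rerank: score (query, chunk) pairs, keep the top 5
scores = reranker.predict([(query, c) for c in chunks])
top5 = [c for _, c in sorted(zip(scores, chunks), reverse=True)[:5]]
```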
```bash
git clone https://github.com/Gh-Novel/AdaptiveRAG
cd AdaptiveRAG
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# start Ollama with the vision-language model
ollama serve
ollama pull qwen3-vl:8b-instruct-q8_0-optimized

streamlit run app.py
```

Or use the CLI:

```bash
.venv/bin/python ask.py "How does Self-RAG decide when to retrieve?"
```

The live demo runs on HF Spaces (CPU free tier) with Groq API handling LLM calls.
- Embedding + retrieval + reranking run locally inside the container (MiniLM + BGE)
- `GROQ_API_KEY` secret drives routing, planning, answering, and self-critique
- Pre-built index (1,934 chunks, ~59 MB) is committed via git-lfs — no ingestion on startup
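For reference, a minimal sketch of a Groq-backed answer plus self-critique round trip with the refine-and-retry check. The prompts, JSON critique format, and model id are assumptions for illustration, not the repository's actual templates:

```python
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])
MODEL = "llama-3.1-8b-instant"  # assumed Groq model id for LLaMA 3.1 8B

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def answer_with_critique(question: str, context: str,
                         threshold: float = 0.85, max_retries: int = 1) -> str:
    answer = ask(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
    for _ in range(max_retries + 1):
        # Self-critique: is the answer grounded, complete, and confident enough?
        critique = ask(
            'Return JSON {"grounded": bool, "complete": bool, "confidence": float} '
            f"for this answer.\nContext: {context}\nAnswer: {answer}"
        )
        verdict = json.loads(critique)  # sketch: assumes the model returns clean JSON
        if verdict["grounded"] and verdict["confidence"] >= threshold:
            return answer
        # Refine & retry: regenerate, steering with the critique
        answer = ask(
            "Improve this answer so it is fully grounded in the context.\n"
            f"Critique: {critique}\nContext: {context}\nQuestion: {question}"
        )
    return answer
```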