A multi-PDF conversational chatbot powered by Retrieval-Augmented Generation (RAG), running entirely on your local machine with Ollama and LangChain.
Upload one or more PDF documents, and the chatbot will answer questions grounded in their content — with source citations, semantic answer caching, and real-time streaming.
- Multi-PDF ingestion — upload and query several documents at once.
- Local LLM inference — no API keys, no cloud services; everything runs on-device via Ollama.
- Streaming responses — tokens are rendered in real time for a responsive chat experience.
- Semantic answer cache — repeated or similar questions are served instantly from an embedding-based cache with configurable similarity threshold.
- Multi-question detection — compound questions are automatically split and answered individually.
- Source citations — every answer references the originating document and page number.
- Debug mode — toggle an on-screen diagnostic panel for retrieval timing, similarity scores, and processing details.
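The semantic cache is worth a closer look: rather than keying answers by exact question strings, it embeds each question and serves a stored answer whenever a new question's embedding is similar enough. A minimal, framework-free sketch of the idea — the class and parameter names here are illustrative, not the actual `cache_manager.py` API:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class SemanticCache:
    """Answer cache keyed by question embeddings rather than exact strings."""

    def __init__(self, embed_fn, threshold=0.92):
        self.embed_fn = embed_fn    # maps question text -> embedding vector
        self.threshold = threshold  # minimum similarity for a cache hit
        self.entries = []           # list of (embedding, answer) pairs

    def get(self, question):
        """Return a cached answer if a similar-enough question was seen before."""
        query = self.embed_fn(question)
        best_score, best_answer = 0.0, None
        for emb, answer in self.entries:
            score = cosine_similarity(query, emb)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def put(self, question, answer):
        """Store a newly generated answer under the question's embedding."""
        self.entries.append((self.embed_fn(question), answer))
```

In the real application the embeddings come from the Ollama embedding model, and the threshold is read from `config.py`, so near-duplicate phrasings ("what is RAG?" vs. "what's RAG?") hit the cache instead of re-running retrieval and generation.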
| Requirement | Details |
|---|---|
| Python | 3.10 or later |
| Ollama | Installed and running locally — see ollama.ai |
| LLM model | Pulled into Ollama (default: phi4-mini) |
| Embedding model | Pulled into Ollama (default: all-minilm:l6-v2) |
Pull the required models before first use:
```bash
ollama pull phi4-mini
ollama pull all-minilm:l6-v2
```
- Clone the repository:

  ```bash
  git clone https://github.com/PenSul/RAG-Chatbot.git
  cd rag-chatbot
  ```

- Create and activate a virtual environment (recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate   # Linux / macOS
  venv\Scripts\activate      # Windows
  ```

- Install the package in editable mode:

  ```bash
  pip install -e .
  ```

  This installs all dependencies listed in `pyproject.toml` and makes the `rag_chatbot` package importable.

- Make sure Ollama is running (e.g. `ollama serve` in a separate terminal).

- Start the application:

  ```bash
  streamlit run src/rag_chatbot/app.py
  ```

- Open the URL printed to the terminal (typically `http://localhost:8501`).

- Upload one or more PDFs via the sidebar, click Process PDFs, and start chatting.
```
src/rag_chatbot/
├── __init__.py            # Package metadata
├── app.py                 # Streamlit UI entry point
├── cache_manager.py       # Semantic question-answer cache (disk + in-memory)
├── config.py              # Centralised constants and prompt templates
├── conversation.py        # LangChain conversational chain setup
├── document_processor.py  # PDF loading, chunking, and vector-store creation
├── models.py              # Cached Ollama LLM and embedding resources
├── question_parser.py     # Multi-question detection and text cleaning
├── response_processor.py  # Post-processing and citation formatting
├── session_state.py       # Streamlit session-state initialisation
└── stream_handler.py      # Token-by-token streaming callback
```
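The streaming callback in `stream_handler.py` follows a common pattern: accumulate tokens as the LLM emits them and re-render the chat message after each one. A framework-free sketch of that pattern — in the real module the class hooks into LangChain's callback interface (which delivers tokens via `on_llm_new_token`) and the display function is a Streamlit placeholder:

```python
class StreamHandler:
    """Collects tokens as they arrive and forwards the growing text to a display callback."""

    def __init__(self, display_fn):
        self.display_fn = display_fn  # e.g. a Streamlit placeholder's render method
        self.text = ""

    def on_llm_new_token(self, token, **kwargs):
        """Called once per generated token; re-renders the partial answer."""
        self.text += token
        self.display_fn(self.text)
```

Because the full partial answer is re-sent on every token, the UI always shows a consistent snapshot even if rendering lags behind generation.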
All tunable parameters live in `src/rag_chatbot/config.py`, including model names, chunk sizes, cache paths, similarity thresholds, and prompt templates. Modify that single file to adapt the chatbot to different models or retrieval strategies.
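As an illustration, a config module of this kind typically looks like the following — note that apart from the two model names given above, the constant names and values here are hypothetical, not the actual contents of `config.py`:

```python
# Hypothetical excerpt — constant names and tuning values are illustrative.
LLM_MODEL = "phi4-mini"               # Ollama chat model (default from the README)
EMBEDDING_MODEL = "all-minilm:l6-v2"  # Ollama embedding model (default from the README)

CHUNK_SIZE = 1000                     # characters per document chunk
CHUNK_OVERLAP = 200                   # overlap between adjacent chunks
CACHE_SIMILARITY_THRESHOLD = 0.92     # minimum cosine similarity for a cache hit
CACHE_PATH = "cache/answers.json"     # on-disk location of the semantic cache
```

Keeping every knob in one module means swapping models or retrieval parameters never requires touching the UI or chain-construction code.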
This project is licensed under the MIT License.