A local Retrieval-Augmented Generation (RAG) system that lets you chat with your PDF documents using large language models (LLMs) running on your machine through Ollama.
This project implements a full-stack RAG system with the following components:
- Web UI: A Streamlit-based interface for uploading documents, managing your document library, and chatting with your documents
- API Backend: A FastAPI service that handles document processing, vector storage, and LLM interactions
- Database Storage: MongoDB for storing documents and conversation history
- Vector Storage: ChromaDB for storing and searching document embeddings
- LLM Integration: Uses Ollama to run local language models for embeddings and completion
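The retrieval path through these components can be sketched in a few lines of Python. This is an illustrative toy, not the project's code: a hash-based stub stands in for the Ollama embedding model (`nomic-embed-text` in the real system), and a plain list stands in for ChromaDB, so the example runs with no services.

```python
import hashlib
import math

def embed(text: str) -> list[float]:
    # Stub embedder: the real system asks Ollama for an embedding.
    # Here a tiny fake vector is derived from a hash so the example
    # is self-contained (it is NOT semantically meaningful).
    digest = hashlib.sha256(text.lower().encode()).digest()
    return [b / 255 for b in digest[:8]]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)

# "Vector store": chunks of an uploaded PDF with precomputed embeddings
# (ChromaDB plays this role in the real system).
chunks = ["Invoices are due within 30 days.", "The warranty covers two years."]
store = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question: str, k: int = 1) -> list[str]:
    # Embed the question, rank stored chunks by similarity, return top-k.
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# The retrieved chunks are then prepended to the prompt sent to the
# local LLM via Ollama to ground its answer.
context = retrieve("When are invoices due?")
```

A real embedding model places semantically related texts near each other; the hash stub only makes the pipeline mechanics visible.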
Prerequisites:
- Docker and Docker Compose

This section explains how to run the app using containerized services.
- Clone this repository:

  ```shell
  git clone <repository-url>
  cd local-llm-rag
  ```

- Start the application:

  ```shell
  docker-compose up --build
  ```
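For reference, a minimal `docker-compose.yml` for this stack might look like the sketch below. This is an assumption reconstructed from the service names and ports used elsewhere in this README (`mongodb`, `chroma`, ports 8000 and 8501); the compose file in the repository is authoritative.

```yaml
# Hypothetical sketch -- consult the repository's docker-compose.yml.
services:
  mongodb:
    image: mongo            # document and conversation storage
  chroma:
    image: chromadb/chroma  # vector storage for embeddings
  api:
    build: ./api            # FastAPI backend
    ports:
      - "8000:8000"
    depends_on: [mongodb, chroma]
  ui:
    build: ./ui             # Streamlit front end
    ports:
      - "8501:8501"
    depends_on: [api]
```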
This section explains how to run the API and UI components locally while using containerized MongoDB and ChromaDB.
- Install Poetry if you haven't already:

  ```shell
  pip install poetry
  ```

- Install API dependencies:

  ```shell
  cd api
  poetry install
  cd ..
  ```

- Install UI dependencies:

  ```shell
  cd ui
  poetry install
  cd ..
  ```

- Install Ollama and pull the required models:

  ```shell
  # Install Ollama
  curl -fsSL https://ollama.com/install.sh | sh

  # Pull the required models
  ollama pull llama2
  ollama pull nomic-embed-text
  ```

- Start the MongoDB and ChromaDB containers:

  ```shell
  docker-compose up mongodb chroma --build
  ```

- Start the API server:
  ```shell
  cd api
  poetry run uvicorn src.main:app --reload --host 0.0.0.0 --port 8000
  ```

- Start the UI server (in another terminal):

  ```shell
  cd ui
  poetry run streamlit run src/app.py
  ```

- Access the application:
  - Web UI: http://localhost:8501
  - API Documentation: http://localhost:8000/docs
Note: Make sure you have Ollama installed and running locally with your desired models. The API will connect to Ollama on the default address (http://localhost:11434).
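To check that assumption before starting the API, a small reachability probe (illustrative, standard library only) might look like:

```python
import urllib.request
import urllib.error

def ollama_is_running(base_url: str = "http://localhost:11434") -> bool:
    """Return True if an Ollama server answers at base_url.

    Ollama's root endpoint replies with HTTP 200 ("Ollama is running")
    when the daemon is up.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_running())
```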
- Open your browser and navigate to http://localhost:8501 to access the UI
- Upload PDF documents using the file uploader in the sidebar
- Ask questions about your documents in the chat interface
- Start a new conversation or delete documents as needed using the sidebar controls
- `api/`: FastAPI backend service
  - `config/`: Configuration settings
  - `src/`: Source code for the API components
- `ui/`: Streamlit front-end application
You can modify the default LLM settings in the `api/config/settings.py` file:
- Change the embedding model: `embedding_model`
- Change the LLM model: `llm_model`
- Adjust chunk size and overlap for document processing
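The effect of chunk size and overlap can be illustrated with a minimal sliding-window chunker. This is a simplified stand-in for whatever splitter the API actually uses, shown only to explain the two settings:

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap by
    `overlap` characters, so content cut at a chunk boundary still
    appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

demo = chunk_text("abcdefghij", chunk_size=4, overlap=2)
# → ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
# Each chunk repeats the last 2 characters of the previous one.
```

A larger overlap preserves more context across chunk boundaries at the cost of storing and searching more (partly redundant) embeddings.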