Local LLM RAG System

A local Retrieval-Augmented Generation (RAG) system that lets you chat with your PDF documents using large language models (LLMs) running on your own machine through Ollama.

Overview

This project implements a full-stack RAG system with the following components (a sketch of how they fit together follows the list):

  • Web UI: A Streamlit-based interface for uploading documents, managing your document library, and chatting with your documents
  • API Backend: A FastAPI service that handles document processing, vector storage, and LLM interactions
  • Database Storage: MongoDB for storing documents and conversation history
  • Vector Storage: ChromaDB for storing and searching document embeddings
  • LLM Integration: Uses Ollama to run local language models for embeddings and completion
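
For orientation, here is a minimal sketch of how a query flows through these components. It assumes Ollama's HTTP API and a ChromaDB server on their default addresses; the collection name, ChromaDB port, and prompt wording are illustrative, not taken from this repo's code:

```python
# Hypothetical sketch of the RAG query flow; not the project's actual code.
# Assumptions: ChromaDB reachable on localhost:8000, a collection named
# "documents", and Ollama at its default address.
import chromadb
import requests

OLLAMA = "http://localhost:11434"

def answer(question: str) -> str:
    # Embed the question with the same model used to embed the documents.
    emb = requests.post(
        f"{OLLAMA}/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": question},
    ).json()["embedding"]

    # Retrieve the most similar document chunks from ChromaDB.
    collection = chromadb.HttpClient(host="localhost", port=8000).get_collection("documents")
    chunks = collection.query(query_embeddings=[emb], n_results=4)["documents"][0]

    # Ask the local LLM to answer from the retrieved context.
    prompt = "Answer using only this context:\n" + "\n---\n".join(chunks) + f"\n\nQuestion: {question}"
    result = requests.post(
        f"{OLLAMA}/api/generate",
        json={"model": "llama2", "prompt": prompt, "stream": False},
    ).json()
    return result["response"]
```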

Prerequisites

  • Docker and Docker Compose (for the containerized setup, and for the MongoDB/ChromaDB containers when running locally)
  • Python with Poetry, plus Ollama (when running the API and UI locally)

Running in Docker

This section explains how to run the app using containerized services.

Installation

  1. Clone this repository:
git clone <repository-url>
cd local-llm-rag

Run

  1. Start the application:
docker-compose up --build

Running locally

This section explains how to run the API and UI components locally while using containerized MongoDB and ChromaDB.

Installation

  1. Install Poetry if you haven't already:
pip install poetry
  2. Install API dependencies:
cd api
poetry install
cd ..
  3. Install UI dependencies:
cd ui
poetry install
cd ..
  4. Install Ollama and pull the required models (a verification snippet follows these steps):
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull the required models
ollama pull llama2
ollama pull nomic-embed-text
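
Before starting the services, you can confirm that Ollama is running and both models are available by querying its tags endpoint (a convenience check, not part of the repo):

```python
# Check that Ollama is up and the two required models have been pulled.
import requests

tags = requests.get("http://localhost:11434/api/tags").json()
names = [model["name"] for model in tags["models"]]
for required in ("llama2", "nomic-embed-text"):
    ok = any(name.startswith(required) for name in names)
    print(f"{required}: {'ok' if ok else 'MISSING - run: ollama pull ' + required}")
```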

Run

  1. Start the MongoDB and ChromaDB containers:
docker-compose up mongodb chroma --build
  2. Start the API server:
cd api
poetry run uvicorn src.main:app --reload --host 0.0.0.0 --port 8000
  3. Start the UI server (in another terminal):
cd ui
poetry run streamlit run src/app.py
  4. Access the application: the UI is served at http://localhost:8501 and the API at http://localhost:8000.

Note: Make sure Ollama is installed and running locally with your desired models. The API connects to Ollama at its default address (http://localhost:11434).
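
Once everything is up, a quick smoke test can confirm that all three services respond. This assumes the default ports above and that FastAPI's interactive /docs page has not been disabled:

```python
# Smoke test: verify the API, the UI, and Ollama are all reachable.
import requests

services = [
    ("API", "http://localhost:8000/docs"),   # FastAPI's default docs page
    ("UI", "http://localhost:8501"),         # Streamlit's default port
    ("Ollama", "http://localhost:11434"),    # responds "Ollama is running"
]
for name, url in services:
    try:
        code = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {code}")
    except requests.exceptions.ConnectionError:
        print(f"{name}: not reachable at {url}")
```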

Usage

  1. Open your browser and navigate to http://localhost:8501 to access the UI

  2. Upload PDF documents using the file uploader in the sidebar

  3. Ask questions about your documents in the chat interface

  4. Start a new conversation or delete documents as needed using the sidebar controls

Project Structure

  • api/: FastAPI backend service
    • config/: Configuration settings
    • src/: Source code for the API components
  • ui/: Streamlit front-end application

Customization

You can modify the default LLM settings in the api/config/settings.py file (a sketch of the file follows this list):

  • Change the embedding model: embedding_model
  • Change the LLM model: llm_model
  • Adjust chunk size and overlap for document processing
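
For reference, the settings file might look roughly like the sketch below. The embedding_model and llm_model fields are named in this README; the chunking fields, the Ollama URL field, and the use of pydantic-settings are assumptions, so check the actual file:

```python
# Hypothetical sketch of api/config/settings.py; the real file may differ.
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    embedding_model: str = "nomic-embed-text"   # model used for document embeddings
    llm_model: str = "llama2"                   # model used for chat completions
    chunk_size: int = 1000                      # characters per chunk (illustrative)
    chunk_overlap: int = 200                    # overlap between chunks (illustrative)
    ollama_base_url: str = "http://localhost:11434"

settings = Settings()  # field values can be overridden via environment variables
```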
