
RAG Document Q&A System

Try the Demo!

Recruiters: Test this system in 5 minutes! See RECRUITER_SETUP.md for quick setup instructions.

# Windows - Quick Start
demo_setup.bat    # One-time setup
start_demo.bat    # Launch demo (opens browser automatically)

Sample Documents: Ready-to-use test documents are included in the demo_samples/ folder.

Overview

RAG Document Q&A System is a production-grade Retrieval-Augmented Generation (RAG) platform for semantic document search and question answering. It ingests unstructured documents, generates vector embeddings, retrieves the most relevant context, and produces grounded answers with source attribution. The system emphasizes reliability, observability, and deployment flexibility.

Quick Start

  1. Install dependencies: pip install -r requirements.txt
  2. Configure API key: Set GEMINI_API_KEY in .env file
  3. Start server: uvicorn main:app --reload
  4. Open chat UI: http://localhost:8000
  5. View API docs: http://localhost:8000/docs

Features

  • Multi-format document ingestion: PDF, TXT, DOCX, Markdown
  • Vector storage and semantic search using ChromaDB
  • Contextual and accurate answer generation via Google Gemini (embeddings + LLM)
  • Similarity score thresholding and multi-document filtering
  • Source citation with confidence scoring
  • Structured logging (trace IDs) and error handling via custom exceptions and retries
  • Containerized with Docker for reproducible deployments; cloud deployment to AWS EC2
  • Automated testing (unit + integration) with coverage reporting
  • CI/CD pipelines (GitHub Actions) for build, test, and deploy automation

Architecture

A FastAPI application exposes REST endpoints built around a six-stage pipeline:

  1. Ingestion: File upload, validation, chunking
  2. Embedding: Gemini embedding generation
  3. Indexing: Vectors stored in ChromaDB (persistent directory)
  4. Query: Retrieve top-k relevant chunks (threshold + filters)
  5. Synthesis: Assemble context and generate answer via LLM
  6. Response: Return answer, cited sources, and metadata
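Stages 4 and 5 can be sketched roughly as follows. This is an illustrative outline, not the actual retrieval code in rag/; the helper names and the scored-chunk representation are assumptions:

```python
def top_k_chunks(scored, k=5, min_score=0.3):
    """Stage 4: keep chunks above the similarity threshold, best first, at most k.

    `scored` is a list of (chunk_text, similarity_score) pairs, e.g. as
    returned by a vector-store query.
    """
    kept = [(chunk, score) for chunk, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]

def build_context(results):
    """Stage 5: assemble the retrieved chunks into a prompt context block."""
    return "\n\n".join(chunk for chunk, _ in results)

# Example: one chunk falls below the 0.3 threshold; the rest are ranked.
scored = [("a", 0.9), ("b", 0.2), ("c", 0.55), ("d", 0.7), ("e", 0.31)]
results = top_k_chunks(scored, k=3, min_score=0.3)
print([chunk for chunk, _ in results])  # → ['a', 'd', 'c']
```

The defaults (k=5, min_score=0.3) mirror the TOP_K_RESULTS and MIN_SIMILARITY_SCORE configuration values described below.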

Key Modules:

  • api/: Routing, request/response schemas, middleware
  • rag/: Ingestion, pipeline orchestration, retrieval logic
  • core/: Configuration management, LLM + embedding clients
  • database/: Vector store abstraction (ChromaDB)
  • utils/: Logging, retry strategy, custom exceptions
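The retry strategy in utils/ is not shown here, but a generic sketch of the pattern — exponential backoff with jitter around a flaky call, such as an embedding API request — looks like this (decorator name and parameters are illustrative):

```python
import random
import time
from functools import wraps

def retry(max_attempts=3, base_delay=0.5):
    """Retry a flaky call with exponential backoff plus a small jitter."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the error
                    # Backoff doubles each attempt: base, 2*base, 4*base, ...
                    time.sleep(base_delay * 2 ** (attempt - 1) + random.random() * 0.1)
        return wrapper
    return decorator

calls = {"n": 0}

@retry(max_attempts=3, base_delay=0.0)
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient embedding API error")
    return "ok"

print(flaky())  # succeeds on the third attempt, prints "ok"
```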

Data Flow

Upload → Validation → Chunking → Embedding → Vector Store → Query → Retrieval → Answer Generation → Response
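The chunking step in this flow splits each document into overlapping windows, governed by the CHUNK_SIZE and CHUNK_OVERLAP settings described under Configuration. A minimal word-based sketch (the real pipeline may chunk by tokens or characters instead):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size chunks where consecutive chunks share
    `overlap` words, so context at chunk boundaries is not lost.
    Assumes chunk_size > overlap."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # final chunk reached the end of the document
    return chunks

# A 1200-word document with the default settings yields 3 chunks:
doc = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_text(doc, chunk_size=500, overlap=50)
print(len(chunks))  # → 3
```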

Installation

Prerequisites: Python 3.11+, Google Gemini API key

python -m venv venv
./venv/Scripts/Activate.ps1      # Windows PowerShell (Linux/macOS: source venv/bin/activate)
pip install -r requirements.txt
copy .env.example .env           # Windows (Linux/macOS: cp); set GEMINI_API_KEY in .env

Configuration

Environment variables (in .env):

Variable             | Purpose                        | Default
GEMINI_API_KEY       | LLM + embedding access         | (required)
CHUNK_SIZE           | Text chunk token size          | 500
CHUNK_OVERLAP        | Overlap between chunks         | 50
TOP_K_RESULTS        | Max retrieved chunks           | 5
MIN_SIMILARITY_SCORE | Similarity filter threshold    | 0.3
MAX_UPLOAD_SIZE      | Max file size in bytes (10 MB) | 10000000
LOG_LEVEL            | Log verbosity                  | INFO
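A minimal sketch of how these variables could be read at startup, with the defaults from the table above. The actual project may use a settings library (e.g. pydantic) instead; the function name is illustrative:

```python
import os

def load_settings(env=None):
    """Read configuration from environment variables, applying the
    documented defaults. GEMINI_API_KEY has no default and must be set."""
    env = os.environ if env is None else env
    return {
        "GEMINI_API_KEY": env["GEMINI_API_KEY"],  # required: raises KeyError if unset
        "CHUNK_SIZE": int(env.get("CHUNK_SIZE", 500)),
        "CHUNK_OVERLAP": int(env.get("CHUNK_OVERLAP", 50)),
        "TOP_K_RESULTS": int(env.get("TOP_K_RESULTS", 5)),
        "MIN_SIMILARITY_SCORE": float(env.get("MIN_SIMILARITY_SCORE", 0.3)),
        "MAX_UPLOAD_SIZE": int(env.get("MAX_UPLOAD_SIZE", 10_000_000)),
        "LOG_LEVEL": env.get("LOG_LEVEL", "INFO"),
    }

# With only the required key set, everything else falls back to defaults:
settings = load_settings({"GEMINI_API_KEY": "test-key"})
print(settings["CHUNK_SIZE"], settings["LOG_LEVEL"])  # → 500 INFO
```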

Running Locally

uvicorn main:app --reload --host 0.0.0.0 --port 8000

API Endpoints

Method | Endpoint                   | Description
POST   | /api/v1/documents/upload   | Upload and ingest a document
GET    | /api/v1/documents          | List ingested documents
DELETE | /api/v1/documents/{doc_id} | Delete a document
POST   | /api/v1/query              | Submit a question and get an answer
GET    | /api/v1/health             | Health check
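A sketch of querying the API from a client, using only the standard library. The request field name ("question") is an assumption — the authoritative schema is served at http://localhost:8000/docs on a running instance:

```python
import json
from urllib import request

def ask(question, base_url="http://localhost:8000"):
    """POST a question to /api/v1/query and return the parsed JSON response.

    Field names here are illustrative; check the OpenAPI docs at /docs for
    the actual request and response schemas.
    """
    body = json.dumps({"question": question}).encode("utf-8")
    req = request.Request(
        f"{base_url}/api/v1/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Requires the server to be running locally (uvicorn main:app --reload).
    print(ask("What does the uploaded report conclude?"))
```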

Testing

Unit and Integration Tests

# Run all tests
pytest tests -v

# Run with coverage report
pytest tests -v --cov=src --cov-report=html
# View report (Windows; macOS: open, Linux: xdg-open)
start htmlcov/index.html

Logging and Error Handling Tests

Test the structured JSON logging, trace IDs, custom exceptions, and retry mechanisms:

# Automated test suite (recommended)
python test_logging_errors.py

For a comprehensive testing guide, see TESTING_LOGGING_ERRORS.md
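The structured JSON logging with trace IDs can be approximated by a formatter like the one below. This is a generic sketch, not the project's actual utils/ implementation; the field names are assumptions:

```python
import json
import logging
import uuid

class JsonFormatter(logging.Formatter):
    """Render each log record as a JSON object carrying a trace ID, so log
    lines from one request can be correlated across the pipeline."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        })

logger = logging.getLogger("rag")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# A fresh trace ID per request ties all of its log lines together:
trace_id = uuid.uuid4().hex
logger.info("query received", extra={"trace_id": trace_id})
```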

Evaluation

Run the full evaluation suite to benchmark system performance:

python evaluate.py

This generates evaluation_report.json with the following metrics:

  • NDCG@5: Retrieval ranking quality (Normalized Discounted Cumulative Gain)
  • Similarity scores: Average relevance of retrieved documents
  • Response time: End-to-end latency analysis
  • Topic coverage: How well answers address expected topics
  • SLA compliance: Percentage of queries meeting the 15-second threshold
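For reference, NDCG@k divides the discounted cumulative gain of the actual ranking by that of the ideal (relevance-sorted) ranking, so a perfect ordering scores 1.0. A small self-contained sketch (evaluate.py's own implementation may differ):

```python
import math

def ndcg_at_k(relevances, k=5):
    """NDCG@k for a ranked list of graded relevance scores.

    DCG discounts each relevance by log2(rank + 1); dividing by the DCG of
    the ideal ordering normalizes the result into [0, 1].
    """
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Swapping one highly relevant result down the ranking costs a little NDCG:
print(round(ndcg_at_k([3, 2, 3, 0, 1], k=5), 3))  # → 0.972
```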

Deployment

Docker Compose

docker-compose -f deployment/docker/docker-compose.yml up -d

Docker Image

docker build -f deployment/docker/Dockerfile -t rag-qa:latest .
docker run -p 8000:8000 --env-file .env rag-qa:latest

Cloud (AWS EC2)

  • Launch an Ubuntu 22.04 EC2 instance
  • Clone the repository and run deployment/aws/ec2-setup.sh
  • CI/CD: Use GitHub Actions for build/test/deploy

Directory Structure

src/
  api/          # Routes, schemas, middleware
  core/         # Settings, embeddings, LLM client
  rag/          # Pipeline, ingestion, retrieval
  database/     # Vector store wrapper (ChromaDB)
  utils/        # Logging, exceptions, retry
tests/          # Unit and integration tests
deployment/     # Docker and AWS assets
static/         # Front-end assets (chat UI)
main.py         # FastAPI entry point
requirements.txt

About

RAG Document Q&A system with semantic search and source citations. Built with FastAPI, Google Gemini AI, and ChromaDB. Features structured logging, input validation, Docker deployment, a real-time chat UI, and a ready-to-run demo with sample documents.
