hal-29/semantic-search-engine-app

Semantic Search Engine (SSE)

A FastAPI-powered semantic search application with React frontend, featuring document uploads, vector embeddings, and LLM-powered question answering.

✨ Features

  • Document Processing: Upload PDFs, Word docs, and text files
  • Semantic Search: Find relevant content using vector embeddings
  • RAG Integration: Generate AI-powered answers from your documents
  • Session Management: Isolate document sets per session
  • Modern Stack: FastAPI + React + FAISS + HuggingFace models
| Feature              | Description                     |
|----------------------|---------------------------------|
| Multi-format Support | PDF, Word, and text files       |
| Hybrid Search        | Semantic + keyword search       |
| RAG Integration      | Context-aware answers           |
| Performance          | Optimized chunking & indexing   |

🛠️ Installation

Run with Docker

   docker compose up --build
   # Access the app at http://localhost:8000

Development Setup

   # Linux / Mac
   python -m venv venv
   source venv/bin/activate 

   # Windows
   venv\Scripts\activate

   # run
   pip install -r requirements.txt
   uvicorn main:app --reload

Frontend

   cd frontend
   pnpm install
   pnpm run dev

API Endpoints

| Endpoint       | Method | Description                  |
|----------------|--------|------------------------------|
| `/api/session` | GET    | Create a new session         |
| `/api/upload`  | POST   | Upload files for processing  |
| `/api/search`  | GET    | Perform a semantic search    |
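A hypothetical client sketch for the endpoints above. The paths come from the table; the query-parameter names (`session_id`, `q`) and the JSON response shape are assumptions, not taken from the source.

```python
# Minimal client sketch, assuming the server from the Docker/dev setup
# is running on localhost:8000. Parameter names are illustrative.
import json
import urllib.request
from urllib.parse import urlencode

BASE = "http://localhost:8000"

def search_url(session_id: str, query: str) -> str:
    """Build the GET /api/search URL for a given session and query."""
    return f"{BASE}/api/search?" + urlencode({"session_id": session_id, "q": query})

def run_search(session_id: str, query: str) -> dict:
    """Perform the request and decode the JSON body (assumed shape)."""
    with urllib.request.urlopen(search_url(session_id, query)) as resp:
        return json.loads(resp.read())
```

`search_url` can be inspected without a running server; `run_search` needs one.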

How the App Operates

  1. You Open the App

    The app creates a unique session ID for you

    Behind the scenes:
    → Generates a random ID
    → Creates an empty folder named after that ID to store your files.

  2. You Upload Files

    You drag/drop PDFs, Word docs, or text files into the app.

    Behind the scenes:
    → Saves files to your session folder.
    → Breaks each file into small text chunks (e.g., 1-2 sentences each).
    → Stores these chunks in a list with metadata (file name, page number, etc.).

  3. The App "Understands" Your Files

    The app converts every text chunk into number sequences (vectors) using AI.

    Behind the scenes:
    → Uses a pre-trained model (all-MiniLM-L6-v2) to generate vectors.
    → Builds a searchable index using FAISS (Facebook’s similarity-search library).

  4. You Search for Something

    Behind the scenes:
    → Converts your query into a vector (a numeric “barcode”) using the same AI model.
    → Compares it against all document vectors to find the closest matches.
    → Reranks results using a second AI (cross-encoder) to prioritize relevance.

  5. You Get Results

    The app shows you:
    → Direct excerpts from your documents (sorted by relevance).
    → An AI-generated summary (if RAG is enabled), combining the top matches into a natural answer.
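The chunk → embed → index → search flow in steps 2–5 can be sketched with a toy, dependency-free example. The real app uses `all-MiniLM-L6-v2` embeddings and a FAISS index; here a simple bag-of-words vector and brute-force cosine similarity stand in for both, just to make the flow concrete.

```python
# Toy sketch of the pipeline above. Word-count vectors replace the
# neural embedding model, and a sorted list replaces the FAISS index.
import math
import re
from collections import Counter

def chunk(text: str) -> list[str]:
    # Step 2: break a document into small chunks (here: sentences).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def embed(chunk_text: str) -> Counter:
    # Step 3: turn a chunk into a vector (here: lowercase word counts).
    return Counter(re.findall(r"\w+", chunk_text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Similarity between two sparse vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    # Step 4: embed the query and rank chunks by similarity to it.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

doc = "FAISS builds a vector index. React renders the frontend. Uploads are chunked."
print(search("how is the vector index built?", chunk(doc), top_k=1))
# → ['FAISS builds a vector index.']
```

The real system also reranks the top hits with a cross-encoder (step 4) before handing them to the LLM (step 5); this sketch stops at the initial ranking.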

Simple Analogy

I. Session ID = Your private locker.

II. File Upload = Putting books into the locker.

III. Chunking = Tearing out pages and highlighting paragraphs.

IV. Vectors = Giving each paragraph a numeric fingerprint that captures its meaning.

V. Search = Finding the paragraphs whose fingerprints look most like your question’s.

VI. AI Answer = A friend (the LLM) reads those paragraphs and explains the answer to you.
