rohitbansal2005/Search-Engine

C++ Mini Search Engine (Client-Server)

A beginner-friendly multi-threaded search engine project in C++ that demonstrates:

  • DSA: Inverted index (unordered_map<string, vector<int>>)
  • Relevance Ranking: Search results sorted by keyword match score
  • OS: Multi-threaded request handling with std::thread
  • CN: TCP/IP client-server communication using sockets (Winsock on Windows)
  • DBMS: Persistent file-based storage (data/documents.db)
  • System Design: Clear separation of client, server, indexing, and storage modules
  • Libraries: Optional Boost and Abseil integration for cleaner parsing/utilities
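The inverted index and match-score ranking listed above can be sketched as follows. This is a minimal illustration, not the repository's code: the names tokenize, InvertedIndex, add, and search are hypothetical, tokens are split on whitespace only (punctuation is not stripped), and the score is assumed to be the number of distinct query words a document contains.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: lowercase the text and split it on whitespace.
std::vector<std::string> tokenize(const std::string& text) {
    std::string lowered;
    lowered.reserve(text.size());
    for (unsigned char c : text) {
        lowered.push_back(static_cast<char>(std::tolower(c)));
    }
    std::istringstream in(lowered);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);
    return words;
}

// Sketch of an inverted index: each word maps to the IDs of documents containing it.
class InvertedIndex {
public:
    void add(int doc_id, const std::string& text) {
        std::unordered_set<std::string> seen;  // record each word once per document
        for (const auto& word : tokenize(text)) {
            if (seen.insert(word).second) postings_[word].push_back(doc_id);
        }
    }

    // Returns document IDs ordered by how many distinct query words they match
    // (ties broken by lower ID). require_all = true gives AND, false gives OR.
    std::vector<int> search(const std::string& query, bool require_all) const {
        std::unordered_set<std::string> terms;
        for (const auto& w : tokenize(query)) terms.insert(w);

        std::unordered_map<int, int> score;
        for (const auto& term : terms) {
            auto it = postings_.find(term);
            if (it == postings_.end()) continue;
            for (int id : it->second) ++score[id];
        }

        std::vector<std::pair<int, int>> ranked;  // (doc ID, score)
        for (const auto& entry : score) {
            bool matches_all = entry.second == static_cast<int>(terms.size());
            if (!require_all || matches_all) ranked.push_back(entry);
        }
        std::sort(ranked.begin(), ranked.end(),
                  [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                      return a.second != b.second ? a.second > b.second
                                                  : a.first < b.first;
                  });

        std::vector<int> ids;
        for (const auto& r : ranked) ids.push_back(r.first);
        return ids;
    }

private:
    std::unordered_map<std::string, std::vector<int>> postings_;
};
```

With documents 1 ("network sockets") and 2 ("network thread basics") indexed, an OR search for "network thread" ranks document 2 ahead of document 1, and an AND search returns only document 2.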

Core Tech Stack

  • C++17
  • CMake
  • TCP/IP sockets (Windows + Linux)
  • STL containers + threading
  • Boost (optional, auto-detected)
  • Abseil (optional, auto-detected)

Architecture

  • server: Accepts TCP connections and handles each client in a separate thread.
  • client: Sends add/search commands to server.
  • search_engine: Handles indexing + storage logic.
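The thread-per-client model described above might look roughly like this. The sketch leaves sockets out so that it stays portable: DocumentStore, handle_client, and run_clients are hypothetical names, and each "client" is simulated by a loop of add requests. The assumption it illustrates is that shared state must be guarded (here by a std::mutex) because several client threads can write at once.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared store; the mutex serializes access from concurrent client threads.
class DocumentStore {
public:
    int add(const std::string& title) {
        std::lock_guard<std::mutex> lock(mutex_);
        titles_.push_back(title);
        return static_cast<int>(titles_.size());  // 1-based ID (an assumption)
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return titles_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::string> titles_;
};

// Stand-in for a per-connection handler; a real one would read commands from a socket.
void handle_client(DocumentStore& store, int client_id, int requests) {
    for (int i = 0; i < requests; ++i) {
        store.add("client " + std::to_string(client_id) + " doc " + std::to_string(i));
    }
}

// Spawn one thread per simulated client, wait for all of them, and report the total.
std::size_t run_clients(int clients, int requests_each) {
    DocumentStore store;
    std::vector<std::thread> workers;
    for (int c = 0; c < clients; ++c) {
        workers.emplace_back(handle_client, std::ref(store), c, requests_each);
    }
    for (auto& w : workers) w.join();
    return store.size();
}
```

A long-running server would more likely detach each thread or track it for shutdown instead of joining immediately; joining here just keeps the example deterministic.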

Command Protocol

  • ADD|title|content
  • SEARCH|mode|query where mode is AND or OR
  • PING (health probe, returns PONG)
  • QUIT
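A parser for these pipe-delimited commands could be sketched as below. Command, split_fields, and parse_command are hypothetical names. The sketch also assumes that a line carries at most three fields, so any extra '|' characters end up inside the last field (the content or the query); the README does not say how the real server treats that case.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation of one parsed protocol line.
struct Command {
    std::string name;               // "ADD", "SEARCH", "PING", or "QUIT"
    std::vector<std::string> args;  // fields after the command name
    bool valid = false;
};

// Split on '|' into at most max_fields pieces; the last piece keeps any remaining '|'.
std::vector<std::string> split_fields(const std::string& line, std::size_t max_fields) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (fields.size() + 1 < max_fields) {
        std::size_t bar = line.find('|', start);
        if (bar == std::string::npos) break;
        fields.push_back(line.substr(start, bar - start));
        start = bar + 1;
    }
    fields.push_back(line.substr(start));
    return fields;
}

// Validate the field count (and the search mode) for each known command.
Command parse_command(const std::string& line) {
    std::vector<std::string> fields = split_fields(line, 3);
    Command cmd;
    cmd.name = fields[0];
    cmd.args.assign(fields.begin() + 1, fields.end());

    if (cmd.name == "PING" || cmd.name == "QUIT") {
        cmd.valid = cmd.args.empty();
    } else if (cmd.name == "ADD") {
        cmd.valid = cmd.args.size() == 2;  // title, content
    } else if (cmd.name == "SEARCH") {
        cmd.valid = cmd.args.size() == 2 &&
                    (cmd.args[0] == "AND" || cmd.args[0] == "OR");
    }
    return cmd;
}
```

For example, parse_command("SEARCH|OR|network thread") yields name "SEARCH" with args "OR" and "network thread", while "SEARCH|XOR|x" is marked invalid and would map to an ERROR|message response.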

Web UI + HTTP API

The server also serves a browser UI and an HTTP API on the same port.

  • Open: http://localhost:8080/
  • Health endpoint: GET /health
  • Search API: GET /api/search?q=network+thread&mode=OR

This works alongside the existing TCP client protocol.
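To serve a request such as GET /api/search?q=network+thread&mode=OR, the server has to pull q and mode out of the request target. The sketch below shows one way to do that with standard form-URL decoding ('+' as a space, %XX as a byte). url_decode and parse_query are hypothetical names, not functions taken from the repository.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Decode form-URL-encoded text: '+' becomes a space, %XX becomes the byte 0xXX.
std::string url_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out.push_back(' ');
        } else if (in[i] == '%' && i + 2 < in.size() &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out.push_back(static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16)));
            i += 2;
        } else {
            out.push_back(in[i]);  // ordinary character, or a malformed escape kept as-is
        }
    }
    return out;
}

// Extract key/value pairs from the part of the request target after '?'.
std::map<std::string, std::string> parse_query(const std::string& target) {
    std::map<std::string, std::string> params;
    std::size_t q = target.find('?');
    if (q == std::string::npos) return params;

    std::size_t start = q + 1;
    while (start <= target.size()) {
        std::size_t amp = target.find('&', start);
        if (amp == std::string::npos) amp = target.size();
        std::string pair = target.substr(start, amp - start);
        if (!pair.empty()) {
            std::size_t eq = pair.find('=');
            std::string key = url_decode(pair.substr(0, eq));
            std::string value =
                eq == std::string::npos ? "" : url_decode(pair.substr(eq + 1));
            params[key] = value;
        }
        start = amp + 1;
    }
    return params;
}
```

Applied to the search URL above, parse_query returns q = "network thread" and mode = "OR", which correspond to the query and mode fields of the TCP SEARCH command.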

Response examples (TCP protocol):

  • OK|Document added with ID 3
  • PONG
  • RESULTS|2
  • DOC|3|Title|Content
  • END
  • ERROR|message
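The multi-line search response (a RESULTS|count header, one DOC line per hit, then END) could be produced by a small formatter like the one below. SearchHit and format_results are hypothetical names, and the newline line terminator is an assumption; the README does not specify how lines are delimited on the wire.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one matching document.
struct SearchHit {
    int id;
    std::string title;
    std::string content;
};

// Build the response text: header with the hit count, one DOC line per hit, END marker.
std::string format_results(const std::vector<SearchHit>& hits) {
    std::string out = "RESULTS|" + std::to_string(hits.size()) + "\n";
    for (const auto& hit : hits) {
        out += "DOC|" + std::to_string(hit.id) + "|" + hit.title + "|" + hit.content + "\n";
    }
    out += "END\n";
    return out;
}
```

Sending the count first lets a client allocate space up front, while the END marker lets it detect a truncated response.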

Example searches:

  • SEARCH|OR|network thread
  • SEARCH|AND|database indexing

Build (Windows + CMake)

cmake -S . -B build
cmake --build build --config Release

If Boost/Abseil are installed and discoverable, CMake enables them automatically.

Production Deployment (Docker)

Build and run with Docker Compose:

docker compose up -d --build

Check logs:

docker compose logs -f search-server

Stop service:

docker compose down

For complete free production deployment on Oracle VM, see:

Helper scripts:

Production details:

  • Server listens on 0.0.0.0:8080
  • Host port mapping: 8080:8080
  • Persistent storage mounted via volume: ./data:/app/data
  • Restart policy: unless-stopped
  • Healthcheck uses PING/PONG over TCP to verify server responsiveness

Run

  1. Start server:
.\build\Release\server.exe
  2. In another terminal, start client:
.\build\Release\client.exe

Data Persistence

Documents are stored in:

  • data/documents.db

Each line format: id<TAB>title<TAB>content
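Reading and writing that tab-separated line format might look like the sketch below. Document, to_line, and from_line are hypothetical names. The sketch assumes that titles contain no tab characters and that neither field contains a newline, since the README does not describe any escaping.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical in-memory form of one stored record.
struct Document {
    int id = 0;
    std::string title;
    std::string content;
};

// Serialize as id<TAB>title<TAB>content (no trailing newline).
std::string to_line(const Document& doc) {
    return std::to_string(doc.id) + '\t' + doc.title + '\t' + doc.content;
}

// Parse one line into doc; returns false if the line is malformed.
bool from_line(const std::string& line, Document& doc) {
    std::size_t first_tab = line.find('\t');
    if (first_tab == std::string::npos) return false;
    std::size_t second_tab = line.find('\t', first_tab + 1);
    if (second_tab == std::string::npos) return false;

    try {
        doc.id = std::stoi(line.substr(0, first_tab));
    } catch (const std::exception&) {
        return false;  // ID field is not a number (or is out of range)
    }
    doc.title = line.substr(first_tab + 1, second_tab - first_tab - 1);
    doc.content = line.substr(second_tab + 1);  // any further tabs stay in content
    return true;
}
```

On startup, a loader could call from_line on every line of data/documents.db read with std::getline, skip lines that fail to parse, and re-index the rest.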

A starter dataset is preloaded in data/documents.db for a quick demo.

About

A multi-threaded C++ search engine with TCP client-server architecture, inverted indexing, relevance ranking, and optional HTTP API.
