A beginner-friendly multi-threaded search engine project in C++ that demonstrates:
- DSA: Inverted index (`unordered_map<string, vector<int>>`)
- Relevance Ranking: Search results sorted by keyword match score
- OS: Multi-threaded request handling with `std::thread`
- CN: TCP/IP client-server communication using Winsock
- DBMS: Persistent file-based storage (`data/documents.db`)
- System Design: Clear separation of client, server, indexing, and storage modules
- Libraries: Optional Boost and Abseil integration for cleaner parsing/utilities
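
The inverted index and match-score ranking listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `InvertedIndex` class, the `tokenize` helper, and the choice of "number of distinct query terms matched" as the score are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: splits on whitespace and lowercases each word.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        std::transform(word.begin(), word.end(), word.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        tokens.push_back(word);
    }
    return tokens;
}

// Hypothetical class: maps each term to the IDs of documents containing it.
class InvertedIndex {
public:
    void add(int doc_id, const std::string& text) {
        assert(doc_id >= 0);
        for (const std::string& term : tokenize(text)) {
            std::vector<int>& postings = index_[term];
            // Record each document at most once per term.
            if (std::find(postings.begin(), postings.end(), doc_id) == postings.end()) {
                postings.push_back(doc_id);
            }
        }
    }

    // Returns (doc_id, score) pairs, best score first.
    // Assumed score: number of distinct query terms found in the document.
    std::vector<std::pair<int, int>> search(const std::string& query) const {
        std::vector<std::string> terms = tokenize(query);
        std::sort(terms.begin(), terms.end());
        terms.erase(std::unique(terms.begin(), terms.end()), terms.end());

        std::unordered_map<int, int> scores;
        for (const std::string& term : terms) {
            auto it = index_.find(term);
            if (it == index_.end()) continue;
            for (int id : it->second) ++scores[id];
        }

        std::vector<std::pair<int, int>> ranked(scores.begin(), scores.end());
        std::sort(ranked.begin(), ranked.end(), [](const auto& a, const auto& b) {
            if (a.second != b.second) return a.second > b.second;  // higher score first
            return a.first < b.first;                              // ties: lower ID first
        });
        return ranked;
    }

private:
    std::unordered_map<std::string, std::vector<int>> index_;
};
```

With this sketch, a document that matches two query words ranks above one that matches only one of them.
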
- C++17
- CMake
- TCP/IP sockets (Windows + Linux)
- STL containers + threading
- Boost (optional, auto-detected)
- Abseil (optional, auto-detected)
- `server`: Accepts TCP connections and handles each client in a separate thread.
- `client`: Sends add/search commands to server.
- `search_engine`: Handles indexing + storage logic.
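
Because each client is handled on its own thread, any state shared between handlers needs synchronization. The sketch below simulates that pattern without sockets; `SharedStore`, `handle_client`, and `serve_clients` are hypothetical names, and a real server would call something like `handle_client` after accepting a connection rather than in a fixed loop.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared store: every access takes the mutex.
class SharedStore {
public:
    // Appends a title and returns its 1-based ID.
    int add(const std::string& title) {
        std::lock_guard<std::mutex> lock(mutex_);
        titles_.push_back(title);
        return static_cast<int>(titles_.size());
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return titles_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::string> titles_;
};

// Stand-in for a per-client handler: issues a fixed number of add requests.
void handle_client(SharedStore& store, int client_id, int requests) {
    assert(requests >= 0);
    for (int i = 0; i < requests; ++i) {
        store.add("client " + std::to_string(client_id) + " doc " + std::to_string(i));
    }
}

// Starts one thread per simulated client, then waits for all of them.
void serve_clients(SharedStore& store, int clients, int requests_each) {
    std::vector<std::thread> workers;
    for (int c = 0; c < clients; ++c) {
        workers.emplace_back(handle_client, std::ref(store), c, requests_each);
    }
    for (std::thread& w : workers) w.join();
}
```

Without the `std::lock_guard`, concurrent `push_back` calls on the same vector would be a data race.
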
- `ADD|title|content`
- `SEARCH|mode|query` where `mode` is `AND` or `OR`
- `PING` (health probe, returns `PONG`)
- `QUIT`
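
One way to parse these pipe-delimited commands is sketched below. The `Command` struct and the function names are hypothetical, and two behaviors are assumptions rather than documented protocol rules: the line terminator has already been stripped, and any extra `|` stays inside the last field.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class CommandType { Add, Search, Ping, Quit, Invalid };

struct Command {
    CommandType type = CommandType::Invalid;
    std::vector<std::string> args;  // ADD: {title, content}; SEARCH: {mode, query}
};

// Splits on '|' into at most max_parts fields; the last field keeps any remaining '|'.
std::vector<std::string> split_fields(const std::string& line, std::size_t max_parts) {
    assert(max_parts >= 1);
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (parts.size() + 1 < max_parts) {
        std::size_t bar = line.find('|', start);
        if (bar == std::string::npos) break;
        parts.push_back(line.substr(start, bar - start));
        start = bar + 1;
    }
    parts.push_back(line.substr(start));
    return parts;
}

// Parses one request line; anything malformed comes back as Invalid.
Command parse_command(const std::string& line) {
    Command cmd;
    if (line == "PING") { cmd.type = CommandType::Ping; return cmd; }
    if (line == "QUIT") { cmd.type = CommandType::Quit; return cmd; }

    std::vector<std::string> f = split_fields(line, 3);
    if (f.size() != 3) return cmd;

    if (f[0] == "ADD") {
        cmd.type = CommandType::Add;
        cmd.args = {f[1], f[2]};
    } else if (f[0] == "SEARCH" && (f[1] == "AND" || f[1] == "OR")) {
        cmd.type = CommandType::Search;
        cmd.args = {f[1], f[2]};
    }
    return cmd;
}
```

A server built this way could answer `Invalid` commands with an `ERROR|message` line.
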
Server also supports browser usage on the same port.
- Open: `http://localhost:8080/`
- Health endpoint: `GET /health`
- Search API: `GET /api/search?q=network+thread&mode=OR`
This works alongside the existing TCP client protocol.
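
To serve the search API, the server has to pull `q` and `mode` out of the request target. The following is a minimal sketch of query-string parsing with standard URL decoding (`+` becomes a space, `%XX` becomes a byte); `url_decode` and `parse_query` are hypothetical names, not functions confirmed to exist in this project.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Decodes '+' as a space and %XX as the byte with that hex value.
std::string url_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size() &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            out += in[i];  // pass through anything else unchanged
        }
    }
    return out;
}

// Extracts key=value pairs from the part of the target after '?'.
std::map<std::string, std::string> parse_query(const std::string& target) {
    std::map<std::string, std::string> params;
    std::size_t qmark = target.find('?');
    if (qmark == std::string::npos) return params;

    std::size_t start = qmark + 1;
    while (start < target.size()) {
        std::size_t amp = target.find('&', start);
        if (amp == std::string::npos) amp = target.size();
        std::string pair = target.substr(start, amp - start);
        if (!pair.empty()) {
            std::size_t eq = pair.find('=');
            if (eq == std::string::npos) {
                params[url_decode(pair)] = "";
            } else {
                params[url_decode(pair.substr(0, eq))] = url_decode(pair.substr(eq + 1));
            }
        }
        start = amp + 1;
    }
    return params;
}
```

For the example URL above, this yields `q` = `network thread` and `mode` = `OR`, which map directly onto a `SEARCH|OR|network thread` command.
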
Response examples:
- `OK|Document added with ID 3`
- `PONG`
- `RESULTS|2`
- `DOC|3|Title|Content`
- `END`
- `ERROR|message`
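
A search response in this shape could be assembled as in the sketch below. The `Document` struct and both function names are hypothetical. It also assumes that each response line ends with `\n` and that the number after `RESULTS|` is the count of `DOC` lines that follow; neither detail is stated above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical document record.
struct Document {
    int id;
    std::string title;
    std::string content;
};

// Builds "RESULTS|<count>", one "DOC|id|title|content" line per match, then "END".
std::string format_results(const std::vector<Document>& docs) {
    std::string out = "RESULTS|" + std::to_string(docs.size()) + "\n";
    for (const Document& d : docs) {
        assert(d.id >= 0);
        out += "DOC|" + std::to_string(d.id) + "|" + d.title + "|" + d.content + "\n";
    }
    out += "END\n";
    return out;
}

// Builds a single-line error response.
std::string format_error(const std::string& message) {
    return "ERROR|" + message + "\n";
}
```

The trailing `END` line lets a client stop reading without having to trust the count.
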
Example searches:
- `SEARCH|OR|network thread`
- `SEARCH|AND|database indexing`
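
The difference between the two modes comes down to set operations on posting lists: `OR` takes the union of the document IDs for each term, while `AND` takes the intersection. Below is a minimal sketch using `std::set_union` and `std::set_intersection`; the `match_documents` function and the `Index` alias are hypothetical, and terms are assumed to be already tokenized.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Index = std::unordered_map<std::string, std::vector<int>>;

// Returns sorted document IDs matching all terms (require_all) or any term.
std::vector<int> match_documents(const Index& index,
                                 const std::vector<std::string>& terms,
                                 bool require_all) {
    std::vector<int> result;
    bool first = true;
    for (const std::string& term : terms) {
        auto it = index.find(term);
        std::vector<int> postings = (it == index.end()) ? std::vector<int>{} : it->second;
        // The set algorithms need sorted, duplicate-free input.
        std::sort(postings.begin(), postings.end());
        postings.erase(std::unique(postings.begin(), postings.end()), postings.end());

        if (first) {
            result = postings;
            first = false;
            continue;
        }
        std::vector<int> merged;
        if (require_all) {
            std::set_intersection(result.begin(), result.end(),
                                  postings.begin(), postings.end(),
                                  std::back_inserter(merged));
        } else {
            std::set_union(result.begin(), result.end(),
                           postings.begin(), postings.end(),
                           std::back_inserter(merged));
        }
        result = std::move(merged);
    }
    assert(std::is_sorted(result.begin(), result.end()));
    return result;
}
```

In `AND` mode a single unknown term empties the result, since its posting list is empty.
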
    cmake -S . -B build
    cmake --build build --config Release

If Boost/Abseil are installed and discoverable, CMake enables them automatically.
Build and run with Docker Compose:

    docker compose up -d --build

Check logs:

    docker compose logs -f search-server

Stop service:

    docker compose down

For complete free production deployment on Oracle VM, see:
Helper scripts:
Production details:
- Server listens on `0.0.0.0:8080`
- Host port mapping: `8080:8080`
- Persistent storage mounted via volume: `./data:/app/data`
- Restart policy: `unless-stopped`
- Healthcheck uses `PING`/`PONG` over TCP to verify server responsiveness
- Start server: `.\build\Release\server.exe`
- In another terminal, start client: `.\build\Release\client.exe`

Documents are stored in: `data/documents.db`
Each line format:
id<TAB>title<TAB>content
Starter dataset is preloaded in data/documents.db for quick demo.
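
Reading and writing the `id<TAB>title<TAB>content` line format could look roughly like the sketch below. The `Document` struct and the `to_line`/`from_line` names are hypothetical. The sketch assumes titles and content contain no tabs or newlines; how the project handles those characters is not described here.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical document record.
struct Document {
    int id;
    std::string title;
    std::string content;
};

// Serializes one document as "id<TAB>title<TAB>content".
std::string to_line(const Document& doc) {
    assert(doc.title.find('\t') == std::string::npos);  // assumption: no tabs in the title
    return std::to_string(doc.id) + '\t' + doc.title + '\t' + doc.content;
}

// Parses one stored line; returns std::nullopt if the line is malformed.
std::optional<Document> from_line(const std::string& line) {
    std::size_t t1 = line.find('\t');
    if (t1 == std::string::npos) return std::nullopt;
    std::size_t t2 = line.find('\t', t1 + 1);
    if (t2 == std::string::npos) return std::nullopt;

    int id = 0;
    const char* begin = line.data();
    const char* end = line.data() + t1;
    auto [ptr, ec] = std::from_chars(begin, end, id);
    // Reject empty, non-numeric, or partially numeric ID fields.
    if (ec != std::errc() || ptr != end) return std::nullopt;

    return Document{id, line.substr(t1 + 1, t2 - t1 - 1), line.substr(t2 + 1)};
}
```

Loading the file then amounts to calling `from_line` on each line and skipping any that fail to parse.
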