Lore Compendium is an AI-powered Discord bot that can answer questions about your documents. Whether you have lore books, PDFs, Word documents, or spreadsheets, the bot searches through them and provides intelligent, cited answers — entirely locally, with no cloud services.
Supported file formats: .docx, .pdf, .xlsx, .csv, .txt, .md
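Each format above would typically be routed to a format-specific loader by extension. A minimal sketch of that dispatch — the loader names here are illustrative, not the bot's actual classes:

```python
from pathlib import Path

# Hypothetical extension → loader mapping; the bot's real loader names may differ.
LOADERS = {
    ".txt": "TextLoader",
    ".md": "TextLoader",
    ".pdf": "PDFLoader",
    ".docx": "DocxLoader",
    ".xlsx": "SpreadsheetLoader",
    ".csv": "SpreadsheetLoader",
}

def pick_loader(filename: str) -> str:
    """Return the loader name for a supported file, or raise for anything else."""
    ext = Path(filename).suffix.lower()
    if ext not in LOADERS:
        raise ValueError(f"Unsupported file type: {ext}")
    return LOADERS[ext]
```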
- Python 3.13 — python.org
- On Windows: check "Add Python to PATH" during installation
- Ollama — ollama.com
- Windows: run the installer
- Mac: `brew install ollama` or download from the website
- Linux: `curl -fsSL https://ollama.com/install.sh | sh`
- Go to the Discord Developer Portal
- Click New Application and give it a name
- Go to the Bot section → click Reset Token and copy it
- Enable Message Content Intent under Privileged Gateway Intents
- Go to OAuth2 → URL Generator, select `bot`, choose permissions, and invite it to your server
Windows: double-click setup.bat
Mac/Linux: ./setup.sh
This installs all libraries, downloads AI models (10–20 min), and walks you through configuration.
Put your files in the input/ folder.
Windows: double-click start.bat
Mac/Linux: ./start.sh
Your bot is now running. Go to Discord and start asking questions!
| Command | Description |
|---|---|
| `/lore <query>` | Search all indexed documents and generate an answer |
| `/ask <filename> <query>` | Search within a specific document (filename autocompletes) |
| `/search <query> [filename]` | Preview raw matching chunks without LLM generation |
| `/status` | Show which documents are currently indexed |
| `/reindex` | Force a re-scan of the `input/` folder |
| `/help` | Show all commands |
Conversational mode: DM the bot or @mention it in a channel for a free-form conversation. The bot automatically searches your documents when relevant.
Drag-and-drop ingestion: Drop a supported file into any channel and the bot saves and indexes it automatically, sending a follow-up message when indexing is complete.
- Hybrid search — combines vector (semantic) search with BM25 keyword search, merged via Reciprocal Rank Fusion, so both meaning and exact terms are matched
- Multi-query retrieval — generates 2 alternative phrasings of every query to improve recall when document wording differs from the question
- Context expansion — retrieved chunks are expanded with their neighbours before being sent to the LLM, providing richer surrounding context
- Cross-encoder reranking (optional) — if a rerank model is configured, retrieved chunks are reordered by a cross-encoder before generation for higher precision
- Semantic chunking — documents are split at topic boundaries rather than fixed character counts, keeping related content together; falls back to character-based splitting for very long sentences
- Source citations — every answer includes a numbered Sources footer with filename and page/line location
- Answer faithfulness check — after generation, a fast model verifies all claims are grounded in the retrieved sources; a warning is shown if not
- Conversation memory — `/lore` retains the last 3 Q&A turns per user so follow-up questions are understood in context
- Query result caching — identical queries return instantly from an in-memory cache (1-hour TTL, auto-invalidated when documents change)
- Live sync — a Watchdog observer detects new, changed, or deleted files in `input/` and updates the index automatically
- Parallel ingestion — multiple documents are loaded concurrently via `ProcessPoolExecutor`
- Duplicate detection — content-identical files uploaded via Discord are flagged before indexing
- Tag filtering — documents can be tagged and queries can be scoped to a specific tag
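The Reciprocal Rank Fusion step in the hybrid search above can be sketched in a few lines. This is a generic RRF merge, not the bot's exact code; `k = 60` is the commonly used default constant, not necessarily the bot's setting:

```python
from collections import defaultdict

def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each document scores 1 / (k + rank) in every list it appears in, so a
    chunk ranked well by either vector search or BM25 floats to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Vector search and BM25 disagree; RRF rewards the chunk both rank highly.
vector_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]
print(rrf_merge([vector_hits, bm25_hits]))  # → ['c1', 'c3', 'c9', 'c7']
```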
A local web interface (FastAPI + HTMX) runs alongside the bot:
- Chunk Explorer — browse indexed documents and inspect individual chunks
- Similarity Search — test retrieval queries with scored results and snippets
- Settings — edit model names and bot personality without touching `config.json` directly
Two separate LangGraph workflows:
RAG Pipeline (document_engine.py)
retrieve → grade_documents → [generate_rag | transform_query]
- retrieve: runs up to 3 query variants through MMR vector search + BM25, merges with RRF, optionally reranks
- grade_documents: filters irrelevant chunks in parallel using `fast_llm`
- transform_query: rewrites the query with `fast_llm` if no relevant chunks are found (up to 3 retries)
- generate_rag: expands chunk context, prepends conversation history, streams the answer, checks faithfulness, appends a Sources footer
Conversational Agent (conversation.py)
A LangGraph ReAct agent with per-user MemorySaver history. The search_documents tool delegates to the RAG pipeline when the LLM decides retrieval is needed.
- Python 3.13
- Ollama running (`ollama serve`)
git clone <repository-url>
cd LoreCompendium
python -m venv .venv
.venv\Scripts\activate # Windows
source .venv/bin/activate # Mac/Linux
pip install -r requirements.txt
cd modelfiles
ollama create -f gpt-oss-20b-modelfile.txt gpt-oss
ollama create -f llama3.2-modelfile.txt llama3.2
ollama pull mxbai-embed-large
cd ..

Create `config.json` in the project root:
{
"discord_bot_token": "your_token_here",
"role_description": "Bot personality description",
"thinking_ollama_model": "gpt-oss",
"fast_ollama_model": "llama3.2",
"embedding_model": "mxbai-embed-large",
"rerank_model": ""
}

`rerank_model` is optional. Set it to a model like `mxbai-rerank-large` (requires `ollama pull mxbai-rerank-large`) to enable cross-encoder reranking. Leave blank to skip.
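A config loader would typically treat the empty `rerank_model` string as "reranking disabled". A minimal sketch, assuming only the schema shown above (the bot's actual config handling may differ):

```python
import json
from pathlib import Path

def load_config(path: str = "config.json") -> dict:
    """Read config.json and normalise the optional rerank setting."""
    cfg = json.loads(Path(path).read_text(encoding="utf-8"))
    # An empty or missing rerank_model means cross-encoder reranking is off.
    cfg["rerank_enabled"] = bool(cfg.get("rerank_model"))
    return cfg
```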
All settings can also be edited at runtime via the web UI Settings page.
python discord_main.py

On startup the bot runs an Ollama health check, initialises the vector store, and begins indexing any new or changed documents.
python web_app.py

Accessible at http://localhost:8000 by default.
.venv/Scripts/python -m pytest tests/ # Windows
.venv/bin/python -m pytest tests/   # Mac/Linux

| Path | Contents |
|---|---|
| `input/` | Drop documents here for indexing |
| `chroma_store/` | Persisted ChromaDB vector store |
| `chroma_store/indexed_files.txt` | JSON manifest tracking indexed files with mtime/size/hash |
| `config.json` | Runtime configuration (not committed) |
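The mtime/size/hash triple in the manifest is what makes change detection and duplicate detection cheap: mtime and size catch edits without reading file contents, while the content hash catches identical files uploaded under different names. A sketch of how such an entry could be built (field names are illustrative, not the bot's exact schema):

```python
import hashlib
import os

def manifest_entry(path: str) -> dict:
    """Fingerprint a file for a manifest like indexed_files.txt."""
    stat = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {"mtime": stat.st_mtime, "size": stat.st_size, "hash": digest}
```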
See repository for license information.
Contributions are welcome. Please open an issue or pull request.