Multi-Datalake RAG Indexer with Local MCP Integration
- Overview
- Features
- Installation
- Quick Start
- Configuration
- Vector Store Configuration
- CLI Commands
- Integration
- Development
- Contributing
- License
- Support
Prometh Cortex is a local-first, extensible system for indexing multiple datalake repositories containing Markdown files and exposing their content for retrieval-augmented generation (RAG) workflows through a local MCP (Model Context Protocol) server.
- Multi-Datalake Support: Index multiple repositories of Markdown documents
- YAML Frontmatter Parsing: Rich metadata extraction with structured schema support
- Dual Vector Store Support: Choose between local FAISS or cloud-native Qdrant
- Incremental Indexing: Smart change detection for efficient updates
- MCP Server: Local server with stdio, SSE, and streamable HTTP transports for Claude, OpenCode, VSCode, and other tools
- CLI Interface: Easy-to-use command line tools for indexing and querying
- Performance Optimized: Target <100ms query response time on M1/M2 Mac
Install from PyPI:

```bash
pip install prometh-cortex
```

Or install from source:

```bash
git clone https://github.com/prometh-sh/prometh-cortex.git
cd prometh-cortex
pip install -e ".[dev]"
```

1. Install via pip:

```bash
pip install prometh-cortex
```

2. Initialize configuration (creates `~/.config/prometh-cortex/config.toml`):

```bash
pcortex config --init
```

3. Edit your config file:

```bash
# macOS/Linux
nano ~/.config/prometh-cortex/config.toml
# Or use your preferred editor
# Update the [datalake] repos with your document paths
```

4. Build index:

```bash
pcortex build
```

5. Query locally:

```bash
pcortex query "search for something"
```

6. Start servers:

```bash
# For Claude Desktop (MCP protocol)
pcortex mcp

# For Perplexity/VSCode/HTTP integrations
pcortex serve
```

Prometh Cortex follows the XDG Base Directory Specification. Config files are searched in this order:

1. `./config.toml` - Current directory (highest priority)
2. `~/.config/prometh-cortex/config.toml` - XDG config directory (recommended)
3. `~/.prometh-cortex/config.toml` - Fallback location
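The lookup order can be sketched as a small resolver. This is an illustrative sketch of the documented search order, not the actual prometh-cortex implementation:

```python
from pathlib import Path
from typing import Optional

# Search order described above; illustrative sketch, not the real implementation.
SEARCH_PATHS = [
    Path("config.toml"),                                         # current directory (highest priority)
    Path.home() / ".config" / "prometh-cortex" / "config.toml",  # XDG config directory
    Path.home() / ".prometh-cortex" / "config.toml",             # fallback location
]


def resolve_config() -> Optional[Path]:
    """Return the first existing config file, or None if none is found."""
    for path in SEARCH_PATHS:
        if path.is_file():
            return path
    return None
```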
Useful commands:

```bash
# Initialize config in XDG directory (recommended for system-wide use)
pcortex config --init

# Create sample config in current directory (for project-specific config)
pcortex config --sample

# Show all config search paths
pcortex config --show-paths
```

Create a `config.toml` file with your settings:

```bash
cp config.toml.sample config.toml
# Edit config.toml with your specific paths and settings
```

Prometh Cortex v0.5.0 builds on the unified collection architecture with memory preservation during force rebuilds. A single FAISS/Qdrant index contains all documents plus an auto-injected virtual `prmth_memory` source for session summaries and decisions.
```toml
[datalake]
# Add your document directories here
repos = [
    "/path/to/your/notes",
    "/path/to/your/documents",
    "/path/to/your/projects"
]

[storage]
rag_index_dir = "/path/to/index/storage"

[server]
port = 8080
host = "localhost"
auth_token = "your-secure-token"
transport = "stdio"  # "stdio", "sse", or "streamable-http" (v0.4.0+)

[embedding]
model = "sentence-transformers/all-MiniLM-L6-v2"
max_query_results = 10

# Single unified collection
[[collections]]
name = "prometh_cortex"

# Multiple sources with per-source chunking parameters
[[sources]]
name = "knowledge_base"
chunk_size = 768
chunk_overlap = 76
source_patterns = ["docs/specs", "docs/prds"]

[[sources]]
name = "meetings"
chunk_size = 512
chunk_overlap = 51
source_patterns = ["meetings"]

[[sources]]
name = "todos"
chunk_size = 256
chunk_overlap = 26
source_patterns = ["todos", "reminders"]

[[sources]]
name = "default"
chunk_size = 512
chunk_overlap = 50
source_patterns = ["*"]  # Catch-all for unmatched documents

# Virtual memory source (v0.5.0+)
# Auto-injected; no file-based routing
# Stores session summaries, decisions, patterns
# Preserved during force rebuilds
[[sources]]
name = "prmth_memory"
chunk_size = 512
chunk_overlap = 50
source_patterns = [".prmth_memory"]  # Virtual pattern (won't match real files)

[vector_store]
type = "faiss"  # or "qdrant"

# Qdrant configuration (when type = "qdrant")
[vector_store.qdrant]
host = "localhost"
port = 6333
collection_name = "prometh_cortex"
```

Key Features in v0.5.0:
- ✅ Memory preservation during `pcortex build --force` and `pcortex rebuild`
- ✅ Virtual `prmth_memory` source auto-injected into all configs
- ✅ Session memories queryable immediately (no rebuild needed)
- ✅ Deduped memories by content hash (idempotent)
- ✅ Works with both FAISS (sidecar JSON) and Qdrant (filter-based deletion)
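Content-hash deduplication can be pictured with a short sketch: deriving a deterministic document ID from title plus content makes storing the same memory twice a no-op. The hashing scheme and ID prefix below are illustrative assumptions, not prometh-cortex's actual internal format:

```python
import hashlib

def memory_doc_id(title: str, content: str) -> str:
    """Derive a deterministic ID from title + content (illustrative scheme)."""
    digest = hashlib.sha256(f"{title}\n{content}".encode("utf-8")).hexdigest()
    return f"prmth_memory:{digest[:16]}"

store = {}

def store_memory(title: str, content: str) -> str:
    doc_id = memory_doc_id(title, content)
    store[doc_id] = content  # same title + content overwrites itself: idempotent
    return doc_id

# Storing the same memory twice yields the same ID and a single stored document
a = store_memory("Session: Review", "Decided on event-driven architecture.")
b = store_memory("Session: Review", "Decided on event-driven architecture.")
assert a == b and len(store) == 1
```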
Key Changes from v0.4.0:
- ✅ Memory tool now preserves session data across force rebuilds
- ✅ Better incremental indexing for memory-heavy workflows
- ✅ Improved metadata tracking for memory documents
Key Changes from v0.3.0:
- ✅ SSE/HTTP Transport for daemon mode
- ✅ OpenCode first-class support
- ✅ Auto config generation for all clients
- ✅ Memory tool for session persistence
Prometh Cortex supports two vector store backends:
Best for: Local development, private deployments, no external dependencies
```toml
[vector_store]
type = "faiss"

[storage]
rag_index_dir = ".rag_index"
```

Advantages:
- ✅ No external dependencies
- ✅ Fast local queries
- ✅ Works offline
- ✅ Simple setup
Disadvantages:
- ❌ Limited to single machine
- ❌ No concurrent write access
- ❌ Manual backup required
Best for: Production deployments, team collaboration, scalable solutions
```bash
# Start Qdrant container with persistent storage
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant
```

Local Qdrant configuration:

```toml
[vector_store]
type = "qdrant"

[vector_store.qdrant]
host = "localhost"
port = 6333
collection_name = "prometh_cortex"
```

Qdrant Cloud configuration:

```toml
[vector_store]
type = "qdrant"

[vector_store.qdrant]
host = "your-cluster.qdrant.io"
port = 6333
collection_name = "prometh_cortex"
api_key = "your-api-key-here"
use_https = true
```

Advantages:
- ✅ Concurrent access support
- ✅ Built-in clustering and replication
- ✅ Advanced filtering capabilities
- ✅ REST API access
- ✅ Automatic backups (cloud)
- ✅ Horizontal scaling
Disadvantages:
- ❌ Requires external service
- ❌ Network dependency
- ❌ Additional complexity
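Before building against Qdrant, a quick preflight request can confirm the service is reachable and list existing collections. A standard-library sketch (the `/collections` endpoint and response envelope follow Qdrant's public REST API; host and port assume a local deployment):

```python
import json
import urllib.request


def collections_url(host: str = "localhost", port: int = 6333) -> str:
    """Build the Qdrant REST URL for listing collections."""
    return f"http://{host}:{port}/collections"


def parse_collections(payload: dict) -> list:
    """Extract collection names from Qdrant's response envelope."""
    # Qdrant wraps results as {"result": {"collections": [{"name": ...}]}, "status": "ok"}
    return [c["name"] for c in payload["result"]["collections"]]


def qdrant_preflight(host: str = "localhost", port: int = 6333) -> list:
    """Return existing collection names, or raise if Qdrant is unreachable."""
    with urllib.request.urlopen(collections_url(host, port), timeout=5) as resp:
        return parse_collections(json.load(resp))
```

Once the index has been built, `qdrant_preflight()` should include `"prometh_cortex"` in its result.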
1. **Local Docker Setup**:

```bash
# Create persistent storage directory
mkdir -p qdrant_storage

# Start Qdrant container
docker run -d \
  --name qdrant \
  --restart unless-stopped \
  -p 6333:6333 \
  -p 6334:6334 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant

# Verify Qdrant is running
curl http://localhost:6333/health
```

2. **Configure Environment**:

```toml
[vector_store]
type = "qdrant"

[vector_store.qdrant]
host = "localhost"
port = 6333
collection_name = "prometh_cortex"
```

3. **Build Index**:

```bash
# Initial build or incremental update
pcortex build

# Force complete rebuild
pcortex rebuild --confirm
```

4. **Verify Setup**:

```bash
# Check health and statistics
pcortex query "test" --max-results 1

# Or directly check Qdrant
curl http://localhost:6333/collections/prometh_cortex
```
1. **Create Qdrant Cloud Account**:
   - Visit Qdrant Cloud
   - Create a cluster and get your credentials

2. **Configure Environment**:

```toml
[vector_store]
type = "qdrant"

[vector_store.qdrant]
host = "your-cluster-id.qdrant.io"
port = 6333
collection_name = "prometh_cortex"
api_key = "your-api-key"
use_https = true
```
#### Migration Between Vector Stores

```bash
# Backup current index (if using FAISS)
pcortex build --backup /tmp/backup_$(date +%Y%m%d_%H%M%S)

# Change vector store type in config.toml
sed -i 's/type = "faiss"/type = "qdrant"/' config.toml

# Rebuild index with new vector store
pcortex rebuild --confirm

# Verify migration successful
pcortex query "test migration" --max-results 1
```

Build and rebuild:

```bash
# Initial build (automatic incremental updates)
pcortex build

# Force complete rebuild (ignores incremental changes)
pcortex build --force

# Disable incremental indexing
pcortex build --no-incremental

# Rebuild entire index (with confirmation)
pcortex rebuild
pcortex rebuild --confirm  # Skip confirmation prompt
```

Query:

```bash
# Query across all sources in unified collection
pcortex query "search term"

# Query with source filtering (optional)
pcortex query "meeting notes" --source meetings
pcortex query "action items" -s todos

# Query with options
pcortex query "search term" --max-results 5 --show-content
```

Sources:

```bash
# List all configured sources with statistics
pcortex sources

# Verbose output with chunk configuration details
pcortex sources -v
```

MCP server:

```bash
# Start MCP server with stdio protocol (default)
pcortex mcp start

# Start as persistent SSE daemon (v0.4.0+)
pcortex mcp start --transport sse --port 3100

# SSE on all interfaces (for Tailscale/remote access)
pcortex mcp start -t sse --host 0.0.0.0 -p 3100

# Streamable HTTP transport (newer MCP spec)
pcortex mcp start -t streamable-http --port 3100
```

Client config generation:

```bash
# Generate config for various clients
pcortex mcp init claude            # Claude Desktop (stdio)
pcortex mcp init opencode          # OpenCode (stdio)
pcortex mcp init opencode --write  # Write directly to config file

# Generate SSE client configs (for daemon mode)
pcortex mcp init claude -t sse    # Claude Desktop (SSE)
pcortex mcp init opencode -t sse  # OpenCode (SSE)
pcortex mcp init opencode -t sse --url http://mac-mini.tail:3100  # Remote SSE
```

HTTP server:

```bash
# Start HTTP server (default: localhost:8080)
pcortex serve

# Custom host/port
pcortex serve --host 0.0.0.0 --port 9000

# Development mode with auto-reload
pcortex serve --reload
```

For Claude Desktop, OpenCode, Claude Code, and other MCP clients.
Provides MCP tools with configurable transport (v0.4.0+):
- stdio (default): Subprocess per client session, suitable for Claude Desktop
- sse: Persistent daemon with Server-Sent Events, shared across multiple clients
- streamable-http: Newer MCP spec HTTP transport (v0.5.0+)
MCP Tools:
- `prometh_cortex_query`: Search unified index with optional `source_type` filtering
- `prometh_cortex_list_sources`: List all sources with statistics (v0.3.0+)
- `prometh_cortex_health`: Get system health status and unified collection metrics
- `prometh_cortex_memory`: Store session summaries, decisions, patterns directly to index (v0.5.0+)
Store and query session insights without rebuilding the entire index.
Purpose: Capture high-value knowledge from agent sessions (OpenCode, Claude Desktop) and make it immediately searchable across your knowledge base.
Key Features:
- ✅ Immediate Availability: Documents queryable right after creation (no rebuild needed)
- ✅ Automatic Deduplication: Same title + content = same document ID (idempotent)
- ✅ Memory Preservation: Memories survive `pcortex build --force` and `pcortex rebuild`
- ✅ Metadata Rich: Store tags, session IDs, project references, custom metadata
- ✅ Virtual Source: Auto-injected `prmth_memory` source (no file-based routing)
Parameters:
```json
{
  "title": "string (required) - Document title for search",
  "content": "string (required) - Markdown body (Content, Decisions, Patterns, etc.)",
  "tags": ["array of strings (optional) - e.g., 'kubernetes', 'incident', 'session'"],
  "metadata": {
    "source_project": "string (optional) - Project or context",
    "author": "string (optional) - Author/agent name",
    "session_id": "string (optional) - Session identifier",
    "custom_field": "any (optional) - Custom metadata"
  }
}
```

Usage Example (Claude Desktop / OpenCode):
User: "Save this session summary to memory"
Agent Response:
```python
prometh_cortex_memory(
    title="Session: Microservices Architecture Review - 2026-04-20",
    content="""## Summary
Reviewed and documented the microservices architecture decisions for the platform migration project.

## Decisions Made
- Use event-driven architecture for service communication
- Implement circuit breaker pattern for resilience
- Store session state in distributed cache (Redis/Memcached)

## Lessons Learned
- Service mesh complexity grows with cluster size
- Proper monitoring critical before production deployment
- Version compatibility matrix must be maintained

## Next Steps
- Document API contracts for all services
- Set up distributed tracing infrastructure
- Schedule follow-up architecture review in 2 weeks
""",
    tags=["session", "architecture", "microservices"],
    metadata={
        "session_id": "sess_arch_review_2026_04_20",
        "project": "platform-migration",
        "version": "v0.5.0"
    }
)
```
Query Memory Documents:
```bash
# Query across memory documents only
pcortex query "circuit breaker pattern" --source prmth_memory

# Query everywhere (memories + other sources)
pcortex query "architecture decisions"

# Via HTTP API
curl -X POST http://localhost:8001/prometh_cortex_query \
  -H "Authorization: Bearer your-token" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "session decisions and lessons learned",
    "source_type": "prmth_memory",
    "max_results": 5
  }'
```

Force Rebuild with Memory Preservation:
```bash
# Both commands now preserve memories (v0.5.0+)
pcortex build --force
pcortex rebuild --confirm

# Verify memories still accessible after rebuild
pcortex query "architecture decisions" --source prmth_memory
```

Memory Workflow (Typical Session):
1. During Session: Capture decisions/patterns via the `prometh_cortex_memory()` tool
2. Immediately Queryable: Ask "What did we decide about X?" → searches memory
3. Force Rebuild: Run `pcortex build --force` when source docs change
4. Memories Preserved: Session insights survive the rebuild
5. Long-term KB: Export memories to permanent documents when needed
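Conceptually, the preservation step in this workflow is snapshot, rebuild, reinsert: memory documents are not file-backed, so they are copied aside before the index is dropped and merged back afterwards. A simplified sketch of that idea (the real FAISS sidecar-JSON and Qdrant filter-based mechanics differ):

```python
def rebuild_with_memory_preservation(index: dict, source_docs: dict) -> dict:
    """Conceptual sketch: force-rebuild an index while keeping prmth_memory docs."""
    # 1. Snapshot memory documents before dropping the index
    memories = {doc_id: doc for doc_id, doc in index.items()
                if doc.get("source") == "prmth_memory"}
    # 2. Rebuild from source files (memories are not file-backed, so they'd be lost)
    rebuilt = dict(source_docs)
    # 3. Re-insert the snapshot
    rebuilt.update(memories)
    return rebuilt

index = {
    "doc1": {"source": "meetings", "text": "old chunk"},
    "mem1": {"source": "prmth_memory", "text": "decision: use SSE"},
}
new_index = rebuild_with_memory_preservation(
    index, {"doc1": {"source": "meetings", "text": "new chunk"}}
)
assert "mem1" in new_index and new_index["doc1"]["text"] == "new chunk"
```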
For Perplexity, VSCode, web integrations
```
POST /prometh_cortex_query
```

```json
{
  "query": "search term or question",
  "max_results": 10,
  "source_type": "meetings",
  "filters": {
    "datalake": "notes",
    "tags": ["work", "project"]
  }
}
```

`source_type` is optional and filters by source (v0.3.0+).

```
GET /prometh_cortex_sources
```
Returns all configured sources with:
- Source names and chunking parameters
- Source patterns for document routing
- Document count per source
- Total documents in unified index
```json
{
  "collection_name": "prometh_cortex",
  "sources": [
    {
      "name": "knowledge_base",
      "chunk_size": 768,
      "chunk_overlap": 76,
      "source_patterns": ["docs/specs", "docs/prds"],
      "document_count": 145
    },
    {
      "name": "meetings",
      "chunk_size": 512,
      "chunk_overlap": 51,
      "source_patterns": ["meetings"],
      "document_count": 89
    }
  ],
  "total_sources": 2,
  "total_documents": 412
}
```

```
GET /prometh_cortex_health
```
Returns server status, unified collection metrics, and performance metrics.
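These endpoints can be called from Python as easily as from curl. A minimal standard-library client sketch (the base URL, port, and token are placeholders matching the examples in this README):

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8001"   # pcortex serve --port 8001
AUTH_TOKEN = "your-secret-token"


def build_query_body(text: str, max_results: int = 5,
                     source_type: Optional[str] = None) -> dict:
    """Assemble the request body documented for POST /prometh_cortex_query."""
    body = {"query": text, "max_results": max_results}
    if source_type is not None:
        body["source_type"] = source_type  # optional source filter (v0.3.0+)
    return body


def query(text: str, max_results: int = 5,
          source_type: Optional[str] = None) -> dict:
    """POST a query to the HTTP server and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}/prometh_cortex_query",
        data=json.dumps(build_query_body(text, max_results, source_type)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {AUTH_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

# Example (requires a running server):
# query("meeting notes", max_results=3, source_type="meetings")
```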
```yaml
---
title: Document Title
created: YYYY-MM-DDTHH:MM:SS
author: Author Name
category: #Category
tags:
  - #tag1
  - tag2
focus: Work
uuid: document-uuid
project:
  - name: Project Name
    uuid: project-uuid        # UUID preserved for document linking
reminder:
  - subject: Reminder Text
    uuid: reminder-uuid       # UUID preserved for document linking
    list: List Name
event:
  subject: Event Subject
  uuid: event-uuid            # UUID preserved for document linking
  shortUUID: MF042576B        # Short UUID also preserved
  organizer: Organizer Name
  attendees:
    - Attendee 1
    - Attendee 2
  location: Event Location
  start: YYYY-MM-DDTHH:MM:SS  # Event start time
  end: YYYY-MM-DDTHH:MM:SS    # Event end time
related:
  - Related Item 1
  - Related Item 2
---
```

Note on UUIDs for Document Linking:
- Project, reminder, and event UUIDs are preserved in vector store metadata
- These UUIDs enable cross-document linking and relationship queries
- Use these UUIDs to find related documents across your datalake
- Query by UUID: `event_uuid:B897515C-1BE9-41B6-8423-3988BE0C9E3E`
Safe (quoted):

```yaml
---
title: "[PRJ-0119] Add New Feature"  # Quoted because of brackets
author: "John O'Connor"              # Quoted because of apostrophe
tags:
  - "C#"            # Quoted because of hash symbol
  - "project-2024"  # Safe without quotes
category: "Work & Personal"          # Quoted because of ampersand
---
```

Problematic (unquoted):

```yaml
---
title: [PRJ-0119] Add New Feature  # Brackets will cause parsing errors
author: John O'Connor              # Apostrophe may cause issues
tags:
  - C#                             # Hash symbol conflicts with YAML comments
category: Work & Personal          # Ampersand may cause issues
---
```

Characters that require quoting:

- Square brackets `[]`: `title: "[PROJECT-123] Task Name"`
- Curly braces `{}`: `status: "{COMPLETED}"`
- Hash/Pound `#`: `tag: "C#"`
- Colon `:`: `note: "Time: 3:30 PM"`
- Ampersand `&`: `title: "Sales & Marketing"`
- Asterisk `*`: `priority: "*HIGH*"`
- Pipe `|`: `command: "grep | sort"`
- Greater/Less than `<>`: `comparison: "<100ms"`
- At symbol `@`: `email: "@company.com"`
- Apostrophes `'`: `name: "O'Connor"`
- Metadata Parsing: Improper YAML syntax prevents frontmatter from being extracted
- Index Quality: Missing metadata means poor search results and filtering
- Qdrant Storage: Malformed YAML leads to incomplete document payloads
- Search Performance: Documents without proper metadata are harder to find
Test your YAML frontmatter before indexing:
```bash
# Quick validation of a document
python -c "
import frontmatter

with open('your-document.md', 'r') as f:
    post = frontmatter.load(f)

print('✅ YAML parsed successfully')
print(f'Title: {post.metadata.get(\"title\", \"N/A\")}')
print(f'Fields: {list(post.metadata.keys())}')
"
```

Configure Claude Desktop by editing `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "prometh-cortex": {
      "command": "/path/to/your/project/.venv/bin/python",
      "args": ["-m", "prometh_cortex.cli.main", "mcp"],
      "env": {
        "DATALAKE_REPOS": "/path/to/your/notes,/path/to/your/documents,/path/to/your/projects",
        "RAG_INDEX_DIR": "/path/to/index/storage",
        "MCP_PORT": "8080",
        "MCP_HOST": "localhost",
        "MCP_AUTH_TOKEN": "your-secure-token",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
        "MAX_QUERY_RESULTS": "10",
        "CHUNK_SIZE": "512",
        "CHUNK_OVERLAP": "50",
        "VECTOR_STORE_TYPE": "faiss"
      }
    }
  }
}
```

Setup Steps:
1. **Install in Virtual Environment**:

```bash
cd /path/to/prometh-cortex
python -m venv .venv
source .venv/bin/activate  # On macOS/Linux
pip install -e .
```

2. **Configure Settings**: Create and customize your configuration:

```bash
# Create configuration file
cp config.toml.sample config.toml

# Edit config.toml with your specific paths and settings
# Update the [datalake] repos array with your document directories
# Set your preferred [storage] rag_index_dir location
# Customize [server] auth_token for security
```

3. **Build your index**:

```bash
source .venv/bin/activate
pcortex build --force
```

4. **Get Absolute Paths**: Update the MCP configuration with your actual paths:

```bash
# Get your virtual environment Python path
which python  # While .venv is activated

# Get your project directory
pwd
```

5. **Update Claude Desktop Config**: Use absolute paths in your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "prometh-cortex": {
      "command": "/path/to/your/project/.venv/bin/python",
      "args": ["-m", "prometh_cortex.cli.main", "mcp"],
      "env": {
        "DATALAKE_REPOS": "/path/to/your/notes,/path/to/your/documents,/path/to/your/projects",
        "RAG_INDEX_DIR": "/path/to/index/storage",
        "MCP_PORT": "8080",
        "MCP_HOST": "localhost",
        "MCP_AUTH_TOKEN": "your-secure-token",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
        "MAX_QUERY_RESULTS": "10",
        "CHUNK_SIZE": "512",
        "CHUNK_OVERLAP": "50"
      }
    }
  }
}
```

6. **Verify Configuration**:

```bash
# Test MCP server manually
source .venv/bin/activate
pcortex mcp  # Should start without errors
```

7. **Restart Claude Desktop**: Completely quit and restart the Claude Desktop application.
Troubleshooting:
- ✅ Check Logs: Look at Claude Desktop console logs for MCP connection errors
- ✅ Verify Paths: Ensure all paths in the config are absolute and correct
- ✅ Test Index: Run `pcortex query "test"` to verify your index works
- ✅ Environment: Make sure environment variables are accessible from the MCP context
Usage: After restarting Claude Desktop, you'll have access to these MCP tools:
- prometh_cortex_query: Search your indexed documents
- Ask: "Search my notes for yesterday's meetings"
- Ask: "Find documents about project planning"
- Ask: "What meetings did I have last week?"
- prometh_cortex_health: Check system status
- Ask: "How many documents are indexed in prometh-cortex?"
- Ask: "What's the health status of my knowledge base?"
Generate and install configuration automatically:
```bash
# Generate OpenCode config (prints to console)
pcortex mcp init opencode

# Write directly to ~/.config/opencode/opencode.json
pcortex mcp init opencode --write

# SSE mode (requires running daemon, see SSE Daemon Mode below)
pcortex mcp init opencode --transport sse
```

Manual Configuration: Add the `"mcp"` section to `~/.config/opencode/opencode.json`:
```json
{
  "mcp": {
    "prometh-cortex": {
      "type": "local",
      "command": ["/path/to/pcortex", "mcp", "start"],
      "environment": {
        "RAG_INDEX_DIR": "/path/to/index/storage",
        "VECTOR_STORE_TYPE": "qdrant",
        "QDRANT_HOST": "your-cluster.qdrant.io",
        "QDRANT_PORT": "6333",
        "QDRANT_COLLECTION_NAME": "prometh_cortex",
        "QDRANT_API_KEY": "your-api-key",
        "QDRANT_USE_HTTPS": "true",
        "MCP_AUTH_TOKEN": "your-secure-token",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
        "MAX_QUERY_RESULTS": "10"
      },
      "enabled": true,
      "timeout": 60000
    }
  }
}
```

Remote Mode (for persistent SSE daemon):
```json
{
  "mcp": {
    "prometh-cortex": {
      "type": "remote",
      "url": "http://127.0.0.1:3100/sse",
      "headers": {
        "Authorization": "Bearer your-secure-token"
      },
      "enabled": true,
      "timeout": 60000
    }
  }
}
```

Run Prometh Cortex as a persistent daemon instead of spawning a subprocess per client session (v0.4.0+). This gives you a single startup cost, shared Qdrant connections, and no duplicate vector index loads.
1. Start the daemon:

```bash
pcortex mcp start --transport sse --port 3100

# Or bind to all interfaces for Tailscale/remote access
pcortex mcp start -t sse --host 0.0.0.0 -p 3100
```

2. Configure clients to connect via SSE:

```bash
# Claude Code
claude mcp add --transport sse prometh-cortex http://127.0.0.1:3100/sse

# OpenCode
pcortex mcp init opencode -t sse --write

# Claude Desktop
pcortex mcp init claude -t sse --write

# Remote access (e.g., via Tailscale)
pcortex mcp init opencode -t sse --url http://mac-mini.tail:3100
```

3. (Optional) Run as a macOS launchd service:

Create `~/Library/LaunchAgents/sh.prometh.cortex-mcp.plist` for auto-start on boot with keepalive.
Configure Claude.ai to use your MCP server by adding it as a custom integration:
1. Start your MCP server: `pcortex serve`
2. Use the webhook URL: `http://localhost:8080/prometh_cortex_query`
3. Set the authentication header: `Authorization: Bearer your-secret-token`
4. Send queries in JSON format: `{ "query": "search term", "max_results": 10 }`
Configure Perplexity to use your local MCP server for document search:
Prerequisites:
1. **Start HTTP Server** (not MCP protocol):

```bash
source .venv/bin/activate
pcortex serve --port 8001  # Use different port than MCP
```

2. **Configure for Performance** (important for Perplexity timeouts): keep `max_results` low (see Performance Optimization below).

3. **Verify Health**:

```bash
curl -H "Authorization: Bearer your-secret-token" \
  http://localhost:8001/prometh_cortex_health
```
Integration Setup:
1. **Server Configuration**:
   - Protocol: `HTTP`
   - URL: `http://localhost:8001/prometh_cortex_query`
   - Method: `POST`
   - Headers: `Authorization: Bearer your-secret-token`
   - Content-Type: `application/json`

2. **Query Format**:

```json
{ "query": "your search query", "max_results": 3 }
```

3. **Example Request**:

```bash
curl -X POST http://localhost:8001/prometh_cortex_query \
  -H "Authorization: Bearer your-secret-token" \
  -H "Content-Type: application/json" \
  -d '{"query": "meeting notes", "max_results": 3}'
```
Performance Optimization:
- ✅ Reduced Results: Use `max_results: 3` instead of 10 to avoid timeouts
- ✅ Dedicated Port: Use a separate port (8001) for Perplexity vs other integrations
- ✅ Quick Queries: Response time optimized to <400ms for timeout compatibility
Usage in Perplexity:
- Ask: "Search my local documents for project updates"
- Ask: "Find my notes about last week's meetings"
- Ask: "What information do I have about [specific topic]?"
Configure VSCode to use your MCP server with GitHub Copilot:
1. **Install MCP for VSCode**:

```bash
# Install the VSCode MCP extension
code --install-extension ms-vscode.mcp
```

2. **Configure MCP Settings**: Add to your VSCode `settings.json` or create `.vscode/mcp.json`:

```json
{
  "mcpServers": {
    "prometh-cortex": {
      "command": "/path/to/your/project/.venv/bin/python",
      "args": ["-m", "prometh_cortex.cli.main", "mcp"],
      "env": {
        "DATALAKE_REPOS": "/path/to/your/notes,/path/to/your/documents,/path/to/your/projects",
        "RAG_INDEX_DIR": "/path/to/index/storage",
        "MCP_PORT": "8080",
        "MCP_HOST": "localhost",
        "MCP_AUTH_TOKEN": "your-secure-token",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
        "MAX_QUERY_RESULTS": "10",
        "CHUNK_SIZE": "512",
        "CHUNK_OVERLAP": "50"
      }
    }
  }
}
```

3. **Update User Settings**: Add to your VSCode `settings.json`:

```json
{
  "mcp.servers": {
    "prometh-cortex": {
      "enabled": true
    }
  }
}
```

4. **Verify Integration**:
   - Open Command Palette (`Cmd+Shift+P`)
   - Run "MCP: List Servers"
   - You should see "prometh-cortex" listed and active
Add to your VSCode settings.json:
```json
{
  "github.copilot.advanced": {
    "debug.useElectronPrompts": true,
    "debug.useNodeUserForPrompts": true
  },
  "prometh-cortex.server.url": "http://localhost:8001",
  "prometh-cortex.server.token": "your-secret-token"
}
```

Start the HTTP server:

```bash
source .venv/bin/activate
pcortex serve --port 8001
```

Create `.vscode/tasks.json` for quick queries:
```json
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Query Prometh-Cortex",
      "type": "shell",
      "command": "curl",
      "args": [
        "-H", "Authorization: Bearer your-secret-token",
        "-H", "Content-Type: application/json",
        "-d", "{\"query\": \"${input:searchQuery}\", \"max_results\": 5}",
        "http://localhost:8001/prometh_cortex_query"
      ],
      "group": "build",
      "presentation": {
        "echo": true,
        "reveal": "always",
        "panel": "new"
      }
    }
  ],
  "inputs": [
    {
      "id": "searchQuery",
      "description": "Enter your search query",
      "default": "meeting notes",
      "type": "promptString"
    }
  ]
}
```

Setup Steps:
1. **Build Index**: Ensure your RAG index is built and up-to-date:

```bash
source .venv/bin/activate
pcortex build --force
```

2. **Start MCP Server** (for Option 1):

```bash
# MCP server runs automatically when VSCode starts
# Check VSCode Output panel for MCP logs
```

3. **Start HTTP Server** (for Options 2-3):

```bash
source .venv/bin/activate
pcortex serve --port 8001
```
Usage:
- Option 1: Use MCP commands directly in GitHub Copilot chat
- Ask: "Search my documents for project planning notes"
- Ask: "Find my meeting notes from last week"
- Option 2: GitHub Copilot will automatically query your local documents
- Option 3: Press `Ctrl+Shift+P` → "Tasks: Run Task" → "Query Prometh-Cortex"
Troubleshooting:
- ✅ Check MCP Output: View "Output" panel in VSCode, select "MCP" from dropdown
- ✅ Verify Paths: Ensure all paths are absolute and accessible
- ✅ Test Manually: Run `pcortex mcp` or `pcortex serve` to verify functionality
- ✅ Restart VSCode: After configuration changes, restart VSCode completely
Two Server Types Available:
1. **MCP Protocol Server** (`pcortex mcp start`):
   - Purpose: AI assistant integration (Claude Desktop, OpenCode, Claude Code, VSCode)
   - Transports (v0.4.0+):
     - `stdio` (default): Subprocess per client, no network port
     - `sse`: Persistent daemon on a configurable host:port, shared across clients
     - `streamable-http`: Newer MCP spec transport
   - Usage: Direct integration with MCP-compatible clients

2. **HTTP REST Server** (`pcortex serve`):
   - Purpose: Web applications, HTTP clients (Perplexity, custom integrations)
   - Protocol: HTTP REST API
   - Port: Configurable (default: 8080)
   - Usage: Traditional HTTP API access
| Transport | Best For | Network Port | Startup | Shared State | Setup |
|---|---|---|---|---|---|
| `stdio` | Single client (Claude Desktop) | No | ~2s | No | Simple: `pcortex mcp` |
| `sse` | Multiple clients (OpenCode + Claude Desktop) | Yes | ~2s | Yes | Daemon: `pcortex mcp start -t sse` |
| `streamable-http` | HTTP clients + MCP | Yes | ~2s | Yes | Daemon: `pcortex mcp start -t streamable-http` |
Decision Tree:
- Just using Claude Desktop? → Use `stdio` (default)
- Using OpenCode + Claude Desktop? → Use the `sse` daemon (shared startup cost)
- Remote access needed? → Use `sse` with `--host 0.0.0.0` (Tailscale/SSH tunnel)
- Need HTTP API + MCP? → Use the `streamable-http` daemon
Configuration Prerequisites:
1. **Environment Setup**:

```bash
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # macOS/Linux

# Install in development mode
pip install -e .
```

2. **Create Configuration**:

```bash
# Create configuration from sample
cp config.toml.sample config.toml

# Edit config.toml with your specific settings
```

3. **Build Index**:

```bash
pcortex build --force
```

4. **Test Configuration**:

```bash
# Test MCP server
pcortex mcp  # Should start without errors, Ctrl+C to stop

# Test HTTP server
pcortex serve  # Should show server info, Ctrl+C to stop

# Test query functionality
pcortex query "test search"
```
Common Integration Pattern:
For HTTP integrations (Perplexity, web apps):
```bash
# Start HTTP server
pcortex serve --port 8001
```

```
# Query endpoint
POST http://localhost:8001/prometh_cortex_query
Authorization: Bearer your-secret-token
Content-Type: application/json

{
  "query": "your search query",
  "max_results": 10,
  "filters": {
    "datalake": "notes",
    "tags": ["work"]
  }
}

# Health check
GET http://localhost:8001/prometh_cortex_health
Authorization: Bearer your-secret-token
```

For MCP integrations (Claude Desktop, VSCode):
```json
{
  "mcpServers": {
    "prometh-cortex": {
      "command": "/path/to/your/project/.venv/bin/python",
      "args": ["-m", "prometh_cortex.cli.main", "mcp"],
      "env": {
        "DATALAKE_REPOS": "/path/to/your/notes,/path/to/your/documents,/path/to/your/projects",
        "RAG_INDEX_DIR": "/path/to/index/storage",
        "MCP_PORT": "8080",
        "MCP_HOST": "localhost",
        "MCP_AUTH_TOKEN": "your-secure-token",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
        "MAX_QUERY_RESULTS": "10",
        "CHUNK_SIZE": "512",
        "CHUNK_OVERLAP": "50",
        "VECTOR_STORE_TYPE": "faiss"
      }
    }
  }
}
```

Performance Tuning:
- For Perplexity: Set `max_query_results = 3` in config.toml to avoid timeouts
- For Development: Use the `--reload` flag with `pcortex serve`
- For Production: Use a production WSGI server instead of the development server
Auto-start Script:
Create start_servers.sh for easy management:
```bash
#!/bin/bash
# Kill existing servers
pkill -f "pcortex serve" 2>/dev/null || true
pkill -f "pcortex mcp" 2>/dev/null || true

# Activate virtual environment
source .venv/bin/activate

# Start HTTP server in background
nohup pcortex serve --port 8001 > /tmp/prometh-cortex-http.log 2>&1 &

echo "Prometh-Cortex servers started"
echo "HTTP Server: http://localhost:8001"
echo "MCP Server: Available for stdio connections"
echo "Logs: /tmp/prometh-cortex-http.log"
```

Troubleshooting Checklist:
- ✅ Virtual Environment: Always use absolute paths to `.venv/bin/python`
- ✅ Configuration: Set `datalake.repos` and `storage.rag_index_dir` in config.toml
- ✅ Index Built: Run `pcortex build` before using servers
- ✅ Ports Available: Check port conflicts with `lsof -i :8080`
- ✅ Logs Check: Monitor server logs for configuration errors
- ✅ Path Permissions: Ensure read access to datalake and write access to index directory
```bash
# Clone repository
git clone https://github.com/prometh-sh/prometh-cortex.git
cd prometh-cortex

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install
```

Running tests:

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src/prometh_cortex

# Run specific test types
pytest tests/unit/
pytest tests/integration/
```

Code quality:

```bash
# Format code
black src/ tests/
isort src/ tests/

# Lint code
flake8 src/ tests/

# Type checking
mypy src/
```

- Query Speed: Target <100ms on M1/M2 Mac
- Index Size: Scales to thousands of documents
- Memory Usage: Optimized chunking and streaming processing
- Storage: Efficient FAISS local storage or scalable Qdrant
- Incremental Updates: Only processes changed documents
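One common way to implement the change detection behind incremental updates, sketched here (not necessarily prometh-cortex's exact manifest format), is to record a content hash per file and reindex only files whose hash is new or changed:

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's bytes; identical content always hashes the same."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(paths: list, manifest: dict) -> list:
    """Return files that are new or whose content differs from the manifest.

    The manifest maps file path -> last-seen content hash and is updated
    in place, so a second run over unchanged files returns an empty list.
    """
    changed = []
    for path in paths:
        digest = file_digest(path)
        if manifest.get(str(path)) != digest:
            changed.append(path)
            manifest[str(path)] = digest  # remember for the next run
    return changed
```

On each build, only the paths returned by `changed_files()` would need re-chunking and re-embedding; everything else keeps its existing vectors.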
```
┌─────────────────────┐
│     config.toml     │
└──────────┬──────────┘
           │
┌──────────▼──────────────────┐
│  Datalake Ingest & Parser   │
│  - Markdown files           │
│  - YAML frontmatter         │
└──────────┬──────────────────┘
           │
┌──────────▼──────────────────┐
│  Vector Store / Indexing    │
│  - FAISS (local) or Qdrant  │
│  - Local embedding model    │
│  - Incremental indexing     │
└──────────┬──────────────────┘
           │
┌──────────▼──────────────────┐
│         MCP Server          │
│  - stdio / SSE / HTTP       │
│  - prometh_cortex_query     │
│  - prometh_cortex_health    │
│  - prometh_cortex_sources   │
└──────────┬──────────────────┘
           │
    ┌──────┼──────┐
    │      │      │
  stdio   SSE   HTTP
    │      │      │
 Claude  Multi  REST
Desktop  client  API
         daemon
```
Apache 2.0 License - see LICENSE for details.
We welcome contributions! Please see CONTRIBUTING.md for detailed guidelines.
- Fork the repository
- Create a feature branch: `git checkout -b feature/your-feature-name`
- Make your changes with clear, descriptive commits
- Add tests for new functionality
- Ensure all tests pass: `pytest`
- Format code: `black src/ tests/` and `isort src/ tests/`
- Submit a pull request with a clear description
This project follows our Code of Conduct. By participating, you agree to uphold this code.
Found a security vulnerability? Please see SECURITY.md for responsible disclosure guidelines.
Architecture & Design:
- Memory Preservation Spec (v0.5.0) - Technical specification for preserving memories across force rebuilds
- Unified Collection Spec (v0.3.0+) - Complete technical specification for unified collection with per-source chunking architecture
- Multi-Collection Spec (v0.2.0 - Deprecated) - Legacy multi-collection architecture (archived for reference)
Migration Guides:
- v0.4.0 → v0.5.0 Migration Guide - Memory preservation and improved indexing
- v0.2.0 → v0.3.0 Migration Guide - Step-by-step guide for migrating from multi-collection to unified collection
- v0.1.x → v0.2.0 Migration Guide - Historical migration guide (archived)
Key Improvements in v0.5.0:
- Memory Preservation: Session memories survive `pcortex build --force` and `pcortex rebuild`
- Memory Tool (MCP): `prometh_cortex_memory()` for capturing decisions, patterns, and session summaries
- Dual Backend Support: FAISS (sidecar JSON) and Qdrant (filter-based) memory preservation
- Smart Metadata Retrieval: Handle both parent document IDs and chunk IDs seamlessly
Key Improvements in v0.4.0:
- SSE/HTTP Transport: Run MCP as a persistent daemon shared across clients
- OpenCode Support: First-class config generation for OpenCode
- Auto Config: `pcortex mcp init <target>` generates configs for Claude, OpenCode, VSCode, Codex, Perplexity
- Remote Access: SSE daemon with `--host 0.0.0.0` for Tailscale/multi-machine setups
Key Improvements in v0.3.0:
- Unified Collection: Single FAISS/Qdrant index instead of multiple
- Per-Source Chunking: Different chunk sizes per document source in unified index
- Topic-Based Queries: Query across document types naturally
- Better Performance: ~300ms queries (vs ~500ms multi-collection)
- Lower Memory: Single index (vs 3-5x for multi-collection)
- Documentation: See the /docs directory for detailed guides
- Issues: Report bugs or request features via GitHub Issues
- Discussions: Ask questions or share ideas in GitHub Discussions
- Security: For security issues, see SECURITY.md
- PyPI Package: https://pypi.org/project/prometh-cortex/
- Source Code: https://github.com/prometh-sh/prometh-cortex
- Changelog: See CHANGELOG.md for version history
We encourage community participation! Whether you're fixing bugs, adding features, improving documentation, or helping others, all contributions are valued.
Made with ❤️ for the knowledge management community