CodeSeeker Storage Configuration

NEW: CodeSeeker now works out of the box with zero setup!

By default, CodeSeeker uses embedded storage (SQLite + Graphology + LRU-cache) that requires no Docker or external databases. Just run `npm install` and go.

CodeSeeker supports two storage modes to fit different use cases:

Storage Modes

Mode	Setup	Best For
Embedded (default)	Zero setup - just `npm install`	Personal use, small-medium projects, getting started
Server	Docker or manual setup	Large codebases, teams, production environments

Embedded Mode (Default)

Zero configuration required. Works immediately after installation.

What It Uses

Component	Technology	Persistence
Vector Search	SQLite + better-sqlite3	`~/.codeseeker/data/vectors.db`
Graph Database	Graphology (in-memory)	`~/.codeseeker/data/graph.json`
Cache	LRU-cache (in-memory)	`~/.codeseeker/data/cache.json`
Projects	SQLite	`~/.codeseeker/data/projects.db`

Data Location

Data is stored in platform-specific locations:

Platform	Location
Windows	`%APPDATA%\codeseeker\data\`
macOS	`~/Library/Application Support/codeseeker/data/`
Linux	`~/.local/share/codeseeker/data/`
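The table above can be read as a small lookup. The following is a minimal sketch of such a lookup, not CodeSeeker's actual resolution code: `defaultDataDir` and `PlatformInfo` are hypothetical names, the `AppData\Roaming` fallback is an assumption for the case where `%APPDATA%` is unset, and treating unknown platforms like Linux is also an assumption.

```typescript
// Hypothetical helper that mirrors the Data Location table.
// CodeSeeker's real logic may differ.
interface PlatformInfo {
  platform: string;   // e.g. the value of process.platform
  homeDir: string;    // e.g. the value of os.homedir()
  appData?: string;   // value of %APPDATA% on Windows, if set
}

function defaultDataDir(info: PlatformInfo): string {
  if (info.platform === "win32") {
    // Assumption: fall back to the usual roaming profile path when %APPDATA% is unset.
    const base = info.appData ?? `${info.homeDir}\\AppData\\Roaming`;
    return `${base}\\codeseeker\\data`;
  }
  if (info.platform === "darwin") {
    return `${info.homeDir}/Library/Application Support/codeseeker/data`;
  }
  // Linux, and (by assumption) any other platform.
  return `${info.homeDir}/.local/share/codeseeker/data`;
}

console.log(defaultDataDir({ platform: "linux", homeDir: "/home/ada" }));
```

In a Node.js program the inputs would typically come from `process.platform`, `os.homedir()`, and `process.env.APPDATA`.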

Features

Automatic persistence: In-memory stores (graph, cache) are flushed to disk every 30 seconds and on exit; SQLite stores (vectors, projects) are written automatically
Crash recovery: Uses SQLite WAL mode for durability
No external dependencies: Everything runs in-process
Fast startup: No network connections to establish
Offline capable: Works without internet

Customizing Data Location

Set the CODESEEKER_DATA_DIR environment variable:

# Windows (PowerShell)
$env:CODESEEKER_DATA_DIR = "D:\codeseeker-data"

# macOS/Linux
export CODESEEKER_DATA_DIR="/custom/path/to/data"

Or create a config file:

// ~/.codeseeker/storage.json (Windows: %APPDATA%\codeseeker\storage.json)
{
  "mode": "embedded",
  "dataDir": "/custom/path/to/data",
  "flushIntervalSeconds": 60
}
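To make the interplay between the config file and the environment variable concrete, here is a rough sketch of how an embedded-mode config could be parsed and validated. `loadEmbeddedConfig` is a hypothetical helper, not part of CodeSeeker. Two things are assumptions: that `CODESEEKER_DATA_DIR` takes precedence over `dataDir` in the file, and that a missing `flushIntervalSeconds` falls back to the documented 30-second interval.

```typescript
// Hypothetical sketch; field names come from the storage.json example above.
interface EmbeddedConfig {
  mode: "embedded";
  dataDir?: string;
  flushIntervalSeconds: number;
}

function loadEmbeddedConfig(
  jsonText: string,
  env: Record<string, string | undefined> = {},
): EmbeddedConfig {
  const raw = JSON.parse(jsonText) as Record<string, unknown>;

  if (raw.mode !== undefined && raw.mode !== "embedded") {
    throw new Error(`expected mode "embedded", got ${JSON.stringify(raw.mode)}`);
  }

  // Assumption: default to the documented 30-second flush interval.
  const interval = raw.flushIntervalSeconds ?? 30;
  if (typeof interval !== "number" || !Number.isFinite(interval) || interval <= 0) {
    throw new Error("flushIntervalSeconds must be a positive number");
  }

  const fileDir = typeof raw.dataDir === "string" ? raw.dataDir : undefined;
  // Assumption: the environment variable wins over the config file.
  const dataDir = env.CODESEEKER_DATA_DIR ?? fileDir;

  return { mode: "embedded", dataDir, flushIntervalSeconds: interval };
}

const example = loadEmbeddedConfig(
  '{"mode":"embedded","dataDir":"/custom/path/to/data","flushIntervalSeconds":60}',
);
console.log(example);
```

Note that `JSON.parse` rejects comments, so a real `storage.json` should not include the `//` path comment shown in the example above.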

Server Mode (Advanced)

For large codebases (100K+ files), teams, or production environments.

Note: Most users don't need server mode. Start with embedded mode and upgrade only if you hit performance limits or need multi-user support.

What It Uses

Component	Technology	Purpose
Vector Search	PostgreSQL + pgvector	Scalable vector similarity search
Graph Database	Neo4j	Powerful graph queries with Cypher
Cache	Redis	Distributed caching
Projects	PostgreSQL	Relational data with ACID

Setup Options (Choose One)

Option	Best For	Documentation
Manual Installation	Recommended for most users	Database Scripts
Kubernetes	Production deployments	Kubernetes Templates
Docker Compose	Quick testing only (experimental)	See below

Manual Installation (Recommended)

Follow the Database Scripts Guide to install PostgreSQL, Neo4j, and Redis manually. This gives you the most control and is recommended for production use.

Docker Compose (Experimental)

⚠️ Docker Compose is experimental and provided for quick local testing only. For production, use manual installation or Kubernetes.

# Start database services only (experimental)
docker-compose up -d database redis neo4j

# Verify services are running
docker-compose ps

Configuration

Create ~/.codeseeker/storage.json:

{
  "mode": "server",
  "server": {
    "postgres": {
      "host": "localhost",
      "port": 5432,
      "database": "codeseeker",
      "user": "codeseeker",
      "password": "your-password"
    },
    "neo4j": {
      "uri": "bolt://localhost:7687",
      "user": "neo4j",
      "password": "your-password"
    },
    "redis": {
      "host": "localhost",
      "port": 6379,
      "password": "optional-password"
    }
  }
}

Environment Variables

You can also configure via environment variables:

# Storage mode
export CODESEEKER_STORAGE_MODE=server

# PostgreSQL
export CODESEEKER_PG_HOST=localhost
export CODESEEKER_PG_PORT=5432
export CODESEEKER_PG_DATABASE=codeseeker
export CODESEEKER_PG_USER=codeseeker
export CODESEEKER_PG_PASSWORD=secret

# Neo4j
export CODESEEKER_NEO4J_URI=bolt://localhost:7687
export CODESEEKER_NEO4J_USER=neo4j
export CODESEEKER_NEO4J_PASSWORD=secret

# Redis
export CODESEEKER_REDIS_HOST=localhost
export CODESEEKER_REDIS_PORT=6379
export CODESEEKER_REDIS_PASSWORD=optional
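The environment variables map one-to-one onto the fields of the server-mode `storage.json`. The sketch below shows one way that mapping could look; it is an illustration, not CodeSeeker's loader. `serverConfigFromEnv` and `parsePort` are hypothetical names, and the fallback values (`localhost`, `5432`, `6379`, and so on) are assumptions taken from the examples in this document.

```typescript
// Hypothetical mapping from CODESEEKER_* variables to the storage.json shape.
type Env = Record<string, string | undefined>;

interface ServerConfig {
  mode: "server";
  server: {
    postgres: { host: string; port: number; database: string; user: string; password?: string };
    neo4j: { uri: string; user: string; password?: string };
    redis: { host: string; port: number; password?: string };
  };
}

function parsePort(value: string | undefined, fallback: number, name: string): number {
  if (value === undefined || value === "") return fallback;
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${name} must be a port between 1 and 65535, got "${value}"`);
  }
  return port;
}

function serverConfigFromEnv(env: Env): ServerConfig {
  if (env.CODESEEKER_STORAGE_MODE !== "server") {
    throw new Error('CODESEEKER_STORAGE_MODE must be "server" to build a server config');
  }
  return {
    mode: "server",
    server: {
      postgres: {
        host: env.CODESEEKER_PG_HOST ?? "localhost",           // assumed default
        port: parsePort(env.CODESEEKER_PG_PORT, 5432, "CODESEEKER_PG_PORT"),
        database: env.CODESEEKER_PG_DATABASE ?? "codeseeker",  // assumed default
        user: env.CODESEEKER_PG_USER ?? "codeseeker",          // assumed default
        password: env.CODESEEKER_PG_PASSWORD,
      },
      neo4j: {
        uri: env.CODESEEKER_NEO4J_URI ?? "bolt://localhost:7687", // assumed default
        user: env.CODESEEKER_NEO4J_USER ?? "neo4j",               // assumed default
        password: env.CODESEEKER_NEO4J_PASSWORD,
      },
      redis: {
        host: env.CODESEEKER_REDIS_HOST ?? "localhost",        // assumed default
        port: parsePort(env.CODESEEKER_REDIS_PORT, 6379, "CODESEEKER_REDIS_PORT"),
        password: env.CODESEEKER_REDIS_PASSWORD,               // optional
      },
    },
  };
}

console.log(serverConfigFromEnv({ CODESEEKER_STORAGE_MODE: "server" }));
```

Validating the port values up front turns a typo such as `CODESEEKER_PG_PORT=54e2` into an immediate, readable error instead of a later connection failure.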

PostgreSQL Setup

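If not using Docker, install PostgreSQL with the pgvector extension, then run the following in psql as a superuser:

-- Create database
CREATE DATABASE codeseeker;

-- Connect to the new database (extensions are installed per database)
\c codeseeker

-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create user
CREATE USER codeseeker WITH PASSWORD 'your-password';
GRANT ALL PRIVILEGES ON DATABASE codeseeker TO codeseeker;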

Neo4j Setup

If not using Docker, install Neo4j Community Edition:

Download from https://neo4j.com/download/
Start the service
Set initial password via Neo4j Browser

Redis Setup

If not using Docker:

# macOS
brew install redis
brew services start redis

# Ubuntu/Debian
sudo apt install redis-server
sudo systemctl start redis-server

Checking Storage Status

# Check current storage mode and health
codeseeker storage status

# Test server connectivity (server mode)
codeseeker storage test

Migrating Between Modes

Embedded to Server

Configure server mode in storage.json
Run codeseeker init to re-index your project
Existing embedded data remains in place as backup

Server to Embedded

Change mode to embedded in storage.json
Run codeseeker init to re-index your project
Server data remains intact for future use

Persistence Details

Embedded Mode Persistence

Store	Format	Flush Interval	Durability
Vectors	SQLite WAL	Automatic	High (WAL)
Graph	JSON	30 seconds	Good
Cache	JSON	30 seconds	Good
Projects	SQLite WAL	Automatic	High (WAL)

Flush Behavior

Automatic flush: Every 30 seconds (configurable)
Graceful shutdown: Flushes before exit
Crash recovery: SQLite WAL protects vector/project data
JSON stores: May lose up to 30 seconds of data on crash
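The flush behavior described above can be pictured as a timer plus a final flush on shutdown. The class below is a minimal sketch of that pattern, not CodeSeeker's implementation: `JsonStoreFlusher` is a hypothetical name, and skipping the write when nothing has changed (the "dirty" flag) is an assumption added for illustration.

```typescript
// Hypothetical periodic flusher for an in-memory store persisted as JSON.
class JsonStoreFlusher {
  private dirty = false;
  private timer: ReturnType<typeof setInterval> | undefined;

  constructor(
    private readonly writeSnapshot: () => void, // e.g. serialize the graph to graph.json
    private readonly intervalSeconds = 30,      // documented default interval
  ) {}

  // Call after every in-memory change.
  markDirty(): void {
    this.dirty = true;
  }

  // Writes a snapshot only if something changed; returns whether a write happened.
  flush(): boolean {
    if (!this.dirty) return false;
    this.writeSnapshot();
    this.dirty = false; // stays dirty if writeSnapshot throws
    return true;
  }

  // Automatic flush: runs every intervalSeconds.
  start(): void {
    if (this.timer !== undefined) return;
    this.timer = setInterval(() => this.flush(), this.intervalSeconds * 1000);
  }

  // Graceful shutdown: stop the timer, then flush one last time.
  stop(): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
    this.flush();
  }
}

let writes = 0;
const flusher = new JsonStoreFlusher(() => { writes += 1; }, 10);
flusher.markDirty();
flusher.stop(); // no timer was started; this only performs the final flush
console.log(writes);
```

With this pattern, anything changed after the last timer tick exists only in memory until the next tick, which is why a crash can cost up to one interval of JSON-store changes.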

Customizing Flush Interval

{
  "mode": "embedded",
  "flushIntervalSeconds": 10
}

Troubleshooting

"Cannot find module 'better-sqlite3'"

Rebuild native modules:

npm rebuild better-sqlite3

"Database is locked"

Only one CodeSeeker process can access embedded storage at a time. Find any other running CodeSeeker processes and stop them:

# Find CodeSeeker processes
ps aux | grep codeseeker

# Or on Windows
tasklist | findstr codeseeker

Server mode connection errors

Verify services are running
Check firewall settings
Verify credentials in config

Test connectivity:

# PostgreSQL
psql -h localhost -U codeseeker -d codeseeker

# Redis
redis-cli ping

# Neo4j
cypher-shell -u neo4j -p password

Performance Comparison

Metric	Embedded	Server
Startup time	~100ms	~500ms+
Vector search (1K docs)	~50ms	~20ms
Vector search (100K docs)	~500ms	~50ms
Graph traversal	Good	Excellent
Concurrent users	1	Many
Memory usage	Low	Variable

Recommendation: Start with embedded mode. Switch to server mode when you have:

100K+ files to index
Multiple team members
High query volume
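The recommendation can be expressed as a simple decision rule. This is only an illustrative sketch: `recommendStorageMode` and `Workload` are hypothetical names, and because the document gives no numeric threshold for "high query volume", the sketch takes it as a yes/no input.

```typescript
// Hypothetical decision helper based on the criteria listed above.
interface Workload {
  fileCount: number;        // files to index
  teamMembers: number;      // people sharing the same index
  highQueryVolume: boolean; // no numeric threshold is documented
}

function recommendStorageMode(w: Workload): "embedded" | "server" {
  const largeCodebase = w.fileCount >= 100000; // "100K+ files"
  const multiUser = w.teamMembers > 1;         // "Multiple team members"
  return largeCodebase || multiUser || w.highQueryVolume ? "server" : "embedded";
}

console.log(recommendStorageMode({ fileCount: 2500, teamMembers: 1, highQueryVolume: false }));
```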

API Usage

import { getStorageProvider } from '@codeseeker/storage';

// Get the storage provider (auto-configured)
const storage = await getStorageProvider();

// Access individual stores
const vectors = storage.getVectorStore();
const graph = storage.getGraphStore();
const cache = storage.getCacheStore();
const projects = storage.getProjectStore();

// Check health
const health = await storage.healthCheck();
console.log('Storage healthy:', health.healthy);

// Manual flush (usually not needed)
await storage.flushAll();

// Cleanup on shutdown
await storage.closeAll();
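When the provider is used in a script, it can help to guarantee that `closeAll()` runs even if the work in between throws. The wrapper below is one possible sketch. `withStorage` and the `StorageLike` interface are hypothetical: the interface declares only the three methods shown above, and their return types are assumptions inferred from how the example awaits them.

```typescript
// Hypothetical subset of the provider API used in the example above.
interface StorageLike {
  healthCheck(): Promise<{ healthy: boolean }>;
  flushAll(): Promise<void>;
  closeAll(): Promise<void>;
}

// Runs `work` against a healthy provider, flushes on success, and always closes.
async function withStorage<S extends StorageLike, T>(
  storage: S,
  work: (storage: S) => Promise<T>,
): Promise<T> {
  try {
    const health = await storage.healthCheck();
    if (!health.healthy) {
      throw new Error("storage is not healthy");
    }
    const result = await work(storage);
    await storage.flushAll();
    return result;
  } finally {
    // Cleanup runs whether or not the work succeeded.
    await storage.closeAll();
  }
}
```

A call site might then look like `await withStorage(await getStorageProvider(), async (s) => { /* use s.getVectorStore() etc. */ })`, assuming the real provider satisfies this interface.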
