-
Notifications
You must be signed in to change notification settings - Fork 1
Configuration and Models
SourceBridge reads configuration from a TOML file and environment variables. Environment variables use the SOURCEBRIDGE_ prefix and override file values. The config file is searched in order: ./config.toml, $HOME/.config/sourcebridge/config.toml, /etc/sourcebridge/config.toml.
See config.toml.example for an annotated example.
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_SERVER_HTTP_PORT |
server.http_port |
8080 |
API server HTTP port |
SOURCEBRIDGE_SERVER_GRPC_PORT |
server.grpc_port |
50051 |
gRPC port for API↔worker communication |
SOURCEBRIDGE_SERVER_PUBLIC_BASE_URL |
server.public_base_url |
http://localhost:8080 |
Public-facing URL (used in OAuth callbacks and generated links) |
SOURCEBRIDGE_SERVER_CORS_ORIGINS |
server.cors_origins |
http://localhost:3000 |
Comma-separated allowed CORS origins |
SOURCEBRIDGE_SERVER_MAX_BODY_SIZE |
server.max_body_size |
10485760 (10 MB) |
Max HTTP request body size in bytes |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_STORAGE_SURREAL_MODE |
storage.surreal_mode |
embedded |
embedded or external
|
SOURCEBRIDGE_STORAGE_SURREAL_URL |
storage.surreal_url |
ws://localhost:8000/rpc |
SurrealDB WebSocket URL (external mode) |
SOURCEBRIDGE_STORAGE_SURREAL_NAMESPACE |
storage.surreal_namespace |
sourcebridge |
SurrealDB namespace |
SOURCEBRIDGE_STORAGE_SURREAL_DATABASE |
storage.surreal_database |
sourcebridge |
SurrealDB database name |
SOURCEBRIDGE_STORAGE_SURREAL_USER |
storage.surreal_user |
root |
SurrealDB username |
SOURCEBRIDGE_STORAGE_SURREAL_PASS |
storage.surreal_pass |
root |
SurrealDB password — change in production |
SOURCEBRIDGE_STORAGE_SURREAL_DATA_PATH |
storage.surreal_data_path |
./surrealdb-data |
Data directory for embedded mode |
SOURCEBRIDGE_STORAGE_REDIS_MODE |
storage.redis_mode |
memory |
memory or external
|
SOURCEBRIDGE_STORAGE_REDIS_URL |
storage.redis_url |
— | Redis URL (external mode) |
SOURCEBRIDGE_STORAGE_REPO_CACHE_PATH |
storage.repo_cache_path |
./repo-cache |
Local clone cache for indexed repos |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_SECURITY_JWT_SECRET |
security.jwt_secret |
dev-secret-change-in-production |
JWT signing secret — required in production |
SOURCEBRIDGE_SECURITY_JWT_TTL_MINUTES |
security.jwt_ttl_minutes |
1440 |
JWT expiry (24 hours) |
SOURCEBRIDGE_SECURITY_ENCRYPTION_KEY |
security.encryption_key |
— | AES-256 key for field-level encryption (living wiki secrets) |
SOURCEBRIDGE_SECURITY_GRPC_AUTH_SECRET |
security.grpc_auth_secret |
— | Shared secret for API↔worker gRPC auth — required in production |
SOURCEBRIDGE_SECURITY_MODE |
security.mode |
oss |
oss or enterprise
|
SOURCEBRIDGE_SECURITY_CSRF_ENABLED |
security.csrf_enabled |
true |
CSRF protection |
SOURCEBRIDGE_SECURITY_GITHUB_WEBHOOK_SECRET |
security.github_webhook_secret |
— | HMAC secret for GitHub webhook validation |
SOURCEBRIDGE_SECURITY_GITLAB_WEBHOOK_SECRET |
security.gitlab_webhook_secret |
— | HMAC secret for GitLab webhook validation |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_SECURITY_OIDC_ISSUER_URL |
security.oidc.issuer_url |
— | OIDC provider issuer URL |
SOURCEBRIDGE_SECURITY_OIDC_CLIENT_ID |
security.oidc.client_id |
— | OAuth client ID |
SOURCEBRIDGE_SECURITY_OIDC_CLIENT_SECRET |
security.oidc.client_secret |
— | OAuth client secret |
SOURCEBRIDGE_SECURITY_OIDC_REDIRECT_URL |
security.oidc.redirect_url |
— | OAuth redirect/callback URL |
SOURCEBRIDGE_SECURITY_OIDC_SCOPES |
security.oidc.scopes |
— | Comma-separated OIDC scopes |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_LLM_PROVIDER |
llm.provider |
anthropic |
LLM provider (see table below) |
SOURCEBRIDGE_LLM_BASE_URL |
llm.base_url |
— | API endpoint (required for local providers) |
SOURCEBRIDGE_LLM_API_KEY |
llm.api_key |
— | API key (cloud providers) |
SOURCEBRIDGE_LLM_SUMMARY_MODEL |
llm.summary_model |
claude-sonnet-4-20250514 |
Default model for analysis |
SOURCEBRIDGE_LLM_REVIEW_MODEL |
llm.review_model |
claude-sonnet-4-20250514 |
Review operations |
SOURCEBRIDGE_LLM_ASK_MODEL |
llm.ask_model |
claude-sonnet-4-20250514 |
Discussion/QA operations |
SOURCEBRIDGE_LLM_KNOWLEDGE_MODEL |
llm.knowledge_model |
— | Knowledge generation (cliff notes, etc.) |
SOURCEBRIDGE_LLM_ARCHITECTURE_DIAGRAM_MODEL |
llm.architecture_diagram_model |
— | Architecture diagram generation |
SOURCEBRIDGE_LLM_REPORT_MODEL |
llm.report_model |
— | Report generation (enterprise) |
SOURCEBRIDGE_LLM_TIMEOUT_SECONDS |
llm.timeout_seconds |
900 |
Per-call LLM timeout (15 minutes) |
SOURCEBRIDGE_LLM_ADVANCED_MODE |
llm.advanced_mode |
false |
Enable per-operation model selection |
Per-operation model overrides (when advanced_mode = true) are an enterprise capability (per_op_models).
| Provider | Config value | Notes |
|---|---|---|
| Anthropic | anthropic |
Recommended for output quality. Claude Sonnet 4, Haiku, etc. |
| OpenAI | openai |
GPT-4o, GPT-4o-mini, etc. |
| Google Gemini | gemini |
Gemini 2.5 Pro, Flash, etc. |
| OpenRouter | openrouter |
100+ models behind one API key |
| Ollama | ollama |
Local. Set base_url to http://localhost:11434/v1
|
| vLLM | vllm |
Local, high-throughput PagedAttention |
| llama.cpp | llama-cpp |
Local, CPU/GPU, GGUF models |
| SGLang | sglang |
Local, RadixAttention |
| LM Studio | lmstudio |
Local, desktop GUI, OpenAI-compatible API |
All local providers expose an OpenAI-compatible API. Set base_url to the local endpoint.
Admin → Comprehension → Model Registry (/admin/comprehension/models) stores per-model metadata used by the Living Wiki quality validators. This is separate from the active-model selection at Admin → LLM (/admin/llm): the LLM page controls which model runs; the Model Registry controls how strictly the quality gates evaluate its output.
Each model carries a qualityGateTier that Living Wiki uses to pick appropriate gate thresholds:
| Tier | Typical models | Pattern-match rule |
|---|---|---|
frontier |
Claude (all), GPT-4o, GPT-4.1, o1, o3, Gemini Pro/Ultra | Anthropic provider (all); OpenAI gpt-4*, o1, o3; Gemini pro/ultra
|
mid |
gpt-4o-mini, o1-mini, Gemini Flash, open-weights ≥70B | OpenAI *-mini/*-nano; Gemini flash; size token ≥70B |
local |
Ollama-served models, open-weights <70B (qwen3:32b, llama3:8b, phi4, etc.) | Local inference providers (ollama, vllm, llama-cpp, sglang, lmstudio); size token <70B |
The default OSS install (config.toml.example ships qwen3:32b) resolves to TierLocal. Frontier gates (strict citation density, vagueness) are relaxed or demoted to warnings for local-tier runs, so a fresh install does not produce "0 pages generated".
When a model is not in the registry, ClassifyByPattern (internal/llm/modeltier/classify.go) runs the provider fast-path first, then the size parser, then family-name heuristics. Unknown providers default to TierLocal.
To override the pattern-match result for a specific model:
- Go to Admin → Comprehension → Model Registry.
- Create or update an entry using the model string alone as the key (e.g.
qwen3:32b,llama3.1:70b). Model IDs are stored and looked up in lowercase. - Set
qualityGateTiertofrontier,mid, orlocal.
The registry key is the model string, not provider/model — if two providers serve the same model name, register them under distinct IDs (e.g. openrouter/anthropic/claude-3-5-sonnet).
After a cold-start run, grep the API logs:
kubectl -n sourcebridge logs -l app=sourcebridge-api --tail=500 \
| grep "resolved quality-gate tier"Each line includes tier, source (registry or pattern), provider, and model.
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_WORKER_ADDRESS |
worker.address |
localhost:50051 |
gRPC address of the Python worker |
The worker has its own env vars prefixed with SOURCEBRIDGE_WORKER_:
| Variable | Description |
|---|---|
SOURCEBRIDGE_WORKER_GRPC_PORT |
gRPC listen port (default 50051) |
SOURCEBRIDGE_WORKER_LLM_PROVIDER |
LLM provider for the worker (can differ from API) |
SOURCEBRIDGE_WORKER_LLM_BASE_URL |
Worker LLM API endpoint |
SOURCEBRIDGE_WORKER_LLM_MODEL |
Worker LLM model name |
SOURCEBRIDGE_WORKER_LLM_API_KEY |
Worker LLM API key |
SOURCEBRIDGE_WORKER_EMBEDDING_PROVIDER |
Embedding provider |
SOURCEBRIDGE_WORKER_EMBEDDING_BASE_URL |
Embedding API endpoint |
SOURCEBRIDGE_WORKER_EMBEDDING_MODEL |
Embedding model (default nomic-embed-text) |
SOURCEBRIDGE_WORKER_EMBEDDING_DIMENSION |
Embedding dimension (default 768) |
SOURCEBRIDGE_WORKER_GRPC_AUTH_SECRET |
Must match SOURCEBRIDGE_SECURITY_GRPC_AUTH_SECRET
|
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_INDEXING_MAX_FILE_SIZE_BYTES |
indexing.max_file_size_bytes |
1048576 (1 MB) |
Skip files larger than this |
SOURCEBRIDGE_INDEXING_MAX_CONCURRENCY |
indexing.max_concurrency |
8 |
Parallel file parsing goroutines |
SOURCEBRIDGE_INDEXING_SCIP_ENABLED |
indexing.scip_enabled |
true |
SCIP-based precise indexing |
Default ignore globs: node_modules/**, dist/**, .git/**, vendor/**, __pycache__/**.
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_MCP_ENABLED |
mcp.enabled |
false |
Enable the MCP server |
SOURCEBRIDGE_MCP_REPOS |
mcp.repos |
— | Comma-separated repo IDs to expose (empty = all) |
SOURCEBRIDGE_MCP_SESSION_TTL |
mcp.session_ttl |
3600 |
Idle session reap time in seconds |
SOURCEBRIDGE_MCP_KEEPALIVE |
mcp.keepalive |
30 |
SSE keepalive ping interval in seconds |
SOURCEBRIDGE_MCP_MAX_SESSIONS |
mcp.max_sessions |
100 |
Max concurrent sessions (0 = unlimited) |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_QA_SERVER_SIDE_ENABLED |
qa.server_side_enabled |
false |
Enable server-side deep-QA orchestrator |
SOURCEBRIDGE_QA_LOCAL_FAST_MODE_SUBPROCESS |
qa.local_fast_mode_subprocess |
true |
Keep subprocess QA path for local dev |
SOURCEBRIDGE_QA_QUESTION_MAX_BYTES |
qa.question_max_bytes |
4096 |
Max question length |
SOURCEBRIDGE_QA_SESSION_TOKENS_PER_HOUR |
qa.session_tokens_per_hour |
100000 |
Token budget per session per hour (0 = disabled) |
SOURCEBRIDGE_QA_REPO_TOKENS_PER_DAY |
qa.repo_tokens_per_day |
1000000 |
Token budget per repo per day |
SOURCEBRIDGE_QA_DEPLOYMENT_TOKENS_PER_DAY |
qa.deployment_tokens_per_day |
10000000 |
Deployment-level token circuit breaker |
SOURCEBRIDGE_QA_SYNTHESIS_LANE |
qa.synthesis_lane |
4 |
Concurrent synthesis calls against the worker |
SOURCEBRIDGE_QA_AGENTIC_RETRIEVAL_ENABLED |
qa.agentic_retrieval_enabled |
false |
Enable agentic retrieval loop |
SOURCEBRIDGE_QA_AGENTIC_RETRIEVAL_CANARY_PCT |
qa.agentic_retrieval_canary_pct |
0 |
Staged rollout percentage (0–100) |
SOURCEBRIDGE_QA_PROMPT_CACHING_ENABLED |
qa.prompt_caching_enabled |
true |
Anthropic prompt-cache markers |
SOURCEBRIDGE_QA_SMART_CLASSIFIER_ENABLED |
qa.smart_classifier_enabled |
false |
LLM-backed question profiler |
SOURCEBRIDGE_QA_QUERY_DECOMPOSITION_ENABLED |
qa.query_decomposition_enabled |
false |
Multi-hop query decomposition |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_COMPREHENSION_MAX_CONCURRENCY |
comprehension.max_concurrency |
3 |
Max parallel LLM jobs |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_TRASH_ENABLED |
trash.enabled |
true |
Enable soft-delete |
SOURCEBRIDGE_TRASH_RETENTION_DAYS |
trash.retention_days |
30 |
Days before permanent deletion (1–365) |
SOURCEBRIDGE_TRASH_SWEEP_INTERVAL |
trash.sweep_interval_sec |
21600 |
Sweep interval in seconds (6 hours) |
SOURCEBRIDGE_TRASH_SWEEP_MAX_BATCH |
trash.max_batch_size |
500 |
Max items per sweep pass |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_LIVING_WIKI_ENABLED |
living_wiki.enabled |
false |
Enable living-wiki feature |
SOURCEBRIDGE_LIVING_WIKI_WORKER_COUNT |
living_wiki.worker_count |
4 |
Goroutines draining the dispatcher queue |
SOURCEBRIDGE_LIVING_WIKI_EVENT_TIMEOUT |
living_wiki.event_timeout |
5m |
Max duration per event handler |
SOURCEBRIDGE_LIVING_WIKI_SCHEDULER_INTERVAL |
living_wiki.scheduler_interval |
15m |
Default regen frequency per repo |
SOURCEBRIDGE_LIVING_WIKI_MAX_CONCURRENT_JOBS_PER_TENANT |
living_wiki.max_concurrent_jobs_per_tenant |
5 |
Per-tenant concurrency cap |
SOURCEBRIDGE_LIVING_WIKI_CONFLUENCE_WEBHOOK_SECRET |
living_wiki.confluence_webhook_secret |
— | HMAC secret for Confluence webhooks |
SOURCEBRIDGE_LIVING_WIKI_NOTION_WEBHOOK_SECRET |
living_wiki.notion_webhook_secret |
— | Reserved for Notion webhook validation |
SOURCEBRIDGE_LIVING_WIKI_KILL_SWITCH |
— | false |
Bypass living-wiki without redeploying |
| Variable | Config key | Default | Description |
|---|---|---|---|
SOURCEBRIDGE_GIT_DEFAULT_TOKEN |
git.default_token |
— | PAT used when no per-repo token is provided |
SOURCEBRIDGE_GIT_SSH_KEY_PATH |
git.ssh_key_path |
— | Path to SSH private key for SSH clone URLs |
| Variable | Description |
|---|---|
SOURCEBRIDGE_TELEMETRY |
Set to off to disable telemetry. DO_NOT_TRACK=1 also works |
SOURCEBRIDGE_TELEMETRY_PLATFORM |
Override the auto-detected platform string (useful for CI: set to test) |
These are read directly from the environment by the comprehension subsystem:
| Variable | Default | Description |
|---|---|---|
SOURCEBRIDGE_SELECTIVE_INVALIDATION |
true |
Delta-based artifact invalidation on reindex |
SOURCEBRIDGE_SELECTIVE_INVALIDATION_MAX_CHANGES |
200 |
Fall back to blanket-stale above this threshold |
SOURCEBRIDGE_DELTA_REGEN_MODE |
off |
off, shadow, or live auto-regen |
SOURCEBRIDGE_DELTA_REGEN_MAX_PER_INDEX |
5 |
Max artifacts auto-regenerated per reindex |
SOURCEBRIDGE_DELTA_REGEN_MAX_PER_REPO_PER_HOUR |
20 |
Rolling-hour cap per repo |
SourceBridge is open-source, licensed under AGPL-3.0.
Repository · Issues · Discussions · CHANGELOG