Configuration and Models

SourceBridge reads configuration from a TOML file and environment variables. Environment variables use the `SOURCEBRIDGE_` prefix and override file values. The config file is searched for in this order: `./config.toml`, `$HOME/.config/sourcebridge/config.toml`, `/etc/sourcebridge/config.toml`.

See `config.toml.example` for an annotated example.
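
The lookup rules above can be sketched in a few lines of Python. This is an illustration only, not SourceBridge's loader: the helper names (`candidate_paths`, `env_name`, `resolve`) are hypothetical, and the sketch assumes the first existing file wins. The key-to-variable rule shown here (dots to underscores, upper-cased) matches most rows in the tables below but not all of them (for example, `trash.sweep_interval_sec` is listed as `SOURCEBRIDGE_TRASH_SWEEP_INTERVAL`), so the tables remain the reference.

```python
from pathlib import PurePosixPath

ENV_PREFIX = "SOURCEBRIDGE_"


def candidate_paths(home: str) -> list[PurePosixPath]:
    """Config file locations in the documented search order."""
    return [
        PurePosixPath("config.toml"),  # ./config.toml
        PurePosixPath(home) / ".config" / "sourcebridge" / "config.toml",
        PurePosixPath("/etc/sourcebridge/config.toml"),
    ]


def env_name(config_key: str) -> str:
    """Assumed naming rule: 'server.http_port' -> 'SOURCEBRIDGE_SERVER_HTTP_PORT'."""
    return ENV_PREFIX + config_key.replace(".", "_").upper()


def resolve(config_key: str, file_values: dict, env: dict, default=None):
    """An environment variable, when present, overrides the file value."""
    name = env_name(config_key)
    if name in env:
        return env[name]
    return file_values.get(config_key, default)
```

For example, `resolve("server.http_port", {"server.http_port": 8080}, {"SOURCEBRIDGE_SERVER_HTTP_PORT": "9090"})` returns the environment value. Environment values arrive as strings, so a real loader would also need to convert types.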

Server

Variable	Config key	Default	Description
`SOURCEBRIDGE_SERVER_HTTP_PORT`	`server.http_port`	`8080`	API server HTTP port
`SOURCEBRIDGE_SERVER_GRPC_PORT`	`server.grpc_port`	`50051`	gRPC port for API↔worker communication
`SOURCEBRIDGE_SERVER_PUBLIC_BASE_URL`	`server.public_base_url`	`http://localhost:8080`	Public-facing URL (used in OAuth callbacks and generated links)
`SOURCEBRIDGE_SERVER_CORS_ORIGINS`	`server.cors_origins`	`http://localhost:3000`	Comma-separated allowed CORS origins
`SOURCEBRIDGE_SERVER_MAX_BODY_SIZE`	`server.max_body_size`	`10485760` (10 MB)	Max HTTP request body size in bytes

Storage

Variable	Config key	Default	Description
`SOURCEBRIDGE_STORAGE_SURREAL_MODE`	`storage.surreal_mode`	`embedded`	`embedded` or `external`
`SOURCEBRIDGE_STORAGE_SURREAL_URL`	`storage.surreal_url`	`ws://localhost:8000/rpc`	SurrealDB WebSocket URL (external mode)
`SOURCEBRIDGE_STORAGE_SURREAL_NAMESPACE`	`storage.surreal_namespace`	`sourcebridge`	SurrealDB namespace
`SOURCEBRIDGE_STORAGE_SURREAL_DATABASE`	`storage.surreal_database`	`sourcebridge`	SurrealDB database name
`SOURCEBRIDGE_STORAGE_SURREAL_USER`	`storage.surreal_user`	`root`	SurrealDB username
`SOURCEBRIDGE_STORAGE_SURREAL_PASS`	`storage.surreal_pass`	`root`	SurrealDB password — change in production
`SOURCEBRIDGE_STORAGE_SURREAL_DATA_PATH`	`storage.surreal_data_path`	`./surrealdb-data`	Data directory for embedded mode
`SOURCEBRIDGE_STORAGE_REDIS_MODE`	`storage.redis_mode`	`memory`	`memory` or `external`
`SOURCEBRIDGE_STORAGE_REDIS_URL`	`storage.redis_url`	—	Redis URL (external mode)
`SOURCEBRIDGE_STORAGE_REPO_CACHE_PATH`	`storage.repo_cache_path`	`./repo-cache`	Local clone cache for indexed repos

Security

Variable	Config key	Default	Description
`SOURCEBRIDGE_SECURITY_JWT_SECRET`	`security.jwt_secret`	`dev-secret-change-in-production`	JWT signing secret — required in production
`SOURCEBRIDGE_SECURITY_JWT_TTL_MINUTES`	`security.jwt_ttl_minutes`	`1440`	JWT expiry (24 hours)
`SOURCEBRIDGE_SECURITY_ENCRYPTION_KEY`	`security.encryption_key`	—	AES-256 key for field-level encryption (living wiki secrets)
`SOURCEBRIDGE_SECURITY_GRPC_AUTH_SECRET`	`security.grpc_auth_secret`	—	Shared secret for API↔worker gRPC auth — required in production
`SOURCEBRIDGE_SECURITY_MODE`	`security.mode`	`oss`	`oss` or `enterprise`
`SOURCEBRIDGE_SECURITY_CSRF_ENABLED`	`security.csrf_enabled`	`true`	CSRF protection
`SOURCEBRIDGE_SECURITY_GITHUB_WEBHOOK_SECRET`	`security.github_webhook_secret`	—	HMAC secret for GitHub webhook validation
`SOURCEBRIDGE_SECURITY_GITLAB_WEBHOOK_SECRET`	`security.gitlab_webhook_secret`	—	HMAC secret for GitLab webhook validation
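
Two rows above are marked as required in production. A deployment script could check them before start-up along the following lines. This is a hypothetical pre-flight helper, not part of SourceBridge, and it inspects environment variables only; values set in the config file are outside its view.

```python
DEV_JWT_SECRET = "dev-secret-change-in-production"  # documented default


def production_issues(env: dict) -> list[str]:
    """Return human-readable problems with the security settings in `env`."""
    issues = []

    jwt_secret = env.get("SOURCEBRIDGE_SECURITY_JWT_SECRET", DEV_JWT_SECRET)
    if jwt_secret == DEV_JWT_SECRET:
        issues.append(
            "SOURCEBRIDGE_SECURITY_JWT_SECRET is unset or still the development default"
        )

    if not env.get("SOURCEBRIDGE_SECURITY_GRPC_AUTH_SECRET"):
        issues.append("SOURCEBRIDGE_SECURITY_GRPC_AUTH_SECRET is not set")

    return issues
```

Calling `production_issues(dict(os.environ))` (after `import os`) yields an empty list when both settings are in place.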

OIDC SSO

Variable	Config key	Default	Description
`SOURCEBRIDGE_SECURITY_OIDC_ISSUER_URL`	`security.oidc.issuer_url`	—	OIDC provider issuer URL
`SOURCEBRIDGE_SECURITY_OIDC_CLIENT_ID`	`security.oidc.client_id`	—	OAuth client ID
`SOURCEBRIDGE_SECURITY_OIDC_CLIENT_SECRET`	`security.oidc.client_secret`	—	OAuth client secret
`SOURCEBRIDGE_SECURITY_OIDC_REDIRECT_URL`	`security.oidc.redirect_url`	—	OAuth redirect/callback URL
`SOURCEBRIDGE_SECURITY_OIDC_SCOPES`	`security.oidc.scopes`	—	Comma-separated OIDC scopes

LLM provider

Variable	Config key	Default	Description
`SOURCEBRIDGE_LLM_PROVIDER`	`llm.provider`	`anthropic`	LLM provider (see table below)
`SOURCEBRIDGE_LLM_BASE_URL`	`llm.base_url`	—	API endpoint (required for local providers)
`SOURCEBRIDGE_LLM_API_KEY`	`llm.api_key`	—	API key (cloud providers)
`SOURCEBRIDGE_LLM_SUMMARY_MODEL`	`llm.summary_model`	`claude-sonnet-4-20250514`	Default model for analysis
`SOURCEBRIDGE_LLM_REVIEW_MODEL`	`llm.review_model`	`claude-sonnet-4-20250514`	Review operations
`SOURCEBRIDGE_LLM_ASK_MODEL`	`llm.ask_model`	`claude-sonnet-4-20250514`	Discussion/QA operations
`SOURCEBRIDGE_LLM_KNOWLEDGE_MODEL`	`llm.knowledge_model`	—	Knowledge generation (cliff notes, etc.)
`SOURCEBRIDGE_LLM_ARCHITECTURE_DIAGRAM_MODEL`	`llm.architecture_diagram_model`	—	Architecture diagram generation
`SOURCEBRIDGE_LLM_REPORT_MODEL`	`llm.report_model`	—	Report generation (enterprise)
`SOURCEBRIDGE_LLM_TIMEOUT_SECONDS`	`llm.timeout_seconds`	`900`	Per-call LLM timeout (15 minutes)
`SOURCEBRIDGE_LLM_ADVANCED_MODE`	`llm.advanced_mode`	`false`	Enable per-operation model selection

Per-operation model overrides (active when `advanced_mode = true`) are an enterprise capability (`per_op_models`).

Supported LLM providers

Provider	Config value	Notes
Anthropic	`anthropic`	Recommended for output quality. Claude Sonnet 4, Haiku, etc.
OpenAI	`openai`	GPT-4o, GPT-4o-mini, etc.
Google Gemini	`gemini`	Gemini 2.5 Pro, Flash, etc.
OpenRouter	`openrouter`	100+ models behind one API key
Ollama	`ollama`	Local. Set `base_url` to `http://localhost:11434/v1`
vLLM	`vllm`	Local, high-throughput PagedAttention
llama.cpp	`llama-cpp`	Local, CPU/GPU, GGUF models
SGLang	`sglang`	Local, RadixAttention
LM Studio	`lmstudio`	Local, desktop GUI, OpenAI-compatible API

All local providers expose an OpenAI-compatible API. Set `base_url` to the local endpoint.
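
As a rough illustration of the rule that local providers need `base_url`, the sketch below builds the corresponding environment variables and rejects a local provider without an endpoint. The `llm_env` helper is hypothetical; the provider values and variable names come from the tables above.

```python
LOCAL_PROVIDERS = {"ollama", "vllm", "llama-cpp", "sglang", "lmstudio"}
CLOUD_PROVIDERS = {"anthropic", "openai", "gemini", "openrouter"}


def llm_env(provider: str, base_url: str = "", api_key: str = "") -> dict:
    """Build SOURCEBRIDGE_LLM_* variables for one provider."""
    if provider not in LOCAL_PROVIDERS | CLOUD_PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    if provider in LOCAL_PROVIDERS and not base_url:
        raise ValueError(f"{provider} is a local provider and needs a base_url")

    env = {"SOURCEBRIDGE_LLM_PROVIDER": provider}
    if base_url:
        env["SOURCEBRIDGE_LLM_BASE_URL"] = base_url
    if api_key:
        env["SOURCEBRIDGE_LLM_API_KEY"] = api_key
    return env
```

For Ollama, `llm_env("ollama", base_url="http://localhost:11434/v1")` produces the two variables needed for a local setup.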

Model Registry and capability tiers

Admin → Comprehension → Model Registry (/admin/comprehension/models) stores per-model metadata used by the Living Wiki quality validators. This is separate from the active-model selection at Admin → LLM (/admin/llm): the LLM page controls which model runs; the Model Registry controls how strictly the quality gates evaluate its output.

Capability tiers

Each model carries a `qualityGateTier` that Living Wiki uses to pick appropriate gate thresholds:

Tier	Typical models	Pattern-match rule
`frontier`	Claude (all), GPT-4o, GPT-4.1, o1, o3, Gemini Pro/Ultra	Anthropic provider (all); OpenAI `gpt-4*`, `o1`, `o3`; Gemini `pro`/`ultra`
`mid`	gpt-4o-mini, o1-mini, Gemini Flash, open-weights ≥70B	OpenAI `-mini`/`-nano`; Gemini `flash`; size token ≥70B
`local`	Ollama-served models, open-weights <70B (qwen3:32b, llama3:8b, phi4, etc.)	Local inference providers (ollama, vllm, llama-cpp, sglang, lmstudio); size token <70B

The default OSS install (`config.toml.example` ships `qwen3:32b`) resolves to `TierLocal`. Frontier gates (strict citation density, vagueness) are relaxed or demoted to warnings for local-tier runs, so a fresh install does not produce "0 pages generated".

When a model is not in the registry, `ClassifyByPattern` (`internal/llm/modeltier/classify.go`) runs the provider fast-path first, then the size parser, then family-name heuristics. Unknown providers default to `TierLocal`.
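
To make the rules in the tier table concrete, here is a small Python approximation. It is not the Go implementation in `classify.go`: the function name, the size-token regex, and the exact ordering inside each branch are assumptions. In particular, this sketch applies the fast-path only to cloud providers and lets the size token decide for everything else, which is one possible reading of the table; family-name heuristics are omitted.

```python
import re

# Assumed shape of a size token: digits followed by "b", e.g. "32b" or "70b".
SIZE_TOKEN = re.compile(r"(\d+(?:\.\d+)?)b\b")


def classify_by_pattern(provider: str, model: str) -> str:
    """Approximate the tier table: returns 'frontier', 'mid', or 'local'."""
    provider = provider.lower()
    model = model.lower()

    # Provider fast-path (cloud providers only in this sketch).
    if provider == "anthropic":
        return "frontier"
    if provider == "openai":
        # Check the -mini/-nano suffixes first so o1-mini is not read as o1.
        if "-mini" in model or "-nano" in model:
            return "mid"
        if model.startswith(("gpt-4", "o1", "o3")):
            return "frontier"
    if provider == "gemini":
        if "flash" in model:
            return "mid"
        if "pro" in model or "ultra" in model:
            return "frontier"

    # Size parser: 70B and above is mid, anything smaller is local.
    match = SIZE_TOKEN.search(model)
    if match:
        return "mid" if float(match.group(1)) >= 70 else "local"

    # Unknown providers and unparsed models default to local.
    return "local"
```

With these rules, `classify_by_pattern("ollama", "qwen3:32b")` gives `local`, consistent with the default OSS install described above.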

Registering a model

To override the pattern-match result for a specific model:

1. Go to Admin → Comprehension → Model Registry.
2. Create or update an entry using the model string alone as the key (e.g. `qwen3:32b`, `llama3.1:70b`). Model IDs are stored and looked up in lowercase.
3. Set `qualityGateTier` to `frontier`, `mid`, or `local`.

The registry key is the model string, not `provider/model`. If two providers serve the same model name, register them under distinct IDs (e.g. `openrouter/anthropic/claude-3-5-sonnet`).
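
The lookup order described in this section (registry entry first, pattern match otherwise, keys compared in lowercase) might look like this in outline. The `resolve_tier` function and the plain-dict registry are hypothetical stand-ins; the real registry is managed through the admin UI.

```python
from typing import Callable


def resolve_tier(
    registry: dict,
    provider: str,
    model: str,
    fallback: Callable[[str, str], str],
) -> tuple[str, str]:
    """Return (tier, source), where source is 'registry' or 'pattern'."""
    key = model.lower()  # model IDs are stored and looked up in lowercase
    if key in registry:
        return registry[key], "registry"
    return fallback(provider, model), "pattern"
```

A registry entry such as `{"qwen3:32b": "mid"}` would then take precedence over whatever the pattern matcher returns for that model, regardless of the casing used in the request.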

Verifying the resolved tier

After a cold-start run, grep the API logs:

kubectl -n sourcebridge logs -l app=sourcebridge-api --tail=500 \
  | grep "resolved quality-gate tier"

Each line includes `tier`, `source` (`registry` or `pattern`), `provider`, and `model`.

Worker

Variable	Config key	Default	Description
`SOURCEBRIDGE_WORKER_ADDRESS`	`worker.address`	`localhost:50051`	gRPC address of the Python worker

The worker has its own env vars prefixed with `SOURCEBRIDGE_WORKER_`:

Variable	Description
`SOURCEBRIDGE_WORKER_GRPC_PORT`	gRPC listen port (default `50051`)
`SOURCEBRIDGE_WORKER_LLM_PROVIDER`	LLM provider for the worker (can differ from API)
`SOURCEBRIDGE_WORKER_LLM_BASE_URL`	Worker LLM API endpoint
`SOURCEBRIDGE_WORKER_LLM_MODEL`	Worker LLM model name
`SOURCEBRIDGE_WORKER_LLM_API_KEY`	Worker LLM API key
`SOURCEBRIDGE_WORKER_EMBEDDING_PROVIDER`	Embedding provider
`SOURCEBRIDGE_WORKER_EMBEDDING_BASE_URL`	Embedding API endpoint
`SOURCEBRIDGE_WORKER_EMBEDDING_MODEL`	Embedding model (default `nomic-embed-text`)
`SOURCEBRIDGE_WORKER_EMBEDDING_DIMENSION`	Embedding dimension (default `768`)
`SOURCEBRIDGE_WORKER_GRPC_AUTH_SECRET`	Must match `SOURCEBRIDGE_SECURITY_GRPC_AUTH_SECRET`
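
Because the API and the worker read the shared secret from differently named variables, a mismatch is easy to introduce. A deployment check could compare the two, as in this sketch; the `grpc_secrets_match` helper is hypothetical, and it treats an unset API secret as a failure.

```python
import hmac


def grpc_secrets_match(api_env: dict, worker_env: dict) -> bool:
    """True when both sides carry the same non-empty gRPC auth secret."""
    api_secret = api_env.get("SOURCEBRIDGE_SECURITY_GRPC_AUTH_SECRET", "")
    worker_secret = worker_env.get("SOURCEBRIDGE_WORKER_GRPC_AUTH_SECRET", "")
    # compare_digest avoids leaking the mismatch position through timing.
    return bool(api_secret) and hmac.compare_digest(
        api_secret.encode(), worker_secret.encode()
    )
```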

Indexing

Variable	Config key	Default	Description
`SOURCEBRIDGE_INDEXING_MAX_FILE_SIZE_BYTES`	`indexing.max_file_size_bytes`	`1048576` (1 MB)	Skip files larger than this
`SOURCEBRIDGE_INDEXING_MAX_CONCURRENCY`	`indexing.max_concurrency`	`8`	Parallel file parsing goroutines
`SOURCEBRIDGE_INDEXING_SCIP_ENABLED`	`indexing.scip_enabled`	`true`	SCIP-based precise indexing

Default ignore globs: `node_modules/**`, `dist/**`, `.git/**`, `vendor/**`, `__pycache__/**`.
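
The combined effect of the size limit and the ignore globs can be approximated as follows. This sketch uses Python's `fnmatch`, which is not necessarily the glob dialect the indexer uses, and it matches patterns against the repository-relative path from its start; whether the real indexer also ignores nested directories such as `src/node_modules/` is not stated here. The `should_index` name is hypothetical.

```python
from fnmatch import fnmatchcase

DEFAULT_IGNORE_GLOBS = [
    "node_modules/**",
    "dist/**",
    ".git/**",
    "vendor/**",
    "__pycache__/**",
]
MAX_FILE_SIZE_BYTES = 1_048_576  # documented default (1 MB)


def should_index(
    rel_path: str,
    size_bytes: int,
    ignore_globs: list[str] = DEFAULT_IGNORE_GLOBS,
    max_size: int = MAX_FILE_SIZE_BYTES,
) -> bool:
    """Skip files larger than the limit or matching an ignore glob."""
    if size_bytes > max_size:
        return False
    return not any(fnmatchcase(rel_path, glob) for glob in ignore_globs)
```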

MCP

Variable	Config key	Default	Description
`SOURCEBRIDGE_MCP_ENABLED`	`mcp.enabled`	`false`	Enable the MCP server
`SOURCEBRIDGE_MCP_REPOS`	`mcp.repos`	—	Comma-separated repo IDs to expose (empty = all)
`SOURCEBRIDGE_MCP_SESSION_TTL`	`mcp.session_ttl`	`3600`	Idle session reap time in seconds
`SOURCEBRIDGE_MCP_KEEPALIVE`	`mcp.keepalive`	`30`	SSE keepalive ping interval in seconds
`SOURCEBRIDGE_MCP_MAX_SESSIONS`	`mcp.max_sessions`	`100`	Max concurrent sessions (0 = unlimited)

QA (agentic retrieval)

Variable	Config key	Default	Description
`SOURCEBRIDGE_QA_SERVER_SIDE_ENABLED`	`qa.server_side_enabled`	`false`	Enable server-side deep-QA orchestrator
`SOURCEBRIDGE_QA_LOCAL_FAST_MODE_SUBPROCESS`	`qa.local_fast_mode_subprocess`	`true`	Keep subprocess QA path for local dev
`SOURCEBRIDGE_QA_QUESTION_MAX_BYTES`	`qa.question_max_bytes`	`4096`	Max question length
`SOURCEBRIDGE_QA_SESSION_TOKENS_PER_HOUR`	`qa.session_tokens_per_hour`	`100000`	Token budget per session per hour (0 = disabled)
`SOURCEBRIDGE_QA_REPO_TOKENS_PER_DAY`	`qa.repo_tokens_per_day`	`1000000`	Token budget per repo per day
`SOURCEBRIDGE_QA_DEPLOYMENT_TOKENS_PER_DAY`	`qa.deployment_tokens_per_day`	`10000000`	Deployment-level token circuit breaker
`SOURCEBRIDGE_QA_SYNTHESIS_LANE`	`qa.synthesis_lane`	`4`	Concurrent synthesis calls against the worker
`SOURCEBRIDGE_QA_AGENTIC_RETRIEVAL_ENABLED`	`qa.agentic_retrieval_enabled`	`false`	Enable agentic retrieval loop
`SOURCEBRIDGE_QA_AGENTIC_RETRIEVAL_CANARY_PCT`	`qa.agentic_retrieval_canary_pct`	`0`	Staged rollout percentage (0–100)
`SOURCEBRIDGE_QA_PROMPT_CACHING_ENABLED`	`qa.prompt_caching_enabled`	`true`	Anthropic prompt-cache markers
`SOURCEBRIDGE_QA_SMART_CLASSIFIER_ENABLED`	`qa.smart_classifier_enabled`	`false`	LLM-backed question profiler
`SOURCEBRIDGE_QA_QUERY_DECOMPOSITION_ENABLED`	`qa.query_decomposition_enabled`	`false`	Multi-hop query decomposition

Comprehension (knowledge generation)

Variable	Config key	Default	Description
`SOURCEBRIDGE_COMPREHENSION_MAX_CONCURRENCY`	`comprehension.max_concurrency`	`3`	Max parallel LLM jobs

Trash (soft-delete recycle bin)

Variable	Config key	Default	Description
`SOURCEBRIDGE_TRASH_ENABLED`	`trash.enabled`	`true`	Enable soft-delete
`SOURCEBRIDGE_TRASH_RETENTION_DAYS`	`trash.retention_days`	`30`	Days before permanent deletion (1–365)
`SOURCEBRIDGE_TRASH_SWEEP_INTERVAL`	`trash.sweep_interval_sec`	`21600`	Sweep interval in seconds (6 hours)
`SOURCEBRIDGE_TRASH_SWEEP_MAX_BATCH`	`trash.max_batch_size`	`500`	Max items per sweep pass
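
How the retention and batch settings interact can be sketched as below. The function names are hypothetical, and the boundary behaviour (an item becomes due once its age reaches exactly `retention_days`) is an assumption rather than documented behaviour.

```python
from datetime import datetime, timedelta, timezone


def purge_due(deleted_at: datetime, now: datetime, retention_days: int = 30) -> bool:
    """True once an item has sat in the trash for the full retention period."""
    if not 1 <= retention_days <= 365:
        raise ValueError("retention_days must be between 1 and 365")
    return now - deleted_at >= timedelta(days=retention_days)


def sweep_batch(items, now, retention_days: int = 30, max_batch: int = 500) -> list:
    """Pick at most `max_batch` due item IDs from (item_id, deleted_at) pairs."""
    due = [
        item_id
        for item_id, deleted_at in items
        if purge_due(deleted_at, now, retention_days)
    ]
    return due[:max_batch]
```

With the defaults, a sweep runs every 6 hours and removes at most 500 due items per pass, so a large backlog drains over several passes.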

Living wiki

Variable	Config key	Default	Description
`SOURCEBRIDGE_LIVING_WIKI_ENABLED`	`living_wiki.enabled`	`false`	Enable living-wiki feature
`SOURCEBRIDGE_LIVING_WIKI_WORKER_COUNT`	`living_wiki.worker_count`	`4`	Goroutines draining the dispatcher queue
`SOURCEBRIDGE_LIVING_WIKI_EVENT_TIMEOUT`	`living_wiki.event_timeout`	`5m`	Max duration per event handler
`SOURCEBRIDGE_LIVING_WIKI_SCHEDULER_INTERVAL`	`living_wiki.scheduler_interval`	`15m`	Default regen frequency per repo
`SOURCEBRIDGE_LIVING_WIKI_MAX_CONCURRENT_JOBS_PER_TENANT`	`living_wiki.max_concurrent_jobs_per_tenant`	`5`	Per-tenant concurrency cap
`SOURCEBRIDGE_LIVING_WIKI_CONFLUENCE_WEBHOOK_SECRET`	`living_wiki.confluence_webhook_secret`	—	HMAC secret for Confluence webhooks
`SOURCEBRIDGE_LIVING_WIKI_NOTION_WEBHOOK_SECRET`	`living_wiki.notion_webhook_secret`	—	Reserved for Notion webhook validation
`SOURCEBRIDGE_LIVING_WIKI_KILL_SWITCH`	—	`false`	Bypass living-wiki without redeploying

Git

Variable	Config key	Default	Description
`SOURCEBRIDGE_GIT_DEFAULT_TOKEN`	`git.default_token`	—	PAT used when no per-repo token is provided
`SOURCEBRIDGE_GIT_SSH_KEY_PATH`	`git.ssh_key_path`	—	Path to SSH private key for SSH clone URLs

Telemetry

Variable	Description
`SOURCEBRIDGE_TELEMETRY`	Set to `off` to disable telemetry. `DO_NOT_TRACK=1` also works
`SOURCEBRIDGE_TELEMETRY_PLATFORM`	Override the auto-detected platform string (useful for CI: set to `test`)

Knowledge generation env vars (not in config struct)

These are read directly from the environment by the comprehension subsystem:

Variable	Default	Description
`SOURCEBRIDGE_SELECTIVE_INVALIDATION`	`true`	Delta-based artifact invalidation on reindex
`SOURCEBRIDGE_SELECTIVE_INVALIDATION_MAX_CHANGES`	`200`	Fall back to blanket-stale above this threshold
`SOURCEBRIDGE_DELTA_REGEN_MODE`	`off`	`off`, `shadow`, or `live` auto-regen
`SOURCEBRIDGE_DELTA_REGEN_MAX_PER_INDEX`	`5`	Max artifacts auto-regenerated per reindex
`SOURCEBRIDGE_DELTA_REGEN_MAX_PER_REPO_PER_HOUR`	`20`	Rolling-hour cap per repo
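
One way to read these thresholds together is sketched below. The function names are hypothetical, and two points are assumptions: that disabling selective invalidation means every artifact is marked stale, and that the per-index and per-hour caps combine by taking the smaller remaining allowance. What `shadow` mode does with the resulting budget is not described here, so the sketch only distinguishes `off` from the other modes.

```python
def invalidation_plan(changed: int, enabled: bool = True, max_changes: int = 200) -> str:
    """Return 'selective' or 'blanket' for a reindex that touched `changed` items."""
    if not enabled or changed > max_changes:
        return "blanket"  # fall back to marking everything stale
    return "selective"


def regen_budget(
    candidates: int,
    regenerated_last_hour: int,
    mode: str = "off",
    max_per_index: int = 5,
    max_per_repo_per_hour: int = 20,
) -> int:
    """How many artifacts one reindex may auto-regenerate under both caps."""
    if mode == "off":
        return 0
    remaining_this_hour = max(0, max_per_repo_per_hour - regenerated_last_hour)
    return min(candidates, max_per_index, remaining_this_hour)
```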

SourceBridge is open-source, licensed under AGPL-3.0.
