What SourceBridge Can Do

jstuart0 edited this page Apr 28, 2026 · 2 revisions

SourceBridge is broad enough that new users often miss half the product on a first pass. This page is the complete capability map.

Repository indexing

Tree-sitter based parsing of source code into a structured code graph.

  • Languages: Go, Python, TypeScript, JavaScript, Java, Rust, Ruby, PHP, C++, C# (10 total)
  • Extracts functions, classes, methods, imports, call relationships, and doc comments
  • Test-file detection per language convention
  • Configurable file-size and glob-based ignore rules
  • Runs in the Go API server (not the worker) — no LLM needed for indexing
  • See Configuration and Models for tuning knobs
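
The ignore rules and test-file detection above can be sketched as follows. This is a minimal illustration, not SourceBridge's implementation (the real indexer runs in the Go API server). The size limit, the glob list, and the per-language test patterns are all assumptions chosen to show the idea; the real settings are described in Configuration and Models.

```python
import fnmatch
from pathlib import PurePosixPath

# Hypothetical settings for illustration only.
MAX_FILE_BYTES = 1_000_000
IGNORE_GLOBS = ["vendor/*", "node_modules/*", "*.min.js"]

# Common test-file naming conventions per extension (illustrative, not exhaustive).
TEST_PATTERNS = {
    ".go": ["*_test.go"],
    ".py": ["test_*.py", "*_test.py"],
    ".ts": ["*.test.ts", "*.spec.ts"],
    ".js": ["*.test.js", "*.spec.js"],
    ".java": ["*Test.java"],
    ".rb": ["*_spec.rb", "*_test.rb"],
}


def should_index(path: str, size_bytes: int) -> bool:
    """Return False for files that are too large or match an ignore glob."""
    if size_bytes > MAX_FILE_BYTES:
        return False
    # fnmatch's "*" also matches "/", so "vendor/*" covers nested paths.
    return not any(fnmatch.fnmatchcase(path, glob) for glob in IGNORE_GLOBS)


def is_test_file(path: str) -> bool:
    """Classify a file as a test by its name, using the table above."""
    p = PurePosixPath(path)
    patterns = TEST_PATTERNS.get(p.suffix, [])
    return any(fnmatch.fnmatchcase(p.name, pat) for pat in patterns)
```

Keeping the conventions in a table keyed by extension makes it easy to add a language without touching the matching logic.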

Code graph

The indexed repository becomes a queryable graph:

  • Symbols — every function, class, method, and type with location, kind, and doc summary
  • Callers / callees — who calls what, queryable N hops out
  • File imports — direct and transitive import relationships
  • Subsystem clusters — label-propagation clustering (Raghavan et al. 2007) groups symbols into architectural subsystems automatically after each index; the web UI Subsystems tab shows clusters with representative symbols and cross-cluster call counts
  • Entry points — HTTP routes, CLI entry points, Grails controller actions, FastAPI routers, and similar constructs, classified in either basic or framework_aware mode
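
For readers unfamiliar with label propagation, here is a minimal sketch of the technique. It assumes the call graph is treated as an undirected adjacency map and uses a seeded random generator so runs are repeatable. The function name, the input shape, and the "keep the current label when it is already a majority label" rule are illustrative choices, not a description of SourceBridge's internals.

```python
import random
from collections import Counter


def label_propagation(adj, seed=0, max_sweeps=100):
    """Cluster an undirected graph given as {node: set_of_neighbors}."""
    rng = random.Random(seed)
    labels = {node: node for node in adj}  # every node starts in its own cluster
    nodes = sorted(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)  # visit nodes in a random order on each sweep
        changed = False
        for node in nodes:
            counts = Counter(labels[n] for n in adj[node])
            if not counts:
                continue  # an isolated node keeps its own label
            top = max(counts.values())
            best = sorted(lbl for lbl, c in counts.items() if c == top)
            if labels[node] in best:
                continue  # already holds one of the most frequent labels
            labels[node] = rng.choice(best)  # break ties at random
            changed = True
        if not changed:
            break  # every node agrees with its neighborhood
    clusters = {}
    for node, label in labels.items():
        clusters.setdefault(label, set()).add(node)
    return list(clusters.values())
```

Each node repeatedly adopts the label most common among its neighbors, so densely connected groups of symbols converge on a shared label without the number of clusters being specified up front.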

Field guides

AI-generated explanations at any scope. Powered by the Python worker.

  • Cliff notes — concise codebase or file/symbol summaries, multi-audience (beginner / developer)
  • Learning paths — ordered reading list for onboarding into a module or whole repo
  • Code tours — annotated walk-through of a conceptual flow through the code
  • Workflow stories — end-to-end narrative for a feature or subsystem
  • Architecture diagrams — Mermaid diagrams generated from the code graph structure

Generation mode is configurable per scope: Fast, Medium, or Deep. The admin monitor shows queue state and cache-reuse stats.

Requirement tracing

Import requirements and connect them to implementation:

  • Import from Markdown or CSV
  • Auto-link symbols to requirements using the Python worker
  • Create, edit, and soft-delete requirements from the web UI or VS Code extension
  • Traceability matrix: requirement → linked symbols → coverage confidence
  • Gap detection: surface requirements with weak or missing code evidence
  • Change impact: which requirements are touched by a given diff

The citation contract, which uses the (path:start-end) format, makes every traceability link point to an exact line range.
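
A client that consumes these citations could parse them along these lines. This is a sketch under assumptions: the regular expression, the Citation type, and the rule that inverted ranges are dropped are illustrative, and the file paths in the usage note are made up.

```python
import re
from typing import NamedTuple


class Citation(NamedTuple):
    path: str
    start: int
    end: int


# Matches "(path:start-end)", for example "(internal/api/auth.go:12-30)".
CITATION_RE = re.compile(r"\(([^\s():]+):(\d+)-(\d+)\)")


def extract_citations(text: str) -> list[Citation]:
    """Return every well-formed (path:start-end) citation found in text."""
    found = []
    for path, start, end in CITATION_RE.findall(text):
        first, last = int(start), int(end)
        if 1 <= first <= last:  # skip zero-based or inverted ranges
            found.append(Citation(path, first, last))
    return found
```

For example, extract_citations("checked in (internal/api/auth.go:12-30)") yields one Citation with path "internal/api/auth.go", start 12, and end 30.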

Agentic QA

Ask questions about the whole codebase and get grounded answers:

  • Server-side deep-QA orchestrator (internal/qa) with a tool-using agentic retrieval loop
  • Answers carry (path:start-end) or sym_<id> citations — the VS Code plugin can jump to the exact line
  • Hybrid search combines full-text, vector, and structural signals with RRF fusion
  • Smart classifier routes questions to the right retrieval strategy
  • Query decomposition for multi-hop architecture questions
  • Benchmarked at 71.67% useful-rate on a 120-question parity suite (Opus-4.7 judge)
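
Reciprocal rank fusion (RRF), mentioned in the hybrid search bullet, merges several ranked lists by summing 1 / (k + rank) for each result. The sketch below shows the general technique. The constant k = 60 is the value commonly used in the literature and is an assumption here, as are the function name and the alphabetical tie-break.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break exact ties by id for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


full_text = ["a", "b", "c"]
vector = ["b", "c", "a"]
structural = ["b", "a", "d"]
fused = rrf_fuse([full_text, vector, structural])
```

Because RRF uses only ranks, the full-text, vector, and structural scores never need to be normalized onto a common scale.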

Living wiki

Auto-generated, citation-grounded wiki maintained in sync with the codebase. See Living Wiki for full details.

  • Opens a PR within 90 seconds of enablement
  • Block-level reconciliation: human edits are preserved across regeneration cycles
  • Sink targets: Confluence (wired), Notion (wired), git_repo PR workflow (wired); GitHub wiki, GitLab wiki, Backstage TechDocs, MkDocs, Docusaurus, VitePress (stubbed/partial)
  • Per-repo configuration: audience (engineer/product/operator), sinks, edit policy
  • Includes auto-extracted Glossary (zero-LLM), Activity Log, and Decision Record templates
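
One way to picture block-level reconciliation is a merge that compares each block against a digest of what the generator last wrote. The sketch below is a guess at the general shape, not the actual algorithm: the Block type, the digest comparison, and the handling of dropped blocks are all assumptions.

```python
import hashlib
from dataclasses import dataclass


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Block:
    block_id: str
    text: str
    generated_digest: str  # digest of the text as the generator last wrote it


def reconcile(existing: list[Block], regenerated: dict[str, str]) -> list[Block]:
    """Merge regenerated text into a page, keeping blocks a human has edited."""
    merged, seen = [], set()
    for block in existing:
        seen.add(block.block_id)
        human_edited = digest(block.text) != block.generated_digest
        if human_edited:
            merged.append(block)  # preserve the human's text untouched
        elif block.block_id in regenerated:
            new_text = regenerated[block.block_id]
            merged.append(Block(block.block_id, new_text, digest(new_text)))
        # Otherwise the generator dropped an unedited block, so it is removed.
    for block_id, text in regenerated.items():
        if block_id not in seen:  # brand-new block from this cycle
            merged.append(Block(block_id, text, digest(text)))
    return merged
```

Working at block granularity means one hand-edited paragraph does not freeze the rest of the page.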

Knowledge-engine reports

The knowledge engine produces the field guides described above, plus further structured report types, from indexed repositories:

  • Cliff notes, learning paths, code tours, workflow stories, architecture diagrams (all scopes)
  • Glossary — deterministic, one entry per exported symbol, updates on reindex
  • Activity log — commit-graph bucketed by author and week, optional LLM weekly digest
  • Decision records — detects decision: / adr: commit prefixes and BREAKING CHANGE: bodies
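
Decision-record detection from commit messages could look roughly like this. The sketch assumes case-sensitive matching, that the prefixes appear at the start of the subject line, and that BREAKING CHANGE: starts a line in the body. The function name, the return shape, and the sample commit messages in the usage note are hypothetical.

```python
import re

PREFIX_RE = re.compile(r"^(decision|adr):\s*(.+)$")
BREAKING_RE = re.compile(r"^BREAKING CHANGE:\s*(.+)$", re.MULTILINE)


def detect_decision(message: str):
    """Return {'kind', 'summary'} for a decision-like commit, else None."""
    subject, _, body = message.partition("\n")
    prefix = PREFIX_RE.match(subject.strip())
    if prefix:
        return {"kind": prefix.group(1), "summary": prefix.group(2).strip()}
    breaking = BREAKING_RE.search(body)
    if breaking:
        return {"kind": "breaking", "summary": breaking.group(1).strip()}
    return None
```

For example, detect_decision("adr: store sessions in Redis") returns a record of kind "adr", while an ordinary "fix: typo" commit returns None.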

Claude Code integration

sourcebridge setup claude --repo-id <id>

Writes three files into the repository:

  • .claude/CLAUDE.md — per-subsystem skill card with graph-derived Watch out: lines (cross-package callers, hot-path symbols)
  • .mcp.json — MCP server registration, idempotent merge preserving foreign keys
  • .claude/sourcebridge.json — repo ID, server URL, index timestamp

Re-run after reindex to refresh. --dry-run shows what would change without writing. --force overwrites user-edited sections.
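
The idempotent merge for .mcp.json can be illustrated as follows. The top-level mcpServers object is the layout Claude Code reads from project-scoped .mcp.json files. The server name "sourcebridge", the entry contents, and the URL in the usage note are assumptions made for the example, not the values the CLI actually writes.

```python
import json


def merge_mcp_config(existing_text: str, server_name: str, entry: dict) -> str:
    """Insert or replace one server entry, leaving all other keys intact."""
    config = json.loads(existing_text) if existing_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[server_name] = entry  # only this one key is ever overwritten
    return json.dumps(config, indent=2) + "\n"


existing = '{"mcpServers": {"other": {"command": "other-server"}}, "custom": true}'
entry = {"type": "http", "url": "http://localhost:8080/mcp"}  # hypothetical entry
once = merge_mcp_config(existing, "sourcebridge", entry)
twice = merge_mcp_config(once, "sourcebridge", entry)
```

Running the merge a second time produces byte-identical output, and the unrelated "other" server and "custom" key survive.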

VS Code extension

  • CodeLens lenses and hover decorations over linked symbols
  • Cmd+I streaming AI chat grounded in the indexed repository
  • Cmd+. code-action lightbulbs: link to requirement, create from symbol, show linked requirements
  • Cmd+K N generates a field guide for the active file
  • Cmd+Shift+; scoped command palette
  • Change Risk sidebar: changed files, affected requirements, stale field guides
  • Full requirement CRUD without leaving the editor
  • See VS Code Extension

MCP server

23 tools across six groups, plus 3 prompts. See AI Clients and MCP for the full list and connection instructions.

  • Symbol search and code graph traversal
  • AI explanation and QA
  • Requirements and impact
  • Indexing lifecycle (register, poll, refresh)
  • Subsystem clustering (3 tools)
  • Compound workflows (review diff, impact summary, onboard contributor)

GraphQL and REST API

Full programmatic access to all platform capabilities. The GraphQL schema lives in internal/api/graphql/schema.graphqls. The introspection endpoint is available at /graphql. See CLI and API.
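
A quick way to explore the schema is a standard GraphQL introspection query sent to /graphql. The sketch below uses only the Python standard library. The base URL, the port, and the bearer-token header are assumptions; check CLI and API for the authentication your deployment actually uses.

```python
import json
import urllib.request
from typing import Optional

INTROSPECTION_QUERY = "{ __schema { queryType { name } types { name kind } } }"


def build_introspection_request(base_url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request carrying a standard introspection query."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # assumed auth scheme
    return urllib.request.Request(
        base_url.rstrip("/") + "/graphql",
        data=json.dumps({"query": INTROSPECTION_QUERY}).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def list_type_names(base_url: str, token: Optional[str] = None) -> list:
    """Send the query and return the names of all types in the schema."""
    request = build_introspection_request(base_url, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [t["name"] for t in payload["data"]["__schema"]["types"]]
```

Calling list_type_names("http://localhost:8080") against a running server (the address is hypothetical) returns the type names declared in schema.graphqls along with the built-in GraphQL types.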

OIDC SSO support

OIDC-based single sign-on is configured through the security.oidc.* config keys, which are set with environment variables: SOURCEBRIDGE_SECURITY_OIDC_ISSUER_URL, SOURCEBRIDGE_SECURITY_OIDC_CLIENT_ID, and SOURCEBRIDGE_SECURITY_OIDC_CLIENT_SECRET.
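
The variable names follow a visible pattern: the SOURCEBRIDGE_ prefix, then the config key in upper case with dots replaced by underscores. A deployment pre-flight check built on that pattern might look like the sketch below. The exact config key names (issuer_url, client_id, client_secret) are inferred from the variable names, and the helper functions are hypothetical.

```python
import os

ENV_PREFIX = "SOURCEBRIDGE_"

# Key names inferred from the documented environment variables (assumption).
REQUIRED_KEYS = [
    "security.oidc.issuer_url",
    "security.oidc.client_id",
    "security.oidc.client_secret",
]


def env_name(config_key: str) -> str:
    """Map a dotted config key to its environment variable name."""
    return ENV_PREFIX + config_key.upper().replace(".", "_")


def load_oidc_settings(environ=os.environ) -> dict:
    """Read the OIDC settings, failing loudly if any variable is unset."""
    values = {key: environ.get(env_name(key), "") for key in REQUIRED_KEYS}
    missing = [env_name(key) for key, value in values.items() if not value]
    if missing:
        raise ValueError("missing OIDC settings: " + ", ".join(missing))
    return values
```

Running a check like this before starting the server surfaces a missing client secret as a clear error rather than a failed login later.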

Multi-tenant architecture

tenant_id is a first-class field at the schema level. Each tenant gets isolated data, independent per-sink rate limits in the living wiki scheduler, and separate concurrency caps.
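
Independent per-sink rate limits can be pictured as one token bucket per (tenant_id, sink) pair. The sketch below illustrates that idea only. The class names, the token-bucket choice, and the rate and capacity values are assumptions, not a description of the living wiki scheduler.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate_per_sec: float
    capacity: float
    tokens: float
    updated_at: float

    def allow(self, now: float) -> bool:
        """Refill based on elapsed time, then try to spend one token."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class SinkLimiter:
    """Keeps a separate bucket for every (tenant_id, sink) pair."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self.rate_per_sec = rate_per_sec
        self.capacity = capacity
        self.buckets = {}

    def allow(self, tenant_id: str, sink: str, now: float) -> bool:
        key = (tenant_id, sink)
        if key not in self.buckets:
            # New buckets start full so the first burst is not delayed.
            self.buckets[key] = TokenBucket(self.rate_per_sec, self.capacity, self.capacity, now)
        return self.buckets[key].allow(now)
```

Because the bucket key includes tenant_id, one tenant exhausting its Confluence budget has no effect on another tenant, or on the same tenant's Notion sink.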
