Core Concepts
SourceBridge is easier to adopt when everyone shares the same vocabulary. This page defines the terms used across the product.
A repository is the top-level indexed unit. SourceBridge clones the repository or reads it from a directory, parses the code with tree-sitter, and stores the resulting symbols and relationships in SurrealDB. Most user-facing features operate on a repository ID.
An index is a snapshot of a repository at a specific commit. Indexing runs asynchronously after a repository is registered. The resulting index feeds the code graph and all generation features.
A symbol is a named code entity extracted from the index: a function, class, method, type, or variable. Every symbol has a stable ID, a file path, a line range, a kind, and optionally a doc-comment summary. Symbols are the nodes of the code graph.
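As a rough illustration, the fields listed above could be modeled as a small record. This is a sketch rather than SourceBridge's actual schema: the names `Symbol` and `SymbolKind`, the field names, and the assumption that line ranges are one-based and inclusive are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SymbolKind(Enum):
    # Kinds named in the definition above; the enum itself is hypothetical.
    FUNCTION = "function"
    CLASS = "class"
    METHOD = "method"
    TYPE = "type"
    VARIABLE = "variable"


@dataclass(frozen=True)
class Symbol:
    id: str                            # stable ID
    path: str                          # file path
    start_line: int                    # first line of the range (assumed one-based)
    end_line: int                      # last line of the range (assumed inclusive)
    kind: SymbolKind
    doc_summary: Optional[str] = None  # optional doc-comment summary

    def __post_init__(self) -> None:
        # Reject ranges that cannot describe a real span of lines.
        if self.start_line < 1 or self.end_line < self.start_line:
            raise ValueError("invalid line range")
```

Making the record frozen reflects the idea that a symbol belongs to one index snapshot and is not edited in place.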
The code graph is the structured representation SourceBridge builds from the index:
- File nodes — source files with language and import edges
- Symbol nodes — functions, classes, methods, types
- Call edges — caller → callee relationships
- Import edges — file → imported-file relationships
- Test edges — persisted test_for edges from the indexer's resolve pass
- Requirement links — symbol → requirement associations
Many user-facing features are projections of this graph.
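To make the node and edge vocabulary concrete, here is a toy in-memory graph with typed edges. It is only a sketch under stated assumptions: SourceBridge persists its graph in SurrealDB, and the class `CodeGraph`, its method names, the edge-kind strings, and the sample IDs and paths in the usage below are all invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class CodeGraph:
    """Toy in-memory graph; edge kinds mirror the list above."""

    EDGE_KINDS = {"call", "import", "test_for", "requirement_link"}

    def __init__(self) -> None:
        # (edge kind, source node) -> set of target nodes
        self._out: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add_edge(self, kind: str, source: str, target: str) -> None:
        if kind not in self.EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        self._out[(kind, source)].add(target)

    def targets(self, kind: str, source: str) -> List[str]:
        """Nodes reachable from `source` over one edge of `kind`."""
        return sorted(self._out.get((kind, source), set()))

    def sources(self, kind: str, target: str) -> List[str]:
        """Reverse lookup, for example all callers of a callee."""
        return sorted(
            src for (k, src), dsts in self._out.items()
            if k == kind and target in dsts
        )


graph = CodeGraph()
graph.add_edge("call", "sym_main", "sym_parse")
graph.add_edge("call", "sym_cli", "sym_parse")
graph.add_edge("import", "cmd/main.go", "internal/parse.go")
print(graph.sources("call", "sym_parse"))
```

A "who calls this?" query is then a reverse lookup over call edges, which is one sense in which a feature can be a projection of the graph.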
A citation is a grounded reference from a generated artifact back to a specific code location. The canonical format is (path:start-end), where start and end are one-based line numbers. Symbol-level citations use sym_<id>. Every answer, field guide, living-wiki page, and compliance artifact uses the same format, so the VS Code plugin can jump to the exact line.
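A line-range citation in the (path:start-end) form can be parsed with a short regular expression. The sketch below is not SourceBridge's own parser; the names `LineCitation` and `parse_line_citation` are hypothetical, and the file path in the usage line is made up.

```python
import re
from typing import NamedTuple, Optional


class LineCitation(NamedTuple):
    path: str
    start: int  # one-based, inclusive
    end: int    # one-based, inclusive


# The greedy path group lets the path itself contain ':' characters;
# the final ':<start>-<end>' is what anchors the match.
_LINE_CITATION = re.compile(r"^\((?P<path>.+):(?P<start>\d+)-(?P<end>\d+)\)$")


def parse_line_citation(text: str) -> Optional[LineCitation]:
    """Return the parsed citation, or None if `text` is not well formed."""
    match = _LINE_CITATION.match(text.strip())
    if match is None:
        return None
    start, end = int(match["start"]), int(match["end"])
    if start < 1 or end < start:
        return None  # line numbers are one-based and must be ordered
    return LineCitation(match["path"], start, end)


print(parse_line_citation("(internal/api/server.go:10-42)"))
```

An editor integration could use the parsed `path` and `start` to open the file at the cited line.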
A requirement is a tracked product or system need. Requirements can be imported from Markdown or CSV, or created directly in the web UI or VS Code extension. Each requirement has an external ID (auto-generated as REQ-<uuid> if not provided), a title, a description, and optional acceptance criteria. The Python worker links requirements to code symbols, and those links are the basis for the traceability and change-impact features.
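The external-ID fallback described above can be sketched as a default factory. This is an illustration only: the class and field names are hypothetical, and the choice of a random (version 4) UUID is an assumption, since the text only specifies the REQ-<uuid> shape.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


def default_external_id() -> str:
    # Assumption: a random UUID; the source only says "REQ-<uuid>".
    return f"REQ-{uuid.uuid4()}"


@dataclass
class Requirement:
    title: str
    description: str
    # Used as given when the importer supplies one, generated otherwise.
    external_id: str = field(default_factory=default_external_id)
    acceptance_criteria: List[str] = field(default_factory=list)


imported = Requirement("Login", "Users can sign in", external_id="REQ-42")
created = Requirement("Logout", "Users can sign out")
print(imported.external_id, created.external_id.startswith("REQ-"))
```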
A requirement link is an association between a requirement and a code symbol, carrying a confidence score. Links are created by the auto-linker or manually in the UI or extension. They are used to produce traceability matrices and coverage-gap reports.
A tenant is the multi-tenancy primitive. Each tenant has isolated repositories, requirements, jobs, and living-wiki state. tenant_id is a first-class field in the SurrealDB schema.
A job is an async unit of work: knowledge generation, living-wiki cold-start, subsystem clustering, or indexing. Jobs have a status (queued, running, ready, failed), a progress percentage, and a result. The admin activity feed shows all running and recent jobs.
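The four job statuses can be read as a small state machine. The status values come from the text above, but the allowed transitions in this sketch are an assumption about how such jobs typically progress, not documented SourceBridge behavior. The names `JobStatus` and `can_transition` are hypothetical.

```python
from enum import Enum


class JobStatus(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    READY = "ready"
    FAILED = "failed"


# Assumed transitions: ready and failed are treated as terminal states.
_ALLOWED = {
    JobStatus.QUEUED: {JobStatus.RUNNING, JobStatus.FAILED},
    JobStatus.RUNNING: {JobStatus.READY, JobStatus.FAILED},
    JobStatus.READY: set(),
    JobStatus.FAILED: set(),
}


def can_transition(current: JobStatus, nxt: JobStatus) -> bool:
    """True if a job may move from `current` to `nxt` under the assumed rules."""
    return nxt in _ALLOWED[current]


print(can_transition(JobStatus.QUEUED, JobStatus.RUNNING))
```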
A field guide is a generated understanding artifact:
- Cliff notes — summary at repository, module, file, or symbol scope
- Learning path — ordered reading list for onboarding
- Code tour — annotated walk-through of a conceptual flow
- Workflow story — end-to-end narrative for a feature or subsystem
- Architecture diagram — Mermaid-based structural view
Field guides are scoped (repo / file / symbol / requirement), audience-targeted (beginner / developer), and depth-controlled (Fast / Medium / Deep).
A subsystem is a named group of symbols detected by label-propagation clustering over the call graph. Subsystems are computed automatically after each index. The sourcebridge setup claude command uses these clusters to produce per-subsystem sections in .claude/CLAUDE.md. The living wiki uses them as its primary "areas" signal, so generated pages follow architectural boundaries rather than package paths.
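For readers unfamiliar with label propagation, the following is a minimal, deterministic variant over an undirected view of the call graph. It is a teaching sketch and not SourceBridge's clustering code: the function name, the tie-breaking rule (keep the current label when tied, otherwise take the alphabetically smallest), the fixed visiting order, and the iteration cap are all assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple


def label_propagation(
    call_edges: Iterable[Tuple[str, str]], max_iters: int = 20
) -> Dict[str, str]:
    """Map each symbol to a cluster label by repeated neighbor voting."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for caller, callee in call_edges:
        neighbors[caller]  # touch both keys so isolated nodes are registered
        neighbors[callee]
        if caller != callee:  # ignore self-calls when voting
            neighbors[caller].add(callee)
            neighbors[callee].add(caller)

    labels = {node: node for node in neighbors}  # every node starts alone
    for _ in range(max_iters):
        changed = False
        for node in sorted(neighbors):  # fixed order keeps the result stable
            if not neighbors[node]:
                continue
            counts = Counter(labels[n] for n in neighbors[node])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            # Keep the current label on a tie to avoid needless flapping.
            new_label = labels[node] if labels[node] in candidates else candidates[0]
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


edges = [("a", "b"), ("b", "c"), ("a", "c"), ("x", "y"), ("y", "z"), ("x", "z")]
print(label_propagation(edges))
```

Densely connected symbols end up sharing a label, so each distinct label corresponds to one candidate subsystem.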
A sink is a publish target for the living wiki. Each sink has a kind (git_repo, confluence, notion, github_wiki, gitlab_wiki, backstage_techdocs, mkdocs, docusaurus, vitepress) and per-repo configuration. Three kinds are wired up with active implementations: confluence, notion, and git_repo. The others are defined but not yet fully implemented.
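The distinction between defined and implemented sink kinds can be captured in a small lookup. The kind strings below are taken from the text above. The constant names and the function `check_sink_kind` are hypothetical, and this is not how SourceBridge itself validates configuration.

```python
# All sink kinds named in the definition above.
SINK_KINDS = frozenset({
    "git_repo", "confluence", "notion", "github_wiki", "gitlab_wiki",
    "backstage_techdocs", "mkdocs", "docusaurus", "vitepress",
})

# Kinds described as wired with active implementations.
IMPLEMENTED_SINK_KINDS = frozenset({"confluence", "notion", "git_repo"})


def check_sink_kind(kind: str) -> bool:
    """Return True if `kind` is implemented, False if only defined.

    Raises ValueError for a kind that is not defined at all.
    """
    if kind not in SINK_KINDS:
        raise ValueError(f"unknown sink kind: {kind}")
    return kind in IMPLEMENTED_SINK_KINDS


print(check_sink_kind("notion"), check_sink_kind("mkdocs"))
```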
An audience is a living-wiki concept. Each wiki can target one or more audience types:
- engineer — technical depth, implementation details, citation density
- product — outcome-oriented, light on code references
- operator — deployment, configuration, operational runbook focus
The quality validators apply per-template per-audience thresholds.
Living-wiki pages use an internal content model in which a page is a tree of typed blocks (heading, paragraph, code, table, callout). Each block has a stable ID tied to its logical position (not derived from its content) and one of four ownership states: generated, human-edited, human-edited-on-pr-branch, or human-only. Human-edited blocks are left alone during incremental regeneration.
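The ownership rule can be illustrated with a simplified regeneration pass. This sketch flattens the block tree into a list for brevity and assumes that every non-generated ownership state is left untouched, which the text above states explicitly only for human-edited blocks. The names `Ownership`, `Block`, and `regenerate` are hypothetical.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Dict, List


class Ownership(Enum):
    GENERATED = "generated"
    HUMAN_EDITED = "human-edited"
    HUMAN_EDITED_ON_PR_BRANCH = "human-edited-on-pr-branch"
    HUMAN_ONLY = "human-only"


@dataclass(frozen=True)
class Block:
    id: str    # stable ID, independent of the block's content
    kind: str  # heading, paragraph, code, table, or callout
    text: str
    ownership: Ownership = Ownership.GENERATED


def regenerate(blocks: List[Block], fresh_text: Dict[str, str]) -> List[Block]:
    """Apply freshly generated text, but only to generated blocks."""
    result = []
    for block in blocks:
        if block.ownership is Ownership.GENERATED and block.id in fresh_text:
            result.append(replace(block, text=fresh_text[block.id]))
        else:
            result.append(block)  # human-touched blocks pass through as-is
    return result


page = [
    Block("b1", "heading", "Overview"),
    Block("b2", "paragraph", "Hand-written note", Ownership.HUMAN_EDITED),
]
print(regenerate(page, {"b1": "Overview (updated)", "b2": "Machine text"}))
```

Because IDs do not depend on content, a regenerated block keeps its identity even when its text changes completely.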
internal/capabilities/registry.go is the single source of truth for what each edition offers. It drives the MCP tools/list filter, the initialize response's experimental.sourcebridge.features, GraphQL field gating, and REST route gating. Every surface checks the registry by capability name rather than branching on the edition string directly.
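The idea of checking a registry by capability name, rather than branching on the edition string, can be shown in a few lines. The real registry lives in Go at internal/capabilities/registry.go; the Python sketch below is only an analogy, and every capability name, edition name, and tool name in it is invented for illustration.

```python
from typing import Dict, FrozenSet, List

# capability name -> editions that include it (all names are hypothetical)
REGISTRY: Dict[str, FrozenSet[str]] = {
    "search_symbols": frozenset({"community", "enterprise"}),
    "compliance_reports": frozenset({"enterprise"}),
}


def has_capability(edition: str, capability: str) -> bool:
    """Single lookup that every surface would call instead of comparing editions."""
    return edition in REGISTRY.get(capability, frozenset())


def visible_tools(edition: str, tool_capabilities: Dict[str, str]) -> List[str]:
    """Filter a tool list (tool name -> required capability) for one edition."""
    return sorted(
        tool for tool, capability in tool_capabilities.items()
        if has_capability(edition, capability)
    )


tools = {"find_symbol": "search_symbols", "export_audit": "compliance_reports"}
print(visible_tools("community", tools))
```

With this shape, moving a capability between editions is a one-line registry change, and no calling surface needs to be edited.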
Traceability is the relationship between requirements and code. SourceBridge surfaces traceability as:
- requirement → linked symbols (with confidence score)
- symbol → linked requirements
- coverage gaps (requirements with no links above threshold)
- change impact (which requirements are touched by a diff)
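The coverage-gap view in the list above reduces to a filter over links. In this sketch the `Link` tuple, the function name, and the default threshold of 0.5 are assumptions. "Above threshold" is read here as strictly greater than.

```python
from typing import Iterable, List, NamedTuple


class Link(NamedTuple):
    requirement_id: str
    symbol_id: str
    confidence: float


def coverage_gaps(
    requirement_ids: Iterable[str], links: Iterable[Link], threshold: float = 0.5
) -> List[str]:
    """Requirements with no link whose confidence exceeds `threshold`."""
    covered = {l.requirement_id for l in links if l.confidence > threshold}
    return sorted(r for r in set(requirement_ids) if r not in covered)


links = [Link("REQ-1", "sym_a", 0.9), Link("REQ-2", "sym_b", 0.2)]
print(coverage_gaps(["REQ-1", "REQ-2", "REQ-3"], links))
```

Here REQ-2 counts as a gap even though it has a link, because the link's confidence is below the threshold.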
Change impact is a summary of the likely downstream effects of recent code changes: changed files, affected requirements, and stale generated knowledge. It is useful before a risky commit or release.
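One way to derive affected requirements from a diff is to walk from changed files to the symbols they contain, and from those symbols to linked requirements. The sketch below works at file granularity, which is a simplifying assumption (a real implementation might compare changed line ranges against symbol ranges). The function name, the input shapes, and the sample paths and IDs are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def affected_requirements(
    changed_files: Iterable[str],
    symbol_paths: Dict[str, str],          # symbol ID -> file path
    links: Iterable[Tuple[str, str]],      # (requirement ID, symbol ID)
) -> List[str]:
    """Requirements linked to any symbol defined in a changed file."""
    changed = set(changed_files)
    touched = {sid for sid, path in symbol_paths.items() if path in changed}
    return sorted({req for req, sid in links if sid in touched})


symbols = {"sym_a": "auth/login.go", "sym_b": "billing/invoice.go"}
links = [("REQ-1", "sym_a"), ("REQ-2", "sym_b"), ("REQ-3", "sym_a")]
print(affected_requirements(["auth/login.go"], symbols, links))
```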
SourceBridge is open-source, licensed under AGPL-3.0.