Merged
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
@@ -27,7 +27,7 @@ What actually happened. Include the JSON output if relevant.
- **OS:** (e.g., macOS 14, Ubuntu 24.04, Windows 11)
- **Contextception version:** (`contextception --version`)
- **Go version:** (`go version`, if built from source)
- **Language of analyzed files:** (Python, TypeScript, Go, Java, Rust)
- **Language of analyzed files:** (Python, TypeScript, Go, Java, Rust, C#)

## Additional Context

2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -93,7 +93,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- **5-language support:** Python, TypeScript/JavaScript, Go, Java, Rust
- **6-language support:** Python, TypeScript/JavaScript, Go, Java, Rust, C#
- **CLI with 10 commands:** analyze, analyze-change, search, archetypes, history, index, reindex, extensions, status, mcp
- **MCP server** with 8 tools for integration with Claude Code, Cursor, Windsurf, and other AI tools
- **Schema 3.2 output** with confidence scoring, role classification, code signatures, and direction field
4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

**Contextception** is a code context intelligence engine written in **Go**. It answers: *"What code must be understood before making a safe change?"* It is not a code generator, AI assistant, or IDE — it determines what matters, not what to do.

Supports 5 languages: Python, TypeScript/JavaScript, Go, Java, and Rust. Available as a CLI (16 commands) and MCP server (9 tools).
Supports 6 languages: Python, TypeScript/JavaScript, Go, Java, Rust, and C#. Available as a CLI (16 commands) and MCP server (9 tools).

## Tech Stack

@@ -65,7 +65,7 @@ internal/
cli/ Command handlers (cobra subcommands)
config/ Configuration parsing (per-repo + global config)
db/ SQLite layer (migrations, store, search)
extractor/ Language extractors (python, typescript, golang, java, rust)
extractor/ Language extractors (python, typescript, golang, java, rust, csharp)
git/ Git history signal extraction
grader/ Internal quality evaluation framework
history/ Historical analysis, usage tracking, and feedback storage
13 changes: 8 additions & 5 deletions README.md
@@ -22,7 +22,7 @@ Contextception builds a dependency graph of your repository and returns ranked,
<strong>97% Precision (independent ground truth)</strong> &nbsp;&middot;&nbsp;
<strong>Tested across 16 repos</strong> &nbsp;&middot;&nbsp;
<strong>Sub-second Analysis</strong> &nbsp;&middot;&nbsp;
<strong>5 Languages</strong> &nbsp;&middot;&nbsp;
<strong>6 Languages</strong> &nbsp;&middot;&nbsp;
<strong>Free & Open Source</strong>
</p>

@@ -71,7 +71,7 @@ $ contextception index
Indexed 2,638 files, 10,087 edges in 0.9s
```

Scans your codebase, extracts imports across 5 languages, resolves dependencies, computes git history signals. **Incremental:** only changed files reprocessed.
Scans your codebase, extracts imports across 6 languages, resolves dependencies, computes git history signals. **Incremental:** only changed files reprocessed.

### 2. Analyze any file

@@ -347,6 +347,7 @@ These work by combining contextception's deterministic risk analysis with the LL
| **Go** | Regex | go.mod + go.work, same-package resolution | `.go` |
| **Java** | Regex | Package-to-directory, mirror-directory test discovery | `.java` |
| **Rust** | Regex | Cargo workspaces, mod.rs, crate/super/self paths, inline test detection | `.rs` |
| **C#** | Regex | .csproj project detection, namespace-to-file resolution, filename search fallback | `.cs` |

---

@@ -457,7 +458,7 @@ generated:

## Tested Across 16 Repositories

Indexed and analyzed real-world codebases spanning all 5 supported languages:
Indexed and analyzed real-world codebases spanning all 6 supported languages:

| Repository | Language | Files |
|-----------|----------|-------|
@@ -475,10 +476,12 @@ Indexed and analyzed real-world codebases spanning all 5 supported languages:
| Kafka | Java | 3,200+ |
| Tokio | Rust | 1,021 |
| Bevy | Rust | 2,400+ |
| EF Core | C# | 5,708 |
| Jellyfin | C# | 1,971 |
| Medusa | TypeScript | 1,800+ |
| supermemory | TypeScript | 200+ |

**Tested across 419 files spanning all 5 supported languages.**
**Tested across 419+ files spanning all 6 supported languages.**

---

@@ -488,7 +491,7 @@ Contextception is a standalone static analysis tool, not an AI coding assistant.

| Capability | Contextception | Aider repo-map | Repomix |
|------------|:-:|:-:|:-:|
| Static dependency graph | Full (5 languages) | Partial (tree-sitter) | No |
| Static dependency graph | Full (6 languages) | Partial (tree-sitter) | No |
| Per-file relevance ranking | Yes | PageRank-based | No (full dump) |
| Explainability (direction, symbols, role) | Yes | No | No |
| Blast radius / risk scoring | Yes | No | No |
55 changes: 42 additions & 13 deletions benchmarks/README.md
@@ -1,21 +1,21 @@
# How Contextception Compares

A context quality comparison between Contextception, [Aider's repo-map](https://aider.chat/docs/repomap.html), and [Repomix](https://github.com/yamadashy/repomix), tested across 6 repos, 51 files, and 4 languages.
A context quality comparison between Contextception, [Aider's repo-map](https://aider.chat/docs/repomap.html), and [Repomix](https://github.com/yamadashy/repomix), tested across 7 repos, 68 files, and 5 languages.

## TL;DR

- On httpx (independent fixture ground truth): **97% recall vs. Aider@4K's 83%** (Aider@8K: 90%), at 5x fewer tokens
- Aider's recall drops from **97% → 0%** as repos grow from 60 → 7,978 files
- Contextception averages **1,091 tokens** per analysis; Aider@4K averages **3,600 tokens** with lower recall
- Contextception averages **1,174 tokens** per analysis; Aider@4K averages **3,600 tokens** with lower recall

## Limitations

Read these first — they're the reason you should (or shouldn't) trust these numbers.

1. **Aider's repo-map serves a different purpose.** It's designed as internal LLM context for Aider's own editing workflow, not as a standalone dependency analysis tool. We're evaluating it outside its intended use case.
2. **Independent ground truth exists only for httpx.** The httpx comparisons use 5 expert-verified [fixture files](data/fixtures/) with hand-curated `must_read` lists. For other repos, ground truth is Contextception's own output, validated at grade A (3.76–3.97) across 23 evaluation rounds.
3. **Aider gets 0% on Go, Java, and Rust** because its tree-sitter parser doesn't resolve module imports for these languages. This is a real limitation of the tool, not a testing artifact — but it means the comparison is lopsided for 3 of 6 repos.
4. **Sample size: 51 files across 6 repos**, selected by archetype diversity (one file per structural role per repo). Not exhaustive.
3. **Aider gets 0–3% on Go, Java, Rust, and C#** because its tree-sitter parser doesn't resolve module imports for these languages. This is a real limitation of the tool, not a testing artifact — but it means the comparison is lopsided for 4 of 7 repos.
4. **Sample size: 68 files across 7 repos**, selected by archetype diversity (one file per structural role per repo). Not exhaustive.

## What We Measured

@@ -44,12 +44,13 @@ The key finding: Aider's recall degrades as repository size increases, while Con
| Tokio | Rust | 763 | 0% | 1% | 812 |
| Terraform | Go | 1,885 | 3% | 8% | 941 |
| Zulip | Python/TS | 2,638 | 21% | 27% | 837 |
| EF Core | C# | 5,708 | 2% | 3% | 1,420 |
| Spring Boot | Java | 7,978 | 0% | 0% | 2,109 |

**Why this happens:**

- **Python (httpx → Zulip):** Aider uses PageRank on a global definition graph. In a small repo, globally important files overlap with local dependencies. In a large repo, globally popular files (models.py, utils.py) crowd out the specific imports that matter for a given file.
- **Go, Java, Rust:** Aider's tree-sitter parser doesn't resolve module specifiers to file paths. `import "internal/tfdiags"` doesn't create an edge to any file — it just notes that symbols are referenced. Contextception resolves these via go.mod, Java package conventions, and Cargo workspaces.
- **Go, Java, Rust, C#:** Aider's tree-sitter parser doesn't resolve module specifiers to file paths. `import "internal/tfdiags"` doesn't create an edge to any file — it just notes that symbols are referenced. Contextception resolves these via go.mod, Java package conventions, Cargo workspaces, and .csproj project detection.
- **TypeScript:** Aider doesn't resolve tsconfig paths, workspace packages, or barrel exports. `@excalidraw/utils` doesn't map to `packages/utils/src/index.ts`.
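The module-specifier resolution described in the bullets above can be illustrated for the Go case. This is a minimal sketch, not Contextception's implementation: the function name `resolve_go_import` and its parameters are hypothetical, and it only assumes that the module path declared in `go.mod` is a prefix of every in-repo import path.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_go_import(import_path: str, module_path: str,
                      module_root: str = ".") -> Optional[str]:
    """Map a Go import path to a repo-relative package directory.

    module_path is the value of the `module` directive in go.mod.
    Returns None when the import lies outside the module
    (standard library or third-party dependency).
    """
    if import_path == module_path:
        # Importing the module's root package.
        return str(PurePosixPath(module_root))
    prefix = module_path + "/"
    if not import_path.startswith(prefix):
        return None
    # Strip the module prefix; what remains is the package directory.
    return str(PurePosixPath(module_root) / import_path[len(prefix):])


# Example values only; the module path is an assumption for illustration.
print(resolve_go_import("github.com/hashicorp/terraform/internal/tfdiags",
                        "github.com/hashicorp/terraform"))
print(resolve_go_import("fmt", "github.com/hashicorp/terraform"))
```

A resolver of this shape is what turns an import statement into a graph edge: once the directory is known, every `.go` file in it belongs to the imported package. A tool that stops at the symbol level never produces that edge.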

### httpx Deep Dive (Fixture Ground Truth)
@@ -67,12 +68,12 @@ Contextception matches or exceeds Aider's best recall while maintaining near-per

### Token Efficiency

| Tool | httpx | Zulip | Excalidraw | Terraform | Tokio | Spring Boot |
|------|------:|------:|-----------:|----------:|------:|------------:|
| **Contextception** | 748 | 837 | 990 | 941 | 812 | 2,109 |
| **Aider@4K** | ~3,300 | ~4,200 | ~3,300 | ~3,700 | ~3,200 | ~4,200 |
| **Aider@8K** | ~6,700 | ~7,200 | ~6,900 | ~6,900 | ~7,000 | ~8,400 |
| **Repomix** | 198K | 17.6M | 2.5M | 5.6M | 1.4M | 9.8M |
| Tool | httpx | Zulip | Excalidraw | Terraform | Tokio | EF Core | Spring Boot |
|------|------:|------:|-----------:|----------:|------:|--------:|------------:|
| **Contextception** | 748 | 837 | 990 | 941 | 812 | 1,420 | 2,109 |
| **Aider@4K** | ~3,300 | ~4,200 | ~3,300 | ~3,700 | ~3,200 | ~3,400 | ~4,200 |
| **Aider@8K** | ~6,700 | ~7,200 | ~6,900 | ~6,900 | ~7,000 | ~7,100 | ~8,400 |
| **Repomix** | 198K | 17.6M | 2.5M | 5.6M | 1.4M | 23.2M | 9.8M |

*Values are average tokens per file analysis.*
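The headline average in the TL;DR can be roughly reproduced from the figures in this change. The sketch below assumes the headline number is a per-file (pooled) average rather than a mean of the per-repo column values; that weighting is an inference, not something the benchmark states. The helper name `pooled_average` is hypothetical.

```python
def pooled_average(groups):
    """Average over all files, given (file_count, avg_tokens) per group."""
    total_files = sum(count for count, _ in groups)
    total_tokens = sum(count * avg for count, avg in groups)
    return total_tokens / total_files


previous = (51, 1091)  # 51 files averaging 1,091 tokens before this change
ef_core = (17, 1420)   # 17 EF Core files averaging 1,420 tokens

combined = pooled_average([previous, ef_core])
print(round(combined, 2))  # 1173.25
```

The result is within one token of the reported 1,174. The small gap is consistent with the published per-repo averages being rounded before they appear in the table.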

@@ -151,6 +152,33 @@ Rust's module system (crate paths, `mod` declarations, `use` re-exports) is invi

</details>

<details>
<summary>EF Core (C#, 5,708 files) — Aider avg recall: 3% @8K</summary>

| File | Archetype | CC must_read | Aider@4K Recall | Aider@8K Recall |
|------|-----------|-------------:|----------------:|----------------:|
| `SqlServerServiceCollectionExtensions.cs` | Service | 10 | 0% | 0% |
| `CountryRegion.cs` | Model | 10 | 0% | 10% |
| `IMemberTranslatorPlugin.cs` | Plugin | 10 | 0% | 0% |
| `IJsonValueReaderWriterSource.cs` | Utility | 10 | 0% | 0% |
| `ViewColumnBuilder.cs` | Endpoint | 10 | 0% | 0% |
| `SessionTokenStorageFactory.cs` | Auth | 10 | 10% | 10% |
| `RelationalConverterMappingHints.cs` | Leaf | 0 | 0% | 0% |
| `ConfigurationSourceExtensions.cs` | Config | 10 | 0% | 0% |
| `DbSetOperationTests.cs` | Test | 1 | 0% | 0% |
| `MigrationsOperations.cs` | Migration | 10 | 10% | 10% |
| `AdHocMapper.cs` | Serialization | 10 | 10% | 10% |
| `DefaultValueBinding.cs` | Error | 10 | 10% | 10% |
| `CosmosClientWrapper.cs` | CLI | 10 | 0% | 0% |
| `CommandErrorEventData.cs` | Event | 10 | 0% | 0% |
| `CSharpDbContextGenerator.Interfaces.cs` | Interface | 10 | 0% | 0% |
| `ITableBasedExpression.cs` | Orchestrator | 10 | 0% | 0% |
| `ComplexTypesTrackingSqlServerTest.cs` | Hotspot | 8 | 0% | 0% |

C# uses namespace-level `using` directives (`using Microsoft.EntityFrameworkCore;`), which Aider's tree-sitter parser cannot resolve to specific files. At 5,708 files, Aider outputs 130–160 files per query but finds only 2–3% of actual dependencies. Contextception resolves namespaces via `.csproj` project detection, namespace-to-directory mapping, and same-namespace sibling discovery.
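As an illustration of the regex-plus-namespace approach described above, here is a minimal sketch. It is not Contextception's extractor: the regex, the function names, and the idea of passing in a root namespace and project directory (as might be read from a `.csproj`) are all assumptions made for the example.

```python
import re
from typing import List, Optional

# Matches `using X.Y;`, `global using X.Y;`, `using static X.Y;` and
# `using Alias = X.Y;` at the start of a line. `using var x = ...;`
# statements inside method bodies do not match.
USING_RE = re.compile(
    r"^[ \t]*(?:global\s+)?using\s+(?:static\s+)?(?:\w+\s*=\s*)?"
    r"([A-Za-z_][\w.]*)\s*;",
    re.MULTILINE,
)


def extract_usings(source: str) -> List[str]:
    """Return the dotted names referenced by using directives."""
    return USING_RE.findall(source)


def namespace_to_dir(namespace: str, root_namespace: str,
                     project_dir: str) -> Optional[str]:
    """Map a namespace to a candidate directory inside one project.

    Assumes the common convention that sub-namespaces mirror
    sub-directories of the project. Returns None for namespaces
    that do not belong to this project.
    """
    if namespace == root_namespace:
        return project_dir
    prefix = root_namespace + "."
    if not namespace.startswith(prefix):
        return None
    return project_dir + "/" + namespace[len(prefix):].replace(".", "/")


sample = """\
using System;
using Microsoft.EntityFrameworkCore.Storage;
using static System.Math;
using Json = System.Text.Json;

namespace Demo
{
    class C
    {
        void M()
        {
            using var s = Open();
        }
    }
}
"""

for name in extract_usings(sample):
    # Hypothetical project layout, used only for this example.
    print(name, "->", namespace_to_dir(name, "Microsoft.EntityFrameworkCore", "src/EFCore"))
```

Because a namespace maps to a directory rather than a single file, a resolver along these lines still needs a second step (for example, matching referenced type names against files in that directory) to narrow the candidates. The filename search fallback mentioned in the README table plausibly covers projects that do not follow the directory convention.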

</details>

<details>
<summary>Spring Boot (Java, 7,978 files) — Aider avg recall: 0% @8K</summary>

@@ -186,7 +214,7 @@ These fixtures were written by inspecting httpx source code directly — not der

### Tier 2: Validated CC Output (Other Repos)

For the remaining 5 repos, ground truth is Contextception's own `must_read` output, independently validated at grade A across 23 evaluation rounds and 16 repos. Validation grades:
For the remaining 6 repos, ground truth is Contextception's own `must_read` output, independently validated at grade A across evaluation rounds. Validation grades:

| Repo | Grade | Evaluation Rounds |
|------|-------|-------------------|
@@ -195,6 +223,7 @@ For the remaining 5 repos, ground truth is Contextception's own `must_read` outp
| Terraform | A (3.78) | Rounds 13–18 |
| Tokio | A (3.79) | Rounds 11, 20–23 |
| Spring Boot | A (3.76) | Rounds 13, 19 |
| EF Core | A (3.85) | C# Rounds 5–6 |

This creates a circular validation concern: Contextception gets 100% recall against its own output by definition. The comparison is still meaningful because Aider's recall is measured against the same ground truth — but "CC recall = 100%" should be read as "CC's output was validated as grade A" rather than "CC found everything."
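The circularity is easy to see when recall is written out. The sketch below assumes recall is the fraction of ground-truth `must_read` files that a tool returned, and that a file with an empty `must_read` list scores 0% (which matches the Leaf row in the EF Core table). The function name and file names are hypothetical; the per-file percentages are transcribed from the EF Core table.

```python
def recall(must_read, tool_output):
    """Fraction of ground-truth files that the tool returned."""
    expected = set(must_read)
    if not expected:
        return 0.0  # convention assumed here for files with no must_read
    return len(expected & set(tool_output)) / len(expected)


# Hypothetical ground truth: ten files taken from CC's own output.
cc_must_read = [f"src/File{i}.cs" for i in range(10)]

# Scored against its own output, CC reaches 100% by construction.
print(recall(cc_must_read, cc_must_read))  # 1.0

# Another tool that surfaces one of the ten files scores 10%.
print(recall(cc_must_read, ["src/File3.cs", "src/Unrelated.cs"]))  # 0.1

# Per-file Aider recall (%) from the EF Core table, in row order.
ef_core_4k = [0, 0, 0, 0, 0, 10, 0, 0, 0, 10, 10, 10, 0, 0, 0, 0, 0]
ef_core_8k = [0, 10, 0, 0, 0, 10, 0, 0, 0, 10, 10, 10, 0, 0, 0, 0, 0]
print(round(sum(ef_core_4k) / len(ef_core_4k)))  # 2
print(round(sum(ef_core_8k) / len(ef_core_8k)))  # 3
```

The last two lines reproduce the 2% and 3% averages reported for EF Core, which suggests those averages are simple means over the 17 sampled files.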

@@ -232,6 +261,6 @@ See [methodology.md](methodology.md) for:

## Raw Data

- [`data/results.json`](data/results.json) — Complete results for all 6 repos, 51 files
- [`data/results.json`](data/results.json) — Complete results for all 7 repos, 68 files
- [`data/fixtures/`](data/fixtures/) — httpx fixture ground truth files
- [`scripts/compare/`](../scripts/compare/) — Comparison scripts