feat(backend): write changed-path Bloom filters to commit-graph #1198
Writes `--changed-paths` when building the commit-graph, which adds Bloom filters that let git cheaply skip commits that did not touch a given path. This dramatically accelerates `git log -- <path>` and modestly helps `git blame` on large repos.

Existing repos that already have a Bloom-less commit-graph get a one-time `--split=replace` rewrite on their next fetch, gated on a new `commitGraphChangedPathsBackfilledAt` timestamp stored in repo metadata. Subsequent fetches do a cheap incremental write.

Also moves the `writeCommitGraph` call out of `cloneRepository` and into `RepoIndexManager.indexRepository` so clone and fetch paths handle the commit-graph symmetrically. Drops `--write-commit-graph` from the fetch invocation since that flag does not honor `--changed-paths`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
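The backfill gating described above could look roughly like the sketch below. It is not the PR's actual code: the `RepoMetadata` shape, the helper names `commitGraphWriteArgs` and `markBackfilled`, the ISO-string timestamp format, and the use of `--reachable` and plain `--split` for the incremental case are all assumptions made for illustration. Only `--changed-paths`, `--split=replace`, and the `commitGraphChangedPathsBackfilledAt` field come from the description.

```typescript
// Hypothetical metadata shape; the PR only says the field is an optional timestamp.
interface RepoMetadata {
  commitGraphChangedPathsBackfilledAt?: string;
}

// Build the argument list for `git commit-graph write`.
// Assumption: `--reachable` selects the commits, and a plain `--split`
// is used for the cheap incremental write once the backfill has happened.
function commitGraphWriteArgs(metadata: RepoMetadata): string[] {
  const needsBackfill =
    metadata.commitGraphChangedPathsBackfilledAt === undefined;
  return [
    "commit-graph",
    "write",
    "--reachable",
    "--changed-paths",
    // One-time full rewrite so older, Bloom-less layers are replaced.
    needsBackfill ? "--split=replace" : "--split",
  ];
}

// Record the backfill time once; later calls leave the original timestamp alone.
function markBackfilled(
  metadata: RepoMetadata,
  now: Date = new Date(),
): RepoMetadata {
  return {
    ...metadata,
    commitGraphChangedPathsBackfilledAt:
      metadata.commitGraphChangedPathsBackfilledAt ?? now.toISOString(),
  };
}
```

Keeping the decision in a pure function makes the "first fetch rewrites, later fetches append" behavior easy to unit test without invoking git.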
Walkthrough
This PR refactors commit-graph maintenance from automatic behavior during clone/fetch into explicit indexing operations.
Fixes SOU-1041
Fixes SOU-1040
Summary
- Adds `--changed-paths` to commit-graph writes so git can use Bloom filters to skip commits that didn't touch a given path, dramatically speeding up `git log -- <path>` and modestly helping `git blame` on large repos.
- Existing repos with a Bloom-less commit-graph get a one-time `--split=replace` rewrite, gated by a new optional `commitGraphChangedPathsBackfilledAt` timestamp on `repoMetadataSchema`. Subsequent fetches do cheap incremental writes.
- Moves the `writeCommitGraph` call out of `cloneRepository` and into `RepoIndexManager.indexRepository` so clone and fetch paths are symmetric. Drops `--write-commit-graph` from the `git fetch` invocation since that flag doesn't honor `--changed-paths`.
- … `py-0.5` to `py-[3px]` to vertically align it.

Test plan
- `yarn workspace @sourcebot/shared build` succeeds.
- `yarn workspace @sourcebot/backend build` succeeds.
- `yarn workspace @sourcebot/backend test`: all 122 tests pass (mock for `git.js` updated to expose `writeCommitGraph`).
- Verify that `objects/info/commit-graph` (or split chain) is rewritten with Bloom filter chunks (`BIDX`/`BDAT`) via `git commit-graph verify`.
- Verify that the `commitGraphChangedPathsBackfilledAt` timestamp is populated on the repo's `metadata` after a successful index, and that the next fetch skips the forced `--split=replace` rewrite.
- Run `git log -- <hot-path>` on a large repo (e.g. UnrealEngine) before and after backfill to confirm the speedup.

🤖 Generated with Claude Code
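For the before/after comparison in the test plan, a small timing helper along these lines may be handy. It is only a sketch: `timeMs` and `gitLogForPath` are hypothetical names, and `--oneline` is an arbitrary choice to keep the output small. It assumes `git` is on the `PATH`.

```typescript
import { execFileSync } from "node:child_process";

// Measure the wall-clock time of a synchronous action, in milliseconds.
function timeMs(run: () => void): number {
  const start = process.hrtime.bigint();
  run();
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Run a path-limited `git log` in the given repo and discard its output.
function gitLogForPath(repoDir: string, path: string): void {
  execFileSync("git", ["-C", repoDir, "log", "--oneline", "--", path], {
    stdio: "ignore",
  });
}
```

Calling `timeMs(() => gitLogForPath(repoDir, hotPath))` once before the backfill and once after gives a rough comparison. Repeating each measurement a few times reduces noise from filesystem caching.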