fix(gbrain-sync): --full produces an empty code index on first run of a new repo #1584
Open
jetsetterfl wants to merge 1 commit into
… on a fresh source
runCodeImport selected the code-stage command by mode: --full ran only
`gbrain reindex-code`, incremental ran `gbrain sync --strategy code`.
reindex-code re-embeds pages that already exist and never walks the
filesystem, so the first --full on a freshly-registered source found no
pages ("No code pages to reindex"), finished in ~1s, and left the code
index permanently empty while the stage still reported OK. Semantic code
search then silently returned nothing for that repo.
Always run the page-creating `sync --strategy code` walk first, then run
reindex-code when mode is --full. This honors the documented "full walk +
reindex" contract for both freshly-registered and already-populated
sources. Verified end-to-end: a fresh source that previously stayed at 0
pages now fully populates under --full.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
|
The behavior change looks right, but this needs a regression test before it is safe to keep. The previous bug was just the |
Problem
The first `/sync-gbrain --full` on a new repo produces a code index with
0 pages while the stage reports `OK`. Semantic code search
(`gbrain code-def` / `code-refs` / `search`) then silently returns nothing
for that repo.
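`gstack-gbrain-sync.ts:runCodeImport` selects the code-stage command by
mode: `--full` runs only `gbrain reindex-code`, while incremental runs
`gbrain sync --strategy code`.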
`gbrain reindex-code` only re-embeds pages that already exist; it never
walks the filesystem. On a source registered moments earlier (0 pages),
the `--full` branch runs `reindex-code`, gbrain prints "No code pages to
reindex", finishes in about a second, and the index stays empty. The
page-creating walk (`sync --strategy code`) only runs on the incremental
path.
This also contradicts the skill's own documented contract: the `--full`
help text says "First-run; full walk + reindex", but the code does
reindex only, no walk.
Fix
`--full` runs the file-walk sync first (creating/refreshing pages), then
`reindex-code` for the full re-embed. Incremental keeps the walk only.
This matches the documented "full walk + reindex" contract and is correct
for both freshly-registered and already-populated sources.
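The ordering described above can be sketched as a small pure function. This is a minimal illustration, not the actual `runCodeImport` body: the `Mode` type and the `codeStageCommands` helper are hypothetical names, and only the two gbrain command lines are taken from this PR.

```typescript
// Hypothetical sketch of the post-fix command selection.
// `Mode` and `codeStageCommands` are illustrative names, not real gstack code.
type Mode = "full" | "incremental";

// Returns the gbrain argument lists to run, in order.
function codeStageCommands(mode: Mode): string[][] {
  // The page-creating filesystem walk always runs first.
  const commands: string[][] = [["sync", "--strategy", "code"]];
  if (mode === "full") {
    // --full additionally re-embeds the pages the walk created or refreshed.
    commands.push(["reindex-code"]);
  }
  return commands;
}

// Print the commands a --full run would issue, one per line.
for (const args of codeStageCommands("full")) {
  console.log(["gbrain", ...args].join(" "));
}
```

Keeping the selection in a pure function like this would also make the walk-then-reindex order easy to assert on without invoking gbrain.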
Verification
Reproduced and fixed end-to-end: with a freshly-registered source,
`--full` on the unpatched code finished in about a second with 0 pages;
with the fix it runs the real walk and the source fully populates
(hundreds of pages, multi-minute walk as expected).
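A regression test for this could take roughly the following shape. The sketch models gbrain with an in-memory stand-in, so `FakeSource`, `walk`, `reindex`, `runOldFull`, and `runFixedFull` are all hypothetical; the only behavior it encodes is what this PR describes (reindex touches existing pages only, the walk creates them).

```typescript
// In-memory stand-in for a gbrain source; not the real gbrain API.
class FakeSource {
  readonly pages = new Set<string>();
  private readonly files: string[];

  constructor(files: string[]) {
    this.files = files;
  }

  // Models `sync --strategy code`: walks the files and creates/refreshes pages.
  walk(): void {
    for (const file of this.files) {
      this.pages.add(file);
    }
  }

  // Models `reindex-code`: re-embeds existing pages and never creates any.
  // Returns how many pages were re-embedded.
  reindex(): number {
    return this.pages.size;
  }
}

// Pre-fix --full: reindex only, so a fresh source stays empty.
function runOldFull(source: FakeSource): number {
  source.reindex();
  return source.pages.size;
}

// Post-fix --full: walk first, then reindex.
function runFixedFull(source: FakeSource): number {
  source.walk();
  source.reindex();
  return source.pages.size;
}

const files = ["src/a.ts", "src/b.ts", "src/c.ts"];
console.log(runOldFull(new FakeSource(files)));   // 0
console.log(runFixedFull(new FakeSource(files))); // 3
```

A real test would more likely stub the command runner and assert on the order of invocations, but the fresh-source case is the one that has to be covered either way.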
Notes
… `broken-db` bug found during the same investigation; that one gates
this skill from even reaching the orchestrator inside web-app repos).
… `OK` with `page_count=0`. Worth considering a WARN/fail in the verdict
block when a code sync completes with 0 pages on a non-empty repo, so
this class of silent emptiness can't recur unnoticed. Not included here
to keep the PR focused.
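The suggested guard might look something like the sketch below. The `Verdict` type, `codeStageVerdict`, and its two inputs are hypothetical; how the orchestrator would actually obtain the page count or decide that a repo is non-empty is not specified in this PR.

```typescript
// Hypothetical verdict check; names and inputs are assumptions for illustration.
type Verdict = "OK" | "WARN";

// Downgrades a zero-page result to WARN when the repo contains code files.
function codeStageVerdict(pageCount: number, codeFileCount: number): Verdict {
  if (pageCount === 0 && codeFileCount > 0) {
    return "WARN";
  }
  return "OK";
}

console.log(codeStageVerdict(0, 250));   // WARN
console.log(codeStageVerdict(0, 0));     // OK
console.log(codeStageVerdict(180, 250)); // OK
```

An empty repo still reports `OK` here, so the check only fires on the case described above: files present, no pages produced.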