Summary
Follow up on the indexer performance work by removing duplicate tree-sitter extraction for SCIP-sourced files and pushing the remaining source-index work through the parallel worker path.
Problem
On the lore-self cached benchmark repo, the current full index spends most of its time in:
- scip-indexer
- source-index
The main avoidable cost is that SCIP-covered files are still tree-sitter extracted again in SourceIndexStage just to patch end_line and compute symbol_metrics. That pass was serial on the main thread.
There was also a worker-boundary hazard when trying to parallelize that path: extraction results included live astNode handles, which are not safe to transfer from worker threads.
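A minimal sketch of one way to avoid that hazard, assuming the worker reduces each extraction result to plain data before posting it back. The type and function names here (ExtractedSymbol, TransferableSymbol, toTransferable) and the field names are hypothetical and are not taken from the lore codebase; only astNode and end_line come from the description above.

```typescript
// Hypothetical shape of an extraction result as produced inside a worker.
// The astNode field stands in for a live parser handle that is only valid
// in the thread that created it.
interface ExtractedSymbol {
  name: string;
  startLine: number;
  endLine: number;
  astNode?: unknown;
}

// Plain-data shape that is safe to send across the worker boundary.
interface TransferableSymbol {
  name: string;
  startLine: number;
  endLine: number;
}

// Copy only the plain fields, so the live handle never leaves the worker.
function toTransferable(symbols: ExtractedSymbol[]): TransferableSymbol[] {
  return symbols.map(({ name, startLine, endLine }) => ({
    name,
    startLine,
    endLine,
  }));
}
```

Under this assumption, anything that needs the syntax tree (for example the end_line patch) would have to be computed inside the worker, before the handle is dropped.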
Required changes
- [...] ScipRefStage where possible so correctness is preserved
- [...] lore-self benchmark repo
Benchmark target
Benchmark command:
node dist/cli.js index --root .benchmark/lore-self --db .benchmark/lore-self/.lore.db --history-depth 100
Current measured baseline before this follow-up:
- fresh current build: about 6.45s wall clock
- source-index: about 2534ms
- ScipRefStage: about 478ms
Target outcome:
- materially reduce source-index time on SCIP-heavy repos
- do not regress totalEdges, symbol_refs, or type_refs
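The no-regression target could be checked with a small comparison like the sketch below. It assumes the three counts have already been read from a baseline run and a follow-up run, and it assumes "regress" means a count went down. IndexCounts and findRegressions are hypothetical names, and how the counts are obtained from the index is not shown.

```typescript
// Counts named in the target outcome, gathered from one index run.
interface IndexCounts {
  totalEdges: number;
  symbol_refs: number;
  type_refs: number;
}

// Return one message per count that dropped relative to the baseline.
// An empty array means no regression under the assumption stated above.
function findRegressions(baseline: IndexCounts, current: IndexCounts): string[] {
  const keys: (keyof IndexCounts)[] = ["totalEdges", "symbol_refs", "type_refs"];
  return keys
    .filter((key) => current[key] < baseline[key])
    .map((key) => `${key}: ${baseline[key]} -> ${current[key]}`);
}
```

If the counts are expected to match exactly rather than merely not drop, the comparison would be `!==` instead of `<`.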