53 changes: 34 additions & 19 deletions language-server/src/core/indexer-worker.ts
@@ -1,5 +1,6 @@
import { parentPort, workerData } from 'node:worker_threads';
import { Logger, TreeSitterParser } from './tree-sitter-parser';
import { pLimit } from './p-limit';

if (!parentPort) {
throw new Error('This script must be run as a worker thread');
@@ -26,31 +27,45 @@ parentPort.on('message', async (message: { filePaths: string[]; chunkSize?: numb

const { filePaths } = message;
const BATCH_SIZE = message.chunkSize ?? 25;
const CONCURRENCY = 15;

for (let i = 0; i < filePaths.length; i += BATCH_SIZE) {
const chunk = filePaths.slice(i, i + BATCH_SIZE);
// ⚑ Bolt: Remove head-of-line blocking by using a concurrency task pool
// instead of chunked Promise.all
const limit = pLimit(CONCURRENCY);
// eslint-disable-next-line @typescript-eslint/no-explicit-any
let batchItems: any[] = [];
let batchProcessedFiles = 0;
let totalCompleted = 0;

// Parallelize file reading and parsing within the chunk
const results = await Promise.all(
chunk.map(async (filePath) => {
await Promise.all(
filePaths.map((filePath) =>
limit(async () => {
try {
return await parser.parseFile(filePath);
const items = await parser.parseFile(filePath);
// ⚑ Bolt: Fast manual loop pushing to avoid Maximum Call Stack Size Exceeded
for (let i = 0; i < items.length; i++) {
batchItems.push(items[i]);
}
} catch {
return [];
// Ignore
} finally {
totalCompleted++;
batchProcessedFiles++;

if (batchProcessedFiles >= BATCH_SIZE || totalCompleted === filePaths.length) {
parentPort?.postMessage({
type: 'result',
items: batchItems,
count: batchProcessedFiles,
isPartial: totalCompleted < filePaths.length,
});
batchItems = [];
batchProcessedFiles = 0;
}
}
}),
);

const chunkItems = results.flat();

// Send back chunk result immediately to keep main thread unblocked but processing
parentPort?.postMessage({
type: 'result',
items: chunkItems,
count: chunk.length,
isPartial: i + BATCH_SIZE < filePaths.length,
});
}
),
);
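The diff imports pLimit from a local './p-limit' module whose source is not shown here. For readers unfamiliar with the pattern, a limiter of this shape can be sketched as follows; this is an illustrative stand-in under that assumption, not the repository's actual implementation:

```typescript
// Minimal pLimit-style limiter: at most `concurrency` wrapped tasks run at
// once; the rest wait in a FIFO queue. Illustrative sketch only; the PR's
// './p-limit' module may differ.
function pLimit(concurrency: number) {
  const queue: Array<() => void> = [];
  let active = 0;

  const next = (): void => {
    active--;
    const run = queue.shift();
    if (run) run();
  };

  return function limit<T>(fn: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const run = (): void => {
        active++;
        fn().then(resolve, reject).finally(next);
      };
      if (active < concurrency) {
        run();
      } else {
        queue.push(run);
      }
    });
  };
}
```

Note that with this shape, every call to limit(...) allocates a closure and a Promise immediately, even though only `concurrency` of them execute at a time; that eager allocation is what the review comment on this hunk flags.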
Comment on lines +40 to +68
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify how many files can be sent to one worker task at a time.
rg -n -C4 '\bassignWorkerTask\s*\(|\bgetWorkerChunkSize\s*\(|filePaths:\s*batchFiles' language-server/src/core/workspace-indexer.ts

Repository: AhmmedSamier/DeepLens

Length of output: 1818


🏁 Script executed:

# Check the actual type declarations in indexer-worker.ts, especially around batchItems
sed -n '1,70p' language-server/src/core/indexer-worker.ts


Length of output: 2624


🏁 Script executed:

# Get the complete getWorkerChunkSize implementation
sed -n '752,770p' language-server/src/core/workspace-indexer.ts

Repository: AhmmedSamier/DeepLens

Length of output: 695


🏁 Script executed:

# Check CONCURRENCY constant definition
rg -n 'CONCURRENCY|const.*Concurrency' language-server/src/core/indexer-worker.ts


Length of output: 146


Replace the eager map/limit pattern with a pull-based runner pool to bound memory allocation.

The code violates two guidelines:

  1. Type safety: Line 39 uses any[] for batchItems. This removes type safety from the worker-to-parent message contract and contradicts the guideline "Ensure no new any types are introduced during development". Use an explicit type (e.g., SymbolInfo[] or appropriate item type from the parser).

  2. Memory efficiency: filePaths.map(limit(...)) creates a closure and promise for every file upfront, even though only CONCURRENCY (15) tasks run concurrently. When a worker receives 200+ files (e.g., getWorkerChunkSize returns 60 for batches β‰₯200), this allocates ~200 closures that remain queued in memory. A pull-based model that spawns only 15 runners pulling files one-by-one keeps queue size O(CONCURRENCY) and reduces GC pressure in this hot path.

The suggested implementation spawns exactly Math.min(CONCURRENCY, filePaths.length) runners, each polling the next file index until exhausted, rather than eagerly mapping all files.
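As a hedged sketch of the suggested pattern (generic names such as runPool and worker are illustrative, not from the PR), a pull-based pool can look like this:

```typescript
// Pull-based runner pool: spawn at most `concurrency` runners, each pulling
// the next index from a shared counter until the list is exhausted. Only
// O(concurrency) closures exist at any time, unlike the eager map/limit form.
// Generic sketch; names are illustrative, not taken from the PR.
async function runPool<T>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T) => Promise<void>,
): Promise<void> {
  let nextIndex = 0;
  const runners = Array.from(
    { length: Math.min(concurrency, items.length) },
    async () => {
      while (true) {
        const idx = nextIndex++; // safe: increment is synchronous per runner
        if (idx >= items.length) break;
        try {
          await worker(items[idx]);
        } catch {
          // Ignore per-item failures, as the worker does for parse errors.
        }
      }
    },
  );
  await Promise.all(runners);
}
```

In the worker, the per-file body (parse, push into batchItems, flush when the batch fills) would move into the worker callback, so queue size stays bounded by the runner count rather than by filePaths.length.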

πŸ€– Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@language-server/src/core/indexer-worker.ts` around lines 40 - 68, replace the eager filePaths.map(limit(...)) approach with a pull-based runner pool to bound memory and fix the any[] type:

  1. Change batchItems from any[] to the parser item type (e.g., SymbolInfo[] or the exact type returned by parser.parseFile).

  2. Remove the map/limit pattern; instead spawn N = Math.min(CONCURRENCY, filePaths.length) async runners that each loop: take idx = nextIndex++, break when idx >= filePaths.length, parse filePaths[idx] and push each parsed item into batchItems, ignore parse errors, and in a finally block increment totalCompleted and batchProcessedFiles; when batchProcessedFiles >= BATCH_SIZE or totalCompleted === filePaths.length, post { type: 'result', items: batchItems, count: batchProcessedFiles, isPartial: totalCompleted < filePaths.length } to the parent, then reset batchItems and batchProcessedFiles.

  3. Await Promise.all(runners) to finish; keep references to parser.parseFile, batchItems, totalCompleted, batchProcessedFiles, CONCURRENCY and BATCH_SIZE to locate and update the code.

} catch (error) {
parentPort?.postMessage({
type: 'error',