chore(typescript): optimize fixImportsForEsm with single in-memory Volume processing#14124
Conversation
…exit, and string builder Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review.
Tip: disable this comment in your organization's Code Review settings.
…s slower at 10K+ files) Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…ation) Process imports in the memfs Volume before writing to disk, eliminating one readFile pass for generated source files. The disk-based pass still runs after persist() to handle core utility files copied by copyCoreUtilities(). - Add fixImportsInVolume() using synchronous Volume APIs - Add fixEsmImports option to TypescriptProject.persist() - SdkGeneratorCli passes fixEsmImports to persist() and keeps disk-based fixImportsForEsm for post-persist core files Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…ocessing - Add getCoreUtilityFilePaths() to CoreUtilitiesManager and SdkGenerator to pre-compute paths of files that copyCoreUtilities will create - Modify fixImportsInVolume() to accept extraExistencePaths so generated file imports to core (e.g. ../../core/index) resolve during in-memory pass - Add fixImportsForCoreFiles() for targeted disk processing of files written after persist (core utilities + public exports) - Update SdkGeneratorCli to pass core paths to persist() and use targeted post-persist processing instead of full project scan Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…hallow src/ fixImportsForCoreFiles now accepts dirs (recursive) and shallowDirs (top-level only). PostProcess calls fixImportsForCoreFiles([src/core/], [src/]) instead of scanning all 5000+ generated files. Targeted pass processes ~12 files in 3.5ms vs 504ms for full disk scan — 46.6% overall speedup at Stripe scale (5K files). Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…tories Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
| volume.mkdirSync("/src/api", { recursive: true }); | ||
| volume.writeFileSync("/src/api/a.ts", 'import { B } from "./b";\n'); | ||
| volume.writeFileSync("/src/api/b.ts", 'import { A } from "./a";\n'); | ||
| volume.writeFileSync("/src/api/a.ts", 'import { B } from "./b";\n'); |
There was a problem hiding this comment.
Duplicate file write detected - line 347 and 350 both write to /src/api/a.ts with identical content. This is likely a copy-paste error.
volume.writeFileSync("/src/api/a.ts", 'import { B } from "./b";\n');
volume.writeFileSync("/src/api/b.ts", 'import { A } from "./a";\n');
// Remove this duplicate line:
// volume.writeFileSync("/src/api/a.ts", 'import { B } from "./b";\n');While this won't cause test failure, it makes the test confusing and the third line serves no purpose.
| volume.mkdirSync("/src/api", { recursive: true }); | |
| volume.writeFileSync("/src/api/a.ts", 'import { B } from "./b";\n'); | |
| volume.writeFileSync("/src/api/b.ts", 'import { A } from "./a";\n'); | |
| volume.writeFileSync("/src/api/a.ts", 'import { B } from "./b";\n'); | |
| volume.mkdirSync("/src/api", { recursive: true }); | |
| volume.writeFileSync("/src/api/a.ts", 'import { B } from "./b";\n'); | |
| volume.writeFileSync("/src/api/b.ts", 'import { A } from "./a";\n'); |
Spotted by Graphite
Is this helpful? React 👍 or 👎 to let us know.
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
… post-persist disk pass - Add copyCoreUtilitiesToVolume() to CoreUtilitiesManager: copies core utility files from disk into the memfs Volume instead of writing directly to disk - Define MemfsVolume interface in commons so sdk-generator doesn't need memfs dep - Update SdkGeneratorCli to call copyCoreUtilitiesToVolume() BEFORE persist(), so core utilities are processed in-memory alongside generated source files - Remove extraExistencePaths param from fixImportsInVolume (no longer needed) - Remove getCoreUtilityFilePaths forwarding from SdkGenerator (no longer called) - Simplify persist() options (remove coreUtilityPaths parameter) - Update tests to reflect new architecture (core utils written to Volume) Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…SrcToVolume via beforeFixImports hook Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
| } else { | ||
| const ext = path.extname(name); | ||
| if (ALL_EXTENSIONS.has(ext)) { | ||
| files.add(path.resolve(fullPath)); |
There was a problem hiding this comment.
Critical bug: Using path.resolve() on virtual volume paths breaks on Windows. The Volume uses virtual paths starting with "/" (e.g., "/src/api/client.ts"), but path.resolve("/src/api/client.ts") on Windows converts this to a Windows absolute path like "C:\src\api\client.ts". When volume.readFileSync(filePath, "utf-8") is called at line 254, it fails because the Volume doesn't have a file at the Windows path.
Fix: Remove path.resolve() since volume paths are already absolute:
files.add(fullPath);This same issue exists throughout collectVolumeFilesSync - all volume paths should remain as-is without OS-specific resolution.
| files.add(path.resolve(fullPath)); | |
| files.add(fullPath); |
Spotted by Graphite
Is this helpful? React 👍 or 👎 to let us know.
There was a problem hiding this comment.
This is a valid theoretical concern for Windows, but the generator always runs in Docker/Linux containers where path.resolve("/src/api/foo.ts") is a no-op (returns the same absolute path).
Removing path.resolve from collectVolumeFilesSync alone would be insufficient — the determineModification function (line 209) uses join(AbsoluteFilePath.of(currentDir), RelativeFilePath.of(...)) which internally resolves paths. The cache keys need to be consistent with how lookups work. Removing path.resolve from one place but not the other could cause cache misses.
Since this is Linux-only code and the path.resolve is a no-op on absolute paths, I'll leave this as-is to avoid introducing an inconsistency.
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
| // - src/core/ (existence cache only) — already processed, no writes | ||
| // - src/*.ts (shallow) — public exports written by generatePublicExports | ||
| const srcDir = persistedTypescriptProject.getSrcDirectory(); | ||
| await fixImportsForCoreFiles([path.join(srcDir, "core")], [srcDir]); |
There was a problem hiding this comment.
Core files are being redundantly processed again on disk even though they were already processed in-memory during persist(). The comment on line 344 states "src/core/ (existence cache only) — already processed, no writes", but fixImportsForCoreFiles() implementation adds files from the first parameter (dirs) to both the existence cache AND the files-to-process set (lines 300-302 in fixImportsForEsm.ts). This causes unnecessary disk I/O reading and processing core files that were already fixed in-memory.
// Only pass core dir to populate existence cache for resolving imports,
// not as a directory to process. Need to modify fixImportsForCoreFiles signature
// to separate existence-cache-only dirs from dirs-to-process.Spotted by Graphite
Is this helpful? React 👍 or 👎 to let us know.
There was a problem hiding this comment.
This is intentional — the core files are re-processed on disk but result in no-ops (their imports were already fixed in-memory). The reason src/core/ is passed as a recursive dir rather than as an existence-cache-only dir is that fixImportsForCoreFiles needs the core file paths in the existence cache so that shallow src/*.ts files (like exports.ts) can resolve imports like from "./core" → from "./core/index.js".
Separating existence-cache-only dirs from process dirs would add API complexity for ~3.5ms of no-op processing. The comment has been updated in the latest commit to accurately describe this: "src/core/ (recursive) — already processed in-memory, included for existence cache; re-processed but results in no-ops".
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…move dead code - Delete dead getCoreUtilityFilePaths() method (55 lines, never called) - Extract collectCoreUtilityFiles() shared helper used by both copyCoreUtilitiesToVolume and copyCoreUtilities (~90% overlap removed) - Merge updateImportPathsInMemory/updateImportPaths into single replaceImportPaths method - Change volume from public to protected, pass via callback arg - Fix misleading 'existence cache only' comment in postProcess - Remove redundant Set copy in fixImportsInVolume - Add version comment to memfs dirent type assertion - Add versions.yml entry for v3.59.5 Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…oreUtilities - Combine fixEsmImports + beforeFixImports into single esmImportHook param - Replace Promise.all with for...of for sync Volume writes - Remove unused pathToSrc from copyCoreUtilities and all callers - Remove dead fixImportsForEsm from production exports - Document UTF-8 assumption in collectCoreUtilityFiles Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
- Add generatePublicExportsToVolume() to PublicExportsManager for in-memory public export generation (scans Volume for exports.ts files, generates parent re-export files, updates src/index.ts) - Add writeTemplateFilesToVolume() to process .template.ts files in-memory with Eta before fixImportsInVolume runs (fixes getFetchFn/getResponseBody import resolution) - Add generatePublicExportsToVolume() to SdkGenerator as public method - Update esmImportHook to: copyCoreUtilities → writeTemplates → generatePublicExports - Remove fixImportsForCoreFiles from postProcess (no longer needed) - Remove fixImportsForCoreFiles from barrel export - Add unlinkSync to MemfsVolume interface - Expand MemfsVolume with readFileSync, readdirSync, existsSync for Volume reads - Update versions.yml entry Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…lume and writeTemplateFilesToVolume Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Swimburger
left a comment
There was a problem hiding this comment.
is there more sync stuff we can do async?
| /** | ||
| * Minimal interface for an in-memory filesystem volume. | ||
| * Structurally compatible with memfs Volume so consumers don't need a | ||
| * direct dependency on the memfs package. |
There was a problem hiding this comment.
why don't we just add memfs package as a dependency?
There was a problem hiding this comment.
Fixed in 2fef785 — memfs is already a direct dependency ("memfs": "catalog:" in package.json), so I replaced the structural MemfsVolume interface with a direct import type { Volume } from "memfs/lib/volume" in the internal files (CoreUtilitiesManager, PublicExportsManager, writeTemplateFiles, fixImportsForEsm).
For the SDK packages that don't have memfs as a direct dependency, I re-export Volume as MemfsVolume from @fern-typescript/commons so they can continue importing MemfsVolume from commons without needing to add memfs to their own package.json.
|
|
||
| // Process in parallel batches for I/O throughput | ||
| // Process files in batches to balance parallelism with I/O backpressure. | ||
| // Unbounded Promise.all on 10K+ files causes slower throughput due to |
There was a problem hiding this comment.
Do we still need batches when we're modifying in memory?
There was a problem hiding this comment.
Fixed in 2fef785 — removed the BATCH_SIZE = 100 batching logic from fixImportsForEsm. This is now the legacy disk-based path (only used when useLegacyExports=true). The primary path uses fixImportsInVolume which processes everything in-memory before writing to disk, so no batching is needed there either.
The legacy path now uses a simple unbounded Promise.all for simplicity.
| } | ||
|
|
||
| // Process in parallel batches for I/O throughput | ||
| // Process files in batches to balance parallelism with I/O backpressure. |
There was a problem hiding this comment.
Does it still make sense to batch now that we do it all in memory, and then write to disk?
There was a problem hiding this comment.
The sync code in the Volume-based functions (copyCoreUtilitiesToVolume, writeTemplateFilesToVolume, generatePublicExportsToVolume, fixImportsInVolume) is intentional — they use sync memfs APIs (mkdirSync, writeFileSync, readFileSync, etc.) because the memfs Volume is an in-memory filesystem with no actual I/O. Making these async would add overhead (promise creation, microtask scheduling) without any benefit since there's nothing to yield to the event loop for.
The disk-based functions (copyCoreUtilities, collectCoreUtilityFiles, fixImportsForEsm, fixImportsForCoreFiles) are already async since they do real filesystem I/O.
…rt, remove batching from legacy path Address 3 PR review comments: 1. Replace structural MemfsVolume interface with direct Volume import from memfs (memfs is already a dependency). Re-export as MemfsVolume from commons for SDK packages that don't have memfs as a direct dependency. 2. Remove BATCH_SIZE=100 batching from fixImportsForEsm (legacy disk path) — the primary path is now fully in-memory via fixImportsInVolume. 3. Sync code in Volume operations is intentional (in-memory = no real I/O, sync is faster and simpler). No code changes needed. Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…cExportsManager The MemfsVolume structural interface previously typed readdirSync returns explicitly. Now using Volume directly, the return type is TDataOut | Dirent which requires a cast to access .name and .isDirectory() — same pattern already used in fixImportsForEsm.ts and writeTemplateFiles.ts. Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…nd non-legacy exports - Always pass volumeHook to persist() (no more conditional undefined) - Move fixImportsInVolume into volumeHook (conditional on !useLegacyExports) - Remove disk-based copyCoreUtilities/generatePublicExports from SdkGenerator - Remove disk-based fixImportsForEsm, fixImportsForCoreFiles, writeTemplateFiles - Remove disk-based generatePublicExportsFiles from PublicExportsManager - Clean up unused imports (fs, glob, ts-morph Directory/Project, AbsoluteFilePath) - Remove 11 tests for dead disk-based functions (58 tests remain, all passing) Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Strips leading/trailing slashes to match SdkGenerator.getRelativePackagePath() normalization, preventing silent import-fix skip for paths like '/my-package/'. Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Description
Refs FER-8691
Consolidates all SDK file generation and ESM import fixup into a single in-memory pass. Source files, core utilities, public exports, and template files are all processed in the memfs Volume before a single write to disk — eliminating all post-persist disk passes (
fixImportsForEsm,fixImportsForCoreFiles,generatePublicExports).Both legacy and non-legacy export modes now use the same in-memory
volumeHookcode path. The only difference is that legacy exports skipfixImportsInVolume(since they don't need.jsextensions). All disk-based fallback functions have been removed.Benchmark (5K files, Stripe-scale)
fixImportsForEsm)Updates since last revision
Consolidated to single code path for both legacy and non-legacy exports:
volumeHookis now always passed topersist()(no more conditionalundefinedfor legacy exports)fixImportsInVolumefixImportsForEsm(full disk scan)fixImportsForCoreFiles(targeted disk scan)writeTemplateFiles(disk-based template rendering)generatePublicExportsFiles(disk-based public exports via ts-morph)collectFiles/collectFilesRecursivehelperswriteTemplateFilesfrom barrel export (index.ts)fs,fs/promises,glob,ts-morphDirectory/Project,AbsoluteFilePathwhere no longer neededChanges Made
Single in-memory code path (volumeHook)
The
volumeHookcallback onpersist()now runs for all export modes. Operations execute in this order:copyCoreUtilitiesToVolume(volume)— Copies core utility files into the VolumewriteTemplateFilesToVolume(volume, templateVariables)— Renders.template.tsfiles in-memory using EtageneratePublicExportsToVolume(volume, packagePath)— Createsexports.tsre-export chain in-memoryfixImportsInVolume(volume, packagePath)— Adds.jsextensions (only for non-legacy exports)Removed dead code
fixImportsForEsm()— full-project disk scan, replaced byfixImportsInVolumefixImportsForCoreFiles()— targeted disk scan for core utilities, no longer neededwriteTemplateFiles()— disk-based template rendering, replaced bywriteTemplateFilesToVolumegeneratePublicExportsFiles()— disk-based ts-morph exports, replaced bygeneratePublicExportsToVolumecollectFiles()/collectFilesRecursive()— disk directory traversal helpersgetCoreUtilityFilePaths()— dead forwarding method (55 lines)pathToSrcparameter fromcopyCoreUtilitiesand all callers (Express + SDK generators)String processing optimizations (Tier 2)
IMPORT_REGEXwith two capture groups)parts[]+join) instead of offset-trackingslice+concatCode quality
collectCoreUtilityFiles()shared helper — deduplicates ~90% between Volume and disk copy pathsupdateImportPathsInMemory/updateImportPaths→ singlereplaceImportPathsmethodpersist()API — singlevolumeHook?callback replaces fragile two-option objectvolumekeptprotected— passed via callback argumentvolumeHookdocumenting the 4-step execution requirementUnit tests (58 tests)
fixImportsInSource(29 tests):.jsextension,/index.jsdirectory resolution,.ts→.js, dynamic/side-effect imports, re-exports, comment stripping, cachingfixImportsInVolume(8 tests): in-memory Volume processing, nested structures, core utility files in VolumegeneratePublicExportsToVolume(8 tests): exports.ts discovery, parent file generation, index.ts re-export, no-core-dir handling, custom package path, deep nesting, deduplication guardswriteTemplateFilesToVolume(6 tests): template rendering with Eta,.template.ts→.tsconversion, template removal, multiple files, nested directories, non-template file preservationAdditional tests for deduplicateExamples (7 tests)
Updated README.md generator (if applicable) — N/A
Human Review Checklist
volumeHook: The hook runs:copyCoreUtilitiesToVolume→writeTemplateFilesToVolume→generatePublicExportsToVolume→fixImportsInVolume. Template processing MUST happen before import fixing so rendered files exist in the existence cache. Public exports MUST happen before import fixing so generated re-export files get their imports fixed. Verified by 19/19 seed tests producing byte-identical output. Inline comments document this ordering requirement.useLegacyExports=truefell back to disk-based functions. Now both modes use the samevolumeHook— legacy just skipsfixImportsInVolume. Verify that core utility copying, template rendering, and public exports generation still work correctly for legacy mode (seed tests cover this).fixImportsForEsm,fixImportsForCoreFiles,writeTemplateFiles,generatePublicExportsFiles,collectFilesare all deleted. Verify no remaining callers exist outside the removed test code.copyCoreUtilities()(disk-based) afterpersist()with novolumeHook. ThepathToSrcparameter was removed but Express never used the Volume path. Verify Express seed tests pass.MemfsVolumere-export:export type { Volume as MemfsVolume } from "memfs/lib/volume"incommons/src/index.ts— SDK packages import this alias. Verify that the realVolumetype satisfies all callsites.writeTemplateFilesToVolumecorrectness: UsesunlinkSyncto delete.template.tsoriginals after rendering. ThecollectTemplateFilesSyncfunction traverses the entire Volume recursively — should only match files containing.template.in the name. Eta rendering uses the same config as the deleted disk-basedwriteTemplateFiles.generatePublicExportsToVolumecorrectness: Scans Volume forexports.tsinsrc/core/subdirectories and generates parent re-export files. Replaces disk-basedgeneratePublicExportsFileswhich used ts-morph. Verified by seed tests.persist()API change: Changed frompersist(options?)topersist(volumeHook?). Express generator callspersist()with no args (backward compatible). SDK generator is the only caller passing the hook.collectVolumeFilesSync,collectTemplateFilesSync, andgeneratePublicExportsToVolumeall cast memfs dirent entries as{ name: string | Buffer; isDirectory(): boolean }. Depends on memfs 3.x behavior.collectCoreUtilityFiles: Old code used binary-safecp(). New code reads as UTF-8 forreplaceImportPaths. Safe for.tssource files but would corrupt binary files if any were added to core utilities. Documented with inline comment.Testing
fixImportsInSource,fixImportsInVolume,generatePublicExportsToVolume,writeTemplateFilesToVolumesimple-api19,pagination2,file-upload6) pass with zero.tsfile diffspnpm run checkpasses (4273 files, no issues)Link to Devin session: https://app.devin.ai/sessions/875263aabe0c4dcbbdacaabc7478ce1c
Requested by: @Swimburger