fix(security): refuse to follow symlinks that escape the project root in MCP source-returning reads#349
Open
casanel-net wants to merge 1 commit into
Conversation
… in MCP source-returning reads Both MCP source-returning handlers re-read each matched file via fs.readFileSync(absPath) with only a logical path-prefix check via validatePathWithinRoot. An attacker-planted in-tree symlink (e.g. evil.ts -> ../../.env) at a logically-in-tree path passes that check; the read follows the link and the project-external file's contents reach the agent. Same CWE-59 class as the colbymchenry#280 session-marker fix. - src/context/index.ts extractNodeCode — codegraph_context source emission - src/mcp/tools.ts handleExplore source loop — codegraph_explore emission The repo already had isPathWithinRootReal (src/utils.ts:132) that does the realpath comparison; this patch just calls it from both read sites and logs a debug breadcrumb when an escape is refused. Two regression tests added under a new describe block in __tests__/security.test.ts, one per handler. The sentinel lives in a function body (not a signature) so the assertion isolates the re-read leak from any index-time metadata exposure. Tests skip on platforms where the user can't create symlinks, matching the existing session-marker symlink tests.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two MCP source-returning handlers re-read each matched file via
fs.readFileSync(absPath)after only a logical path-prefix check (validatePathWithinRoot). An attacker-planted in-tree symlink at a logically-in-tree path passes that check; the read follows the link and the project-external file's contents reach the agent. This PR adds a realpath containment check at both read sites using the existingisPathWithinRootRealhelper.Same CWE-59 class as the #280 session-marker fix (different mechanic — that one uses
O_NOFOLLOWat write time; this one uses realpath comparison at read time).Threat
An attacker who can plant a file in the indexed tree (compromised PR, prompt-injected agent in the running session, hostile dependency dropping files outside
node_modules/) can drop something likeevil.ts -> ../../.env— a logically-in-tree path that resolves outside the project. The two read sites this hits:src/context/index.ts:933 extractNodeCode—codegraph_contextsource emission.validatePathWithinRootreturns the resolved logical path;fs.readFileSync(filePath)then follows the symlink.src/mcp/tools.ts:1112-1117—codegraph_exploresource-emission loop. Same shape.In both cases the file's contents end up in the tool result that goes back to the calling agent — the high-bandwidth exfiltration channel.
The existing
isPathWithinRoot/validatePathWithinRoothelpers compare logical path prefixes only.isPathWithinRootRealinsrc/utils.ts:132already does the realpath comparison — it just wasn't being called from these two read sites.Change
Two-site fix, ~10 lines of actual code:
isPathWithinRootRealinsrc/context/index.tsandsrc/mcp/tools.ts.validatePathWithinRoot+existsSyncguard at both read sites with|| !isPathWithinRootReal(filePath, projectRoot).logDebugbreadcrumb at the context site so security skips are diagnosable without leaking contents.Comment at each site explains the threat and references the #280 fix as the same CWE-59 class.
Tests
Two regression tests under a new
codegraph_context source-emission symlink escape resistancedescribe block in__tests__/security.test.ts, one per handler.Fixture shape:
The sentinel lives in a function BODY (not a
const X = '...'signature) so the negative assertion isolates the read-time leak from any index-time symbol-signature DB metadata exposure. Positive controls assert the symbol/file IS referenced in the response (otherwise the negative could pass trivially because the search missed).Tests skip on platforms where the user can't create symlinks (Windows without dev mode / admin), matching the convention used by the existing session-marker symlink tests.
Verified locally:
npx tsc --noEmitclean.Known residual risk (out of scope for this PR — flagging for separate work)
The walker fix only closes the read-time exfil at these two handlers. There are other read paths that still follow symlinks today and could expose external file content via different surfaces:
getGitVisibleFiles(git ls-files -o) feeds untracked symlinks straight to extraction parsing; symbol signatures from external content can land in the DB.indexFile(),sync()/getChangedFiles()git fast paths, and the framework-detection wrappers aroundsrc/extraction/index.ts:493-503all read viapath.join(rootDir, relPath)after only logical validation.isPathWithinRootRealreturnstrueon realpath failure (src/utils.ts:143, fail-open). The TOCTOU window betweenrealpathSyncandreadFileSyncis small but non-zero; an attacker racing the symlink target could in theory slip through.codegraph_search) until the user re-indexes.These are all separate seams worth their own PRs. I have notes on each and am happy to send follow-ups in whatever order / shape you'd prefer — wanted to keep this one focused on the highest-bandwidth path with the smallest possible patch.
Context
Found during an independent supply-chain / threat-model review before adopting CodeGraph internally. Happy to expand on any of the residual items above or put together additional repro context on request.
Thanks for the visible focus on security hardening this spring — that's what gave me confidence to dig in rather than stay away.