feat: library_hijacking detection for npm sibling-package writes by RalianENG · Pull Request #24 · RalianENG/kojuto

RalianENG · 2026-05-13T15:53:35Z

Summary

Adds a placement-detection rule for an attack class kojuto previously missed: scanned package A writes a backdoor into the source tree of another, already-installed package B (e.g. argon2's postinstall overwrites node_modules/lodash/index.js). The payload runs only when a later workflow imports B, which is outside kojuto's scan window, so the write itself (placement) is the only opportunity to detect it.

The rule complements static analyzers (which catch obvious AST patterns) by also catching runtime-decoded target paths that source-level inspection cannot resolve.

Changes

New CategoryLibraryHijack (HIGH) — fires when a write targets /install/node_modules/<other_pkg>/... where <other_pkg> is not in the scan target set.
analyzer.SetScanPkgs(pkgs) mirrors the existing sandbox.SetScanPkgs; cmd/root.go calls both at the same site. When unset, the rule is inert (older code/tests unaffected).
extractNpmInstalledPkg parses the target package identifier, handling scoped (@scope/name) and non-scoped names and rejecting .-prefixed npm bookkeeping (.package-lock.json, .bin/, .cache/).
isBenignInstalledPackageWrite filters at the isBenign layer — self-pkg writes (legitimate build output like argon2/build/Release/argon2.node) and npm bookkeeping never reach the classifier. Only cross-package writes do.
parseOpenat now emits openat events for writes into /install/node_modules/<pkg>/... (new isInstalledPackageWrite parser helper).
PyPI (/usr/local/lib/python*/site-packages/) is intentionally out of scope — pip's wheel extraction writes many files across many dirs and needs a different discriminator (probably PID lineage to the offending lifecycle hook). Tracked as follow-up.
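
The path parsing and cross-package check described above can be sketched roughly as follows. This is an illustrative re-implementation, not the code merged in this PR: the function bodies, the `npmRoot` constant, the `isCrossPackageWrite` helper and the `map[string]bool` scan set are all assumptions made for the example. Only the name extractNpmInstalledPkg and the described behavior (scoped and non-scoped names, `.`-prefixed bookkeeping rejected, inert when no scan set is configured) come from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// npmRoot is the install prefix named in the PR description.
const npmRoot = "/install/node_modules/"

// extractNpmInstalledPkg returns the package that owns path. ok is false when
// path is outside npmRoot, is npm bookkeeping, or is malformed.
// Hypothetical sketch; the real function may differ in signature and edge cases.
func extractNpmInstalledPkg(path string) (pkg string, ok bool) {
	if !strings.HasPrefix(path, npmRoot) {
		return "", false
	}
	parts := strings.Split(strings.TrimPrefix(path, npmRoot), "/")
	first := parts[0]
	// Dot-prefixed entries (.package-lock.json, .bin/, .cache/) are npm bookkeeping.
	if first == "" || strings.HasPrefix(first, ".") {
		return "", false
	}
	if strings.HasPrefix(first, "@") {
		// Scoped names need both segments: @scope/name.
		if len(first) == 1 || len(parts) < 2 || parts[1] == "" || strings.HasPrefix(parts[1], ".") {
			return "", false
		}
		return first + "/" + parts[1], true
	}
	return first, true
}

// isCrossPackageWrite reports whether a write to path lands in a package that
// is not in the scan target set. An empty set keeps the rule inert.
// Hypothetical helper; it folds the benign filter and the classifier into one step.
func isCrossPackageWrite(path string, scanPkgs map[string]bool) bool {
	if len(scanPkgs) == 0 {
		return false
	}
	pkg, ok := extractNpmInstalledPkg(path)
	if !ok {
		return false
	}
	return !scanPkgs[pkg]
}

func main() {
	scanPkgs := map[string]bool{"argon2": true}
	paths := []string{
		"/install/node_modules/argon2/build/Release/argon2.node", // self-pkg write
		"/install/node_modules/lodash/index.js",                  // sibling package
		"/install/node_modules/@babel/core/lib/index.js",         // scoped sibling
		"/install/node_modules/.package-lock.json",               // npm bookkeeping
	}
	for _, p := range paths {
		fmt.Printf("%-56s cross-package=%v\n", p, isCrossPackageWrite(p, scanPkgs))
	}
}
```

The sketch treats every package outside the scan set as a hijack target. Any additional allowances the real isBenignInstalledPackageWrite makes are not modeled here.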

Test Plan

Unit tests pass (make test)
TestExtractNpmInstalledPkg covers scoped, non-scoped, npm bookkeeping, malformed/edge inputs
TestAnalyze_LibraryHijack{CrossWrite,SelfWrite,Scoped,NpmBookkeepingIgnored,DisabledWithoutSetScanPkgs} exercises each classification path
TestIsInstalledPackageWrite + TestParseOpenat_InstalledPackageWrite cover the parser's new emit condition
argon2 smoke (native module, baseline) — still CLEAN after rule addition
Light-batch clean-corpus regression (10 packages, native + pure JS mix: argon2, bcrypt, lodash, express, chalk, axios, helmet, handlebars, jsonwebtoken, commander): CLEAN in 2:05
go vet, golangci-lint run — 0 new issues (4 pre-existing staticcheck SA5011 in test files unchanged)
Manual testing with a synthetic hijack package: deferred (requires crafting a testdata package)

Related Issues

npm lifecycle script parallelization (npmLifecycleScript) — orthogonal optimization, tracked separately
PyPI library hijack detection — needs PID-lineage discriminator for pip wheel extraction; deferred

…kage writes

Adds a placement-detection rule for an attack class that kojuto previously missed: scanned package A writes a backdoor into the source tree of another already-installed package B (e.g. argon2's postinstall overwrites node_modules/lodash/index.js). The harm fires when a later workflow imports B, outside kojuto's scan window. Placement is the only opportunity to detect it.

Three pieces:

1. types.CategoryLibraryHijack (HIGH severity). Documented as the placement-only attack class it targets.
2. strace_parse.go parseOpenat now emits events for writes into /install/node_modules/<pkg>/... in addition to the existing sensitive-path / home / system-binary cases. npm's own bookkeeping entries (.package-lock.json, .bin/, .cache/) are filtered at parse time; they are never attacker-installed packages.
3. analyzer.go:
   - SetScanPkgs(pkgs) records the package set (mirror of the existing sandbox.SetScanPkgs). cmd/root.go calls both at the same site.
   - extractNpmInstalledPkg parses the target package identifier from /install/node_modules/<pkg>/..., handling scoped (@scope/name) and non-scoped names and rejecting `.`-prefixed bookkeeping.
   - isBenignInstalledPackageWrite filters at the isBenign layer: self-pkg writes (legitimate build output like argon2/build/Release/argon2.node) and npm bookkeeping never reach the classifier. Only cross-package writes do.
   - classifyOpenat's new branch (placed before the binary-hijack check) fires CategoryLibraryHijack for the surviving cross-package writes.
   - The rule is inert when scannedPkgs is empty so older call sites and tests are unaffected until they opt in.

PyPI (/usr/local/lib/python*/site-packages/) is intentionally out of scope here. pip's wheel extraction during install legitimately writes many files across many package directories; distinguishing hijack writes from pip's own work needs a different discriminator (probably PID lineage to the offending lifecycle hook). Tracked as follow-up.

Tests added:
- extractNpmInstalledPkg covers scoped, non-scoped, bookkeeping, malformed and edge-case inputs.
- TestAnalyze_LibraryHijack{CrossWrite,SelfWrite,Scoped,NpmBookkeepingIgnored,DisabledWithoutSetScanPkgs} exercises each classification path.
- TestIsInstalledPackageWrite + TestParseOpenat_InstalledPackageWrite exercise the parser's new emit condition (write emitted, read not emitted, bookkeeping not emitted).

Smoke: ./kojuto scan argon2 -e npm stays CLEAN after the rule addition (its self-writes to argon2/build/Release/argon2.node and peer dep dirs flow through isBenignInstalledPackageWrite without firing the new category).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
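
The parser-side emit condition (write emitted, read not emitted, bookkeeping not emitted) might look something like the sketch below. It is a guess at the shape of isInstalledPackageWrite, not the merged code: the two-argument signature, the `hasWriteIntent` helper, and the choice of O_WRONLY / O_RDWR / O_CREAT as "write" flags are assumptions for illustration. The flag string is taken to be in the `A|B|C` form that strace prints for openat.

```go
package main

import (
	"fmt"
	"strings"
)

// npmRoot is the install prefix named in the commit message.
const npmRoot = "/install/node_modules/"

// hasWriteIntent reports whether a strace-style flag string such as
// "O_WRONLY|O_CREAT|O_TRUNC" opens the file for writing or creation.
// Assumed definition of "write"; kojuto's actual rule may differ.
func hasWriteIntent(flags string) bool {
	for _, f := range strings.Split(flags, "|") {
		switch strings.TrimSpace(f) {
		case "O_WRONLY", "O_RDWR", "O_CREAT":
			return true
		}
	}
	return false
}

// isInstalledPackageWrite reports whether an openat call should be emitted as
// an installed-package write: it must have write intent, target a path under
// npmRoot, and not be a dot-prefixed npm bookkeeping entry.
// Hypothetical signature for this sketch.
func isInstalledPackageWrite(path, flags string) bool {
	if !hasWriteIntent(flags) || !strings.HasPrefix(path, npmRoot) {
		return false
	}
	first := strings.SplitN(strings.TrimPrefix(path, npmRoot), "/", 2)[0]
	return first != "" && !strings.HasPrefix(first, ".")
}

func main() {
	cases := []struct{ path, flags string }{
		{"/install/node_modules/lodash/index.js", "O_WRONLY|O_CREAT|O_TRUNC"}, // write
		{"/install/node_modules/lodash/index.js", "O_RDONLY|O_CLOEXEC"},       // read
		{"/install/node_modules/.package-lock.json", "O_WRONLY|O_CREAT"},      // bookkeeping
	}
	for _, c := range cases {
		fmt.Printf("%s [%s] emit=%v\n", c.path, c.flags, isInstalledPackageWrite(c.path, c.flags))
	}
}
```

Filtering bookkeeping at parse time, as the commit message describes, keeps those events out of the analyzer entirely.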

codecov · 2026-05-13T15:57:23Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

RalianENG merged commit 5ec5660 into main May 13, 2026
12 checks passed

RalianENG deleted the feat/library-hijacking-detection branch May 13, 2026 16:06

RalianENG merged 1 commit into main from feat/library-hijacking-detection
