feat: library_hijacking detection for npm sibling-package writes #24

Merged
RalianENG merged 1 commit into main from feat/library-hijacking-detection on May 13, 2026
@RalianENG
Owner

Summary

Adds a placement-detection rule for an attack class kojuto previously missed: scanned package A writes a backdoor into the source tree of another already-installed package B (e.g. argon2's postinstall overwrites node_modules/lodash/index.js). The harm fires when a later workflow imports B — outside kojuto's scan window. Placement is the only opportunity to detect it.

The rule complements static analyzers (which catch obvious AST patterns) by also catching runtime-decoded target paths that source-level inspection cannot resolve.

Changes

  • New CategoryLibraryHijack (HIGH) — fires when a write targets /install/node_modules/<other_pkg>/... where <other_pkg> is not in the scan target set.
  • analyzer.SetScanPkgs(pkgs) mirrors the existing sandbox.SetScanPkgs; cmd/root.go calls both at the same site. When unset, the rule is inert (older code/tests unaffected).
  • extractNpmInstalledPkg parses the target package identifier, handling scoped (@scope/name) and non-scoped names and rejecting .-prefixed npm bookkeeping (.package-lock.json, .bin/, .cache/).
  • isBenignInstalledPackageWrite filters at the isBenign layer — self-pkg writes (legitimate build output like argon2/build/Release/argon2.node) and npm bookkeeping never reach the classifier. Only cross-package writes do.
  • parseOpenat now emits openat events for writes into /install/node_modules/<pkg>/... (new isInstalledPackageWrite parser helper).
  • PyPI (/usr/local/lib/python*/site-packages/) is intentionally out of scope — pip's wheel extraction writes many files across many dirs and needs a different discriminator (probably PID lineage to the offending lifecycle hook). Tracked as follow-up.
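
As an illustration of the path parsing described for extractNpmInstalledPkg, here is a minimal sketch. It is not the code from this PR: the function name `extractPkg`, its `(string, bool)` return shape, and the decision to ignore nested `node_modules` directories are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// nodeModulesPrefix is the install root named in the PR description.
const nodeModulesPrefix = "/install/node_modules/"

// extractPkg is a hypothetical stand-in for extractNpmInstalledPkg.
// It returns the package that owns path, or ok=false when the path is
// outside node_modules, is npm bookkeeping, or is malformed.
func extractPkg(path string) (pkg string, ok bool) {
	if !strings.HasPrefix(path, nodeModulesPrefix) {
		return "", false
	}
	parts := strings.Split(strings.TrimPrefix(path, nodeModulesPrefix), "/")
	first := parts[0]
	// Dot-prefixed entries (.package-lock.json, .bin/, .cache/) are npm bookkeeping.
	if first == "" || strings.HasPrefix(first, ".") {
		return "", false
	}
	if !strings.HasPrefix(first, "@") {
		return first, true
	}
	// Scoped package: the identifier spans two segments, @scope/name.
	if len(first) == 1 || len(parts) < 2 || parts[1] == "" || strings.HasPrefix(parts[1], ".") {
		return "", false
	}
	return first + "/" + parts[1], true
}

func main() {
	for _, p := range []string{
		"/install/node_modules/lodash/index.js",
		"/install/node_modules/@babel/core/lib/index.js",
		"/install/node_modules/.package-lock.json",
		"/install/node_modules/@scope",
	} {
		pkg, ok := extractPkg(p)
		fmt.Printf("%s -> pkg=%q ok=%v\n", p, pkg, ok)
	}
}
```

Returning an explicit `ok` flag rather than an empty string keeps "not a package path" distinct from any valid name, which is convenient for the table-driven tests listed in the test plan.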

Test Plan

  • Unit tests pass (make test)
  • TestExtractNpmInstalledPkg covers scoped, non-scoped, npm bookkeeping, malformed/edge inputs
  • TestAnalyze_LibraryHijack{CrossWrite,SelfWrite,Scoped,NpmBookkeepingIgnored,DisabledWithoutSetScanPkgs} exercises each classification path
  • TestIsInstalledPackageWrite + TestParseOpenat_InstalledPackageWrite cover the parser's new emit condition
  • argon2 smoke (native module, baseline) — still CLEAN after rule addition
  • Light-batch clean-corpus regression (10 packages, native + pure JS mix: argon2, bcrypt, lodash, express, chalk, axios, helmet, handlebars, jsonwebtoken, commander): CLEAN in 2:05
  • go vet, golangci-lint run — 0 new issues (4 pre-existing staticcheck SA5011 in test files unchanged)
  • Manual testing with a synthetic hijack package (would require crafting a testdata package; deferred)
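
The emit condition exercised by TestIsInstalledPackageWrite and TestParseOpenat_InstalledPackageWrite (write emitted, read not emitted, bookkeeping not emitted) might be sketched as below. This is an assumption-laden illustration, not the parser from strace_parse.go: `shouldEmit`, `hasWriteFlag`, and the choice to treat only `O_WRONLY` and `O_RDWR` as write intent are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

const nodeModulesPrefix = "/install/node_modules/"

// hasWriteFlag reports whether an strace-style flag string such as
// "O_WRONLY|O_CREAT|O_TRUNC" opens the file for writing.
// Assumption: only O_WRONLY and O_RDWR count as write intent.
func hasWriteFlag(flags string) bool {
	for _, f := range strings.Split(flags, "|") {
		switch strings.TrimSpace(f) {
		case "O_WRONLY", "O_RDWR":
			return true
		}
	}
	return false
}

// shouldEmit is a hypothetical stand-in for the parser helper: it accepts
// writes under /install/node_modules/<pkg>/... and drops reads and
// dot-prefixed npm bookkeeping entries.
func shouldEmit(path, flags string) bool {
	if !hasWriteFlag(flags) || !strings.HasPrefix(path, nodeModulesPrefix) {
		return false
	}
	rest := strings.TrimPrefix(path, nodeModulesPrefix)
	return rest != "" && !strings.HasPrefix(rest, ".")
}

func main() {
	fmt.Println(shouldEmit("/install/node_modules/lodash/index.js", "O_WRONLY|O_CREAT|O_TRUNC"))
	fmt.Println(shouldEmit("/install/node_modules/lodash/index.js", "O_RDONLY|O_CLOEXEC"))
	fmt.Println(shouldEmit("/install/node_modules/.package-lock.json", "O_WRONLY|O_CREAT"))
}
```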

Related Issues

  • npm lifecycle script parallelization (npmLifecycleScript) — orthogonal optimization, tracked separately
  • PyPI library hijack detection — needs PID-lineage discriminator for pip wheel extraction; deferred

…kage writes

Adds a placement-detection rule for an attack class that kojuto
previously missed: scanned package A writes a backdoor into the
source tree of another already-installed package B (e.g. argon2's
postinstall overwrites node_modules/lodash/index.js). The harm
fires when a later workflow imports B — outside kojuto's scan
window. Placement is the only opportunity to detect it.

Three pieces:

1. types.CategoryLibraryHijack (HIGH severity). Documented as the
   placement-only attack class it targets.

2. strace_parse.go parseOpenat now emits events for writes into
   /install/node_modules/<pkg>/... in addition to the existing
   sensitive-path / home / system-binary cases. npm's own
   bookkeeping entries (.package-lock.json, .bin/, .cache/) are
   filtered at parse time; they are never attacker-installed
   packages.

3. analyzer.go:
   - SetScanPkgs(pkgs) records the package set (mirror of the
     existing sandbox.SetScanPkgs). cmd/root.go calls both at the
     same site.
   - extractNpmInstalledPkg parses the target package identifier
     from /install/node_modules/<pkg>/..., handling scoped
     (@scope/name) and non-scoped names and rejecting `.`-prefixed
     bookkeeping.
   - isBenignInstalledPackageWrite filters at the isBenign layer:
     self-pkg writes (legitimate build output like
     argon2/build/Release/argon2.node) and npm bookkeeping never
     reach the classifier. Only cross-package writes do.
   - classifyOpenat's new branch (placed before the binary-hijack
     check) fires CategoryLibraryHijack for the surviving cross-
     package writes.
   - The rule is inert when scannedPkgs is empty so older call
     sites and tests are unaffected until they opt in.
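
The three outcomes described above (inert without a scan set, benign self-write, cross-package hijack) can be sketched as follows. The `analyzer` type, the `verdict` values, and the `classifyWrite` method are hypothetical names for illustration; the real SetScanPkgs, isBenignInstalledPackageWrite and classifyOpenat in analyzer.go may be structured differently.

```go
package main

import "fmt"

// verdict is an illustrative result type, not a type from kojuto.
type verdict string

const (
	verdictInert         verdict = "inert"          // no scan set configured; rule disabled
	verdictBenign        verdict = "benign"         // self-package write, e.g. native build output
	verdictLibraryHijack verdict = "library_hijack" // write into another installed package
)

// analyzer holds the set of packages under scan (hypothetical type).
type analyzer struct {
	scanned map[string]bool
}

// SetScanPkgs records the scan target set; until it is called the rule stays inert.
func (a *analyzer) SetScanPkgs(pkgs []string) {
	a.scanned = make(map[string]bool, len(pkgs))
	for _, p := range pkgs {
		a.scanned[p] = true
	}
}

// classifyWrite takes the package name already extracted from the write path.
func (a *analyzer) classifyWrite(targetPkg string) verdict {
	if len(a.scanned) == 0 {
		return verdictInert
	}
	if a.scanned[targetPkg] {
		return verdictBenign
	}
	return verdictLibraryHijack
}

func main() {
	a := &analyzer{}
	fmt.Println(a.classifyWrite("lodash")) // rule not configured yet

	a.SetScanPkgs([]string{"argon2"})
	fmt.Println(a.classifyWrite("argon2")) // self-write
	fmt.Println(a.classifyWrite("lodash")) // cross-package write
}
```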

PyPI (/usr/local/lib/python*/site-packages/) is intentionally out
of scope here. pip's wheel extraction during install legitimately
writes many files across many package directories; distinguishing
hijack writes from pip's own work needs a different discriminator
(probably PID lineage to the offending lifecycle hook). Tracked as
follow-up.

Tests added:
- TestExtractNpmInstalledPkg covers scoped, non-scoped, bookkeeping,
  malformed and edge-case inputs.
- TestAnalyze_LibraryHijack{CrossWrite,SelfWrite,Scoped,
  NpmBookkeepingIgnored,DisabledWithoutSetScanPkgs} exercises each
  classification path.
- TestIsInstalledPackageWrite + TestParseOpenat_InstalledPackageWrite
  exercise the parser's new emit condition (write emitted, read not
  emitted, bookkeeping not emitted).

Smoke: ./kojuto scan argon2 -e npm stays CLEAN after the rule
addition (its self-writes to argon2/build/Release/argon2.node and
peer dep dirs flow through isBenignInstalledPackageWrite without
firing the new category).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented May 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@RalianENG RalianENG merged commit 5ec5660 into main May 13, 2026
12 checks passed
@RalianENG RalianENG deleted the feat/library-hijacking-detection branch May 13, 2026 16:06