fix: reduce false positives surfaced by 200-package clean-corpus measurement#23
Merged
Merged
Conversation
strace right-pads child PIDs with spaces to align columns (e.g. `[pid 12]`, `[pid 1]`). strconv.ParseUint rejects leading whitespace, so every container PID parsed as 0 — silently disabling downstream PID-aware analysis. TrimSpace before parse. Surfaces during clean-corpus FP measurement: the planned V8 JIT mprotect filter (long-standing TODO in strace_parse.go) attributes events by PID, so without this fix the filter would be a no-op for every event in a real scan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…as file Two sandbox-side changes that eliminate distinct false-positive classes surfaced by clean-corpus measurement, without weakening any analyzer rule: 1. Package-manager caches pinned outside HOME via a dedicated /var/cache/kojuto tmpfs: - --tmpfs=/var/cache/kojuto:nosuid,mode=1777,size=200m - --env=NPM_CONFIG_CACHE=/var/cache/kojuto/npm - --env=PIP_CACHE_DIR=/var/cache/kojuto/pip npm's _logs and _cacache, plus pip's wheel cache, previously wrote under /home/dev/.npm and /home/dev/.cache/pip. Both are correctly flagged by the persistence backstop (any /home/ write is illegitimate). Redirecting at the sandbox layer keeps the strict detection rule intact instead of relaxing it via a path-based allowlist, which would have opened a "smuggle payload under a benign-looking cache prefix" bypass. 2. Probe install script is staged to /var/cache/kojuto/install.sh and invoked as `sh /var/cache/kojuto/install.sh` rather than `sh -c <inline script>`. The cmdline shape difference matters: the analyzer's classifyExecve treats `sh -c <content>` as a positive attack signature when isShellCmdBenign rejects the content. Kojuto's own install loop (find + while + npm run ...) cannot pass the benign check, so it produced a guaranteed FP on every npm scan. Switching to `sh <file>` lets isBenignExec recognize sh from /bin/ as benign and filter the outer probe shell entirely — without any allowlist, marker, or PID-based filtering. Attackers cannot mimic the shape because npm/yarn/pnpm always spawn lifecycle hooks as `sh -c <package script>`; the file-path form is reserved for kojuto's own launch path. InstallCommand and InstallAllCommand signatures change to take context.Context and return (cmd, error) — the new stageInstallScript helper writes the script via dockerWriteFile before strace attaches. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two analyzer changes that move execve detection from "default-bucket
HIGH severity" to "positive signatures HIGH, residual LOW", plus a
new PID-attributed filter for V8 JIT pages.
1. CategoryUnknownBinary (SeverityLow):
The default branch of classifyExecve — execve of any binary that
does not match a positive attack signature — is now recorded as
CategoryUnknownBinary instead of CategoryCodeExecution. The event
still appears in the report for forensic chain visibility, but
the verdict is decided by the syscall-level rules that observe
the binary's actual behavior (network, credential, persistence,
RWX, audit hook).
Maintaining an allowlist of "legitimate execve binaries during
install" is intractable: the legitimate set spans coreutils,
language runtimes, compiler toolchains (gcc/cc/ld/as/ar/make/
cmake/ninja/autoconf/...), VCS tools, and arbitrary preinstall
hook commands. Every new ecosystem or build tool would force a
list update. The harm-firing syscalls of every meaningful
execve-driven attack are already observed independently with
HIGH-severity categories.
What stays HIGH (positive signatures):
- execve from /tmp/, /dev/shm/, /proc/self/fd/ (fileless attack)
- interpreter with inline -c/-e flag (code injection)
- sh -c failing isShellCmdBenign (already negative-
space-filtered for env/curl, find -exec, $()/`` substitution,
cp/ln/mv to /usr/local/bin, sensitive-path args)
/tmp/ added to suspiciousExecDirs to align with the documented
behavior in README (basename spoofing detection no longer relies
on the catch-all default branch).
A detailed rationale block sits above classifyExecve documenting
the dynamic/static defense split and what coverage is explicitly
moved off the verdict path (pure-computation native payloads,
handled by sandbox containment per SECURITY.md known limitations).
2. V8 JIT mprotect/mmap filter:
A new pre-pass collectPIDComm builds PID → comm from execve
events. isV8JITPageOp returns true for simultaneous-RWX
mprotect/mmap calls whose PID resolves to a known JIT interpreter
(node, nodejs, deno, bun). Such events are skipped entirely.
Two safety properties:
- The PID must appear as the target of a prior execve in the
same scan — kojuto observed the interpreter launch directly.
An attacker cannot inject fake PIDs into the strace stream.
- Only the listed JIT interpreters get the pass. RWX from any
other binary remains HIGH memory_execution.
PIDs that did not produce an execve (PID=0 from the main strace
target, or parser misses) fall through to the existing detection,
preserving original behavior for shellcode injection.
Retires the long-standing TODO comment in strace_parse.go
documenting V8 JIT as a known FP source.
Tests:
- TestAnalyze_V8JITFilter: node PID gets pass, non-JIT PID still
fires memory_execution, PID=0 still fires (no regression on
shellcode detection)
- TestAnalyze_CurlMultiLayerDefense: documents the multi-layer
defense model (containment / harm-firing / forensic chain) for
network tools demoted from intent-based detection
- TestAnalyze_SedShellExecDefense: sed alone is LOW, sed spawning
sh -c on a sensitive path trips the existing sh -c HIGH branch
- TestAnalyze_UnknownBinaryStaysClean: default-branch demotion
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rement Adds Unreleased entries for the four fixes surfaced by scanning 100 clean PyPI + 100 clean npm packages: - New CategoryUnknownBinary (LOW) and V8 JIT mprotect/mmap filter - Probe install script staged as file (not sh -c <inline>) - Unrecognized execve demoted from CodeExecution HIGH to UnknownBinary - /tmp/ added to suspiciousExecDirs (align with README) - Package-manager caches redirected to /var/cache/kojuto tmpfs - extractPID space-padding bug (every container PID parsed as 0) The series targets only false positives — every analyzer rule that catches real attacks is preserved at its original severity. Design rationale lives inline at classifyExecve and stageInstallScript rather than in a separate ADR document. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Initial V8 JIT filter looked up jitInterpreters by the basename of
the execve binary. That allowed an attacker to plant a binary named
"node" under /install/<pkg>/bin/ and have its mprotect RWX events
suppressed as "V8 JIT".
Three concrete changes:
1. collectPIDComm now stores the full execve binary path (not just
basename) so the filter can inspect the directory at lookup time.
2. isV8JITPageOp checks both:
- basename ∈ jitInterpreters
- dirname ∈ jitInterpreterTrustedDirs (/usr/bin/, /usr/local/bin/,
/bin/)
A binary named "npm" or "node" under /install/, /tmp/, or any
attacker-controlled path is NOT filtered.
3. jitInterpreters expanded to include npm/npx/yarn/pnpm. These are
JS scripts with a `#!/usr/bin/env node` shebang; Linux's
binfmt_script transparently re-execs node, but strace only sees
the original npm/npx/yarn/pnpm execve. Treating them as
V8-equivalent is what makes the filter cover npm install flows
in practice — without this, every npm-driven scan continued to
produce mprotect RWX false positives despite the PID infrastructure
working correctly.
Discovered during clean-corpus re-measurement when hapi.json
reported `pidComm[220] = "/usr/local/bin/npm"` while npm was the
shebang launcher for node and the JIT events were attributable
to that PID.
Test additions:
- jitFromNpm: npm/usr/local/bin/ → filtered (new path coverage)
- bypassAttempt: /install/.../bin/npm, /install/.../bin/node,
/tmp/node → NOT filtered (proves path constraint blocks the
bypass)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The path-aware V8 JIT filter from the prior commit still missed
shellcode-style false positives on every real npm scan because V8
spawns JIT-compilation worker threads via clone() — these never
call execve, so the PID→comm map had no entry for them and their
mprotect RWX events leaked past the filter. memory_execution is
HIGH severity, so one leaking event was enough to flip the verdict.
Discovered via instrumented re-measurement of argon2 (which uses
node-gyp-build): the parent node process at PID 137 correctly
mapped to /usr/bin/node and was filtered, but worker threads at
PIDs 633/648/663 returned MISS in the lookup. 6 events leaked,
verdict stayed suspicious.
Three layers of plumbing close the gap:
1. types.SyscallEvent gains a ChildPID field and a new EventClone
syscall constant. Clone events carry only PID-correlation
signal (no harm); isBenign drops them so they never reach
verdict accounting even if they survive other passes.
2. container_strace.go adds clone/clone3 to the strace -e trace
list. Parser parses both forms plus vfork.
Failed clones (negative return) are rejected by the regex (it
requires `\d+`), so resource exhaustion attempts (clone returning
EAGAIN) still flow through other detection paths.
3. analyzer.collectPIDComm becomes a two-pass:
- Pass 1: PID → execve binary path (unchanged).
- Pass 2: fixed-point propagation of parent comm to clone
children that lack their own execve entry. Iterates until no
new entries; handles parent → child → grandchild chains
regardless of clone-event order.
A child that does its own execve keeps its execve attribution
— the propagation pass only writes when the child has no
existing entry. This preserves the shellcode-detection path
for fork-then-exec-payload patterns.
Tests added:
- TestParseClone: clone/clone3/vfork variants, [pid X] prefix
with column alignment spaces, failed clones rejected
- jitFromWorkerThread: V8 worker scenario (the actual FP that
motivated this work) — clone child without execve still filtered
- jitFromGrandchild: clone chain with events listed out of order
— fixed-point pass still resolves
- childExecveWins: clone-then-execve payload still HIGH — the
propagation cannot mask a child's own malicious execve
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
strace -f injects "Process N attached" INLINE into the originating clone() trace, splitting it across two lines and causing parseClone to miss every real clone event. The V8 worker-thread propagation added in 0362865 was silently producing zero matches as a result. --quiet=attach suppresses the inline notice and keeps clone() traces single-line so parseClone matches them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…in-target aliasing + sh -c demote
Three coupled changes that bring npm clean-corpus packages with
native module builds (argon2/bcrypt/sharp/...) down to clean
verdict. Validated end-to-end on argon2 — the package that exposed
each problem in turn during instrumented re-measurement.
1. strace --quiet=attach
Without it strace prints "Process N attached" inline in the
middle of the originating clone() line, splitting the trace
across two lines. parseClone needs the full call on one line to
match. Without this flag the clone propagation pass had nothing
to propagate from.
2. Main-target aliasing in collectPIDComm
strace prints the main traced process's syscalls WITHOUT
`[pid X]` (extracted as PID=0) until ambiguity forces a switch,
from which point the SAME process appears as `[pid X]` with its
real kernel PID. The two PIDs name the same process. The
streaming pass now propagates m[0] to any non-zero PID that
emits a non-clone event without ever appearing as a clone child
— that PID is the disambiguated main target. Without this,
import-phase node's worker-thread clones from the disambiguated
PID had no parent attribution and every package's mprotect RWX
events leaked past the V8 JIT filter.
3. sh -c demote to CategoryUnknownBinary (LOW)
The negative-space first-token filter in isShellCmdBenign
cannot enumerate the legitimate set of node-ecosystem build
tools (cross-env, node-gyp-build, prebuild-install,
prebuildify, ...). Every native-module package's preinstall
hook fires `sh -c "cross-env FOO=bar node-gyp-build"` and gets
flagged on cmdline content alone, despite no harm-firing
syscall ever appearing downstream.
Each previously-caught attack pattern has a dedicated
harm-firing rule:
- /tmp exec -> fileless attack (L546)
- curl/wget -> c2_communication on connect
- $() / `` -> spawned process's syscalls
- find -exec /tmp/x -> /tmp execve (L546)
- cp/mv/ln to /usr/local/bin -> binary_hijacking on openat
(parser emits openat specifically for system-binary write
targets; see strace_parse.go parseOpenat)
- cat ~/.ssh/X -> credential_access on openat
- bind/listen -> backdoor
TestAnalyze_ShellCMultiLayer documents each pairing
explicitly so a future maintainer can see which downstream
rule covers which historical sh -c case.
The argon2 smoke went 6 memory_execution events suspicious ->
clean across this and the prior commits, with no test regressions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CHANGELOG additions cover the work shipped in this branch beyond
the initial entries: clone-attributed thread comm propagation,
main-target PID aliasing, the strace --quiet=attach fix, and the
sh -c demote alongside the L570 default-branch demote.
README Detection Benchmarks updated with the measured post-fix
rates from the 100+100 clean-corpus re-measurement:
- PyPI: 84/84 in-scope clean (16 install-resolution failures
excluded as scan-infrastructure issue, not detection)
- npm: 89/89 in-scope clean; 2 documented bull/bullmq DNS-
lookup FP across the entire taxonomy (separate issue)
The pre-fix npm baseline on the same corpus was 93/93 suspicious
(100%); the same packages now register clean within the
attack-detection categories targeted by this PR.
The code_execution and memory_execution rows of the "Detected
attack categories" table updated to reflect the new behavior: the
sh -c branch records as unknown_binary LOW, and the memory_execution
rule attributes by PID to filter V8/JIT noise from trusted-path
node-ecosystem interpreters.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Re-measurement of the clean-corpus false-positive rate against 100 popular non-corporate PyPI packages + 100 popular non-corporate npm packages exposed FP sources hidden by the small (50+20) prior baseline. The npm side ran at 93/93 suspicious (100% FP) due to three independent issues compounding: V8 JIT mprotect noise, probe-scaffolding shell, and npm cache writes to
$HOME. This PR closes every in-scope FP without weakening any harm-firing detection rule and without introducing maintained allowlists.Measured impact:
bull/bullmqDNS issue, tracked separately)Changes
extractPIDstrips leading whitespace — strace right-pads[pid 12]and every container PID parsed as 0, silently disabling PID-aware analysis.strings.TrimSpacebefore parse.$HOME— new/var/cache/kojutotmpfs (200 MiB) withNPM_CONFIG_CACHEandPIP_CACHE_DIRenv. Cache writes no longer trip the persistence backstop.Sandbox.InstallCommand/InstallAllCommandwrite the script to/var/cache/kojuto/install.shand invokesh <path>instead ofsh -c <inline>. Outer probe shell is nowisBenignExec-filtered. Both methods gaincontext.Contextand return(cmd, error).CategoryUnknownBinary(LOW) —classifyExecve's default branch AND itssh -cbranch record under this category for forensic chain visibility instead of flipping the verdict. Each historicalsh -cattack pattern has a dedicated harm-firing rule downstream;TestAnalyze_ShellCMultiLayerdocuments the mapping./tmp/added tosuspiciousExecDirs— basename-spoofing detection (/tmp/python3) now relies on the positivecode_executionHIGH rule, not the demoted default branch.node/nodejs/deno/bun/npm/npx/yarn/pnpmlaunched from/usr/bin/,/usr/local/bin/, or/bin/. Path constraint blocks bypass via/install/<pkg>/bin/node. Retires the long-standing TODO instrace_parse.go.EventClone+ clone propagation —clone/clone3/vforkadded to the strace trace list. NewChildPIDfield onSyscallEvent. Streaming pre-pass copies the parent's execve comm to children that never execve (V8 worker threads,posix_spawnhelpers), preserving execve attribution if the child later runs its own binary.[pid X](extracts asPID=0) until ambiguity forces a switch, after which the same process appears as[pid X]. The pre-pass propagatesm[0]to any non-zero PID that emits a non-clone event without being a clone child. Without this, import-phase node had two PIDs in our stream and every worker thread cloned from the disambiguated PID leaked past the JIT filter.--quiet=attach— strace otherwise inserts the "Process N attached" notice INSIDE the originating clone() trace, splitting it across two lines and breaking single-line regex parsers.classifyExecvecovers the dynamic/static defense split, why a positive execve allowlist is intractable, the mapping from each historical sh -c case to its downstream rule, and what coverage is intentionally moved off the verdict path (pure-computation native payloads, addressed by sandbox containment per SECURITY.md Known Limitations).Test Plan
make test)go vet ./...cleangolangci-lint run ./...— 0 new issues (4 pre-existing staticcheck SA5011 in test files unchanged)package-audit/clean-pypi.txt): 84/84 in-scope clean, identical to pre-fix baselinepackage-audit/clean-npm.txt): 89/89 in-scope clean, down from 93/93 suspicious; remaining 2 are the documentedbull/bullmqinstall-time DNS lookup FP (separate issue)cross-env/node-gyp-buildpreinstall): suspicious → cleanTestAnalyze_V8JITFiltercovers node from trusted dir, npm shebang launcher, worker-thread propagation, disambiguated main-target, clone chain, and the/install/<pkg>/bin/nodebypass-attempt path constraintTestAnalyze_ShellCMultiLayercovers/tmpexec, curl/wget connect, nc bind, ssh-key openat, system-binary openat, system-binary rename, and.bashrcwrite — each via the dedicated downstream harm ruleTestAnalyze_ShellCStandaloneIsLowregression-guards the demotion:sh -c cross-env .../sh -c node-gyp-build/ standalone substitutions stay clean without a follow-up harm eventTestParseClonecoversclone/clone3/vfork, column-aligned PID prefixes, negative-return rejectionRelated Issues
Follow-up work surfaced by this measurement (not in scope here):
bull/bullmqinstall-time DNS lookup registers asc2_communication— needs DNS-payload parsing onsendto:53to distinguish legitimate hostname lookups from C2 preppip install --no-index --find-linksbacktracks indefinitely on some transitive-dep graphs (16/100 errors in measurement) — scan-infrastructure issue, unrelated to detection logic/install/node_modules/<other_pkg>/";scanPkgsis already plumbed into the sandbox and could drive alibrary_hijackingcategorychmodandsymlinkatnot yet traced — covered by static analyzers today, but a dynamic rule would close the obfuscated-runtime case