Skip to content

Enhance/package metadata hardening#21

Merged
RalianENG merged 11 commits into
mainfrom
enhance/package-metadata-hardening
May 8, 2026
Merged

Enhance/package metadata hardening#21
RalianENG merged 11 commits into
mainfrom
enhance/package-metadata-hardening

Conversation

@RalianENG
Copy link
Copy Markdown
Owner

Summary

Security hardening sweep against attacker-controlled inputs that flow through the sandbox lifecycle. Two-phase audit: (1) defense-in-depth around package metadata identified by an initial review of the merged PR #18 surface, then (2) a broader re-audit across the full codebase that turned up three concrete detection bypasses missed by the narrower scope. All findings closed with regression tests pinning each invariant; ten commits, no breaking changes for legitimate use.

Changes

Injection / RCE paths closed

  • Heredoc → stdin pipe. cat << 'KOJUTO_EOF' … KOJUTO_EOF could be terminated early by a \nKOJUTO_EOF\n<cmd> payload smuggled through a malicious --local filename, executing <cmd> as root inside the sandbox and letting the package neutralise sitecustomize.py before its own code ran. Replaced all 9 heredoc sites with a new dockerWriteFile helper that delivers content via docker exec -i … cat > <path> stdin. (131bb3f)
  • --local package-name validation. runLocalScan now passes the derived name through downloaderValidate, blocking shell metacharacters and newlines at the boundary instead of relying on downstream sanitisation. (131bb3f)
  • Shell single-quoting in npmLifecycleScript. New shQuote helper wraps the cd target so a package.json key like foo$(id) cannot break out. (71d6d0a)
  • Depfile name validation. parseRequirementsTxt and parsePackageJSON reject names outside the npm/PyPI registry-allowed character sets (npmNameRe, pypiNameRe) at parse time. (71d6d0a)

Detection-bypass paths closed

  • Strace stderr scanner overflow. bufio.NewScanner(stderr) used the default 64 KiB cap and never checked scanner.Err(). A package writing >64 KiB to the shared docker-exec stderr without a newline tripped ErrTooLong, exited the parse loop, and silently dropped every subsequent strace event — yielding a CLEAN verdict. Now buffers at 16 MiB and converts any scanner.Err() to dropped++, flipping the verdict to inconclusive. Both parseStraceOutput and StraceFallback. (64788b2)
  • Rename regex \" escape. "([^"]+)" failed on filenames containing literal " (escaped by strace as \"), suppressing the entire rename event. A package could rename /tmp/x"y over /usr/local/bin/python3 invisibly. Capture is now ((?:\\.|[^"\\])+). (64788b2)
  • eBPF basename-only rename → benign. vfs_rename's d_name gives basename only, so eBPF mode delivered DstPath = "python3" and path.Dir("python3")+"/" = "./" never matched any allowed dir — every rename in eBPF mode was treated as benign. isBenignRename now fail-safes when DstPath lacks a leading /. (64788b2)
  • Wire-protocol newline injection. sitecustomize.py's _w() strips \n/\r/\0 so an attacker-controlled co_filename cannot smuggle a fake follow-up KOJUTO: line onto the wire. (71d6d0a)
  • Frame-walk depth cap removed. _FRAMES = 8 let a malicious package bury its own frame under stdlib wrappers and escape the + user-origin marker. The walk is now unbounded; the common-case fast path (early return on first user frame) is unchanged. (71d6d0a)

Defense-in-depth / visibility

  • Lenient-mode kojuto.yml warning. --strict defaults to false on the CLI; an attacker-planted kojuto.yml in cwd could silently shrink the detection surface. preRunLoadConfig now emits a stderr warning whenever exclusions are honored, listing the count and paths and pointing at --strict. (9fc198f)
  • Control-byte escape on report fields. code_snippet, comm, and cmdline could carry ANSI escape sequences from a malicious package; jq -r .events[].code_snippet would feed real ESC bytes to a terminal. Now sanitised in two places: sitecustomize.py:_t() escapes 0x00-0x1f / 0x7f at source; report.WriteJSON sanitises every string field of every event via reflection at output. (f623391)
  • getHostUsername sanitisation. Mirrors the existing getHostHostname → sanitizeDockerArg pattern. sanitizeDockerArg now takes an explicit fallback string so each caller picks its own. (ba9449b)
  • Owner-only mode for output files. Scan reports (-o) and --pin manifests now write at 0o600 instead of 0o644/0o666. (09f223f)

Action / docs / tooling

  • GitHub Action inputs. config, quiet, no-color exposed to match CLI flags. runtime description tightened to drop the now-obsolete empty-string fallback note. (7269ecf)
  • Changelog backfill. [Unreleased] / Added gains the severity-tiered verdict and caller-aware audit hook entries; README sensitive-paths count corrected (50 → 60). (0264378)
  • Lint cleanup. Cleared all 9 golangci-lint v2.1.6 findings the hardening series introduced (gofmt, govet shadow, intrange, gocritic, misspell). (e71d423)

Test Plan

  • Unit tests pass (make test)
  • Race-detection tests pass (go test -race -coverprofile coverage.txt -covermode atomic ./...)
  • golangci-lint v2.1.6 reports 0 issues.
  • Cross-compile clean for linux/amd64, linux/arm64, darwin/amd64, darwin/arm64, windows/amd64
  • go mod verify and govulncheck ./... clean (0 vulnerabilities reachable from kojuto code)
  • Each finding has a regression test pinning the security invariant
    • TestParse_RejectsInvalidNpmName, TestParse_RejectsInvalidPyPIName, TestParse_AcceptsScopedNpmName
    • TestShQuote, TestNpmLifecycleScript_QuotesShellMetachars, TestNpmLifecycleScript_ScopedPackage
    • TestDockerWriteFile_StructureNoHeredoc, TestDockerWriteFile_QuotesPath
    • TestRunLocalScan_RejectsUnsafeFilename
    • TestSanitizeDockerArg, TestGetHostUsername_Sanitizes
    • TestPreRunLoadConfig_LenientWarnsOnExclude
    • TestSanitizeControl, TestWriteJSON_StripsAnsiFromAttackerFields
    • TestOpenOutput_File, TestOutputFiles_OwnerOnly
    • TestParseStraceOutput_LongLineFlipsToInconclusive
    • TestParseStraceLine_RenameWithEscapedQuotes
    • TestAnalyze_RenameTrustedBinary (basename-only sub-cases)
  • Integration test (Docker sandbox end-to-end on Linux) — runs in CI; not run locally
  • Documentation updated (CHANGELOG entries included)

Related Issues

RalianENG and others added 10 commits May 8, 2026 16:54
…data

Four defense-in-depth boundaries around inputs derived from
attacker-controlled package metadata. None is currently exploitable —
they're forward-looking guards that make the trust boundaries explicit
so future refactors can't regress silently.

1. npmLifecycleScript now single-quotes the cd target via a new
   shQuote() helper. The previous double-quoted form still expanded
   $(...), backticks, and \, so a package.json key like "foo$(id)"
   could break out of the cd argument and run on the sandbox shell.
   Single quotes are literal — only ' is special and is escaped via
   the standard '\'' idiom.

2. depfile.Parse rejects names outside the registry-allowed character
   sets at parse time (npm: lowercase + optional @scope/, PyPI:
   PEP 508 ASCII). Hard error so a malformed name surfaces clearly
   instead of producing a degenerate shell command or
   KOJUTO_SCAN_PKGS env var. PyPI's ';' is not exercised in tests —
   the parser already strips PEP 508 environment markers before
   validation runs.

3. sitecustomize.py's _w() strips \n/\r/\0 from every emission. The
   Go parser splits on \n, so an attacker-controlled co_filename or
   compile() filename arg containing \n could smuggle a fake KOJUTO:
   line onto the wire. _t() already escapes the snippet field; this
   covers the filename field (concatenated raw from _origin()) plus
   any future tag.

4. The _FRAMES=8 cap on _origin()'s stack walk is removed. A
   malicious package burying its own frame under >8 stdlib wrappers
   would have escaped the "+" user-origin marker, letting the parser
   drop the resulting dynamic_exec event as benign. The walk is now
   unbounded; the common-case fast path (early return on the first
   user frame) is unchanged, so practical cost is the same.

Tests:
- TestParse_RejectsInvalidNpmName, TestParse_AcceptsScopedNpmName
- TestParse_RejectsInvalidPyPIName
- TestShQuote, TestNpmLifecycleScript_QuotesShellMetachars
- TestNpmLifecycleScript_ScopedPackage updated for single-quote form

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The GitHub Action wrapper was missing wiring for three CLI flags
that already work on the binary itself: `--config`, `--quiet`, and
`--no-color`. Workflows that wanted these had to drop down to a
custom shell step.

- `config` — passes through to `--config` so an alternative
  `kojuto.yml` path can be used (CI matrices with per-language
  configs, or a vendored config kept under a non-default path).
- `quiet` — suppresses phase progress lines so CI logs only contain
  the final verdict block. Pairs well with PR-comment workflows
  that post the report as a comment.
- `no-color` — disables ANSI escapes; mirrors the `NO_COLOR`
  environment variable spec, useful for log aggregators that don't
  strip escapes.

`runtime` description is also tightened to drop the now-obsolete
"empty string also resolves to auto" parenthetical (the auto path
is the documented default since #14) and the YAML default is set
to `auto` explicitly so the rendered docs match the CLI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…th count

The two unreleased fixes that shipped in #18 (a5cab0e) were missing
from CHANGELOG.md, and the README's credential-access table still
quoted the older "~50 sensitive paths" figure. This commit catches
the docs up to the merged code; no behavior change.

- CHANGELOG: add Severity-tiered verdict and Caller-aware audit hook
  entries under [Unreleased] / Added.
- README: bump the sensitive-path count from ~50 to ~60 to reflect
  the current entry list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…l scans

The dockerExecRoot heredoc pattern used to write probe scripts and
honeypot credentials had a stray-terminator weakness:

    sh -c "cat > <path> << 'KOJUTO_EOF'\n<body>KOJUTO_EOF"

If <body> contained a line consisting only of `KOJUTO_EOF`, the shell
terminated the heredoc early and parsed the rest of the body as
commands — running as root inside the sandbox via `docker exec
--user=root`. <body> includes the package name interpolated into a
JS/Python `'%s'` literal in WriteProbeScripts*; a name carrying
"\n" + "KOJUTO_EOF" + "\n" + "<cmd>" produced exactly the lines needed
to terminate and execute <cmd>.

Reachability:

- `kojuto scan <pkg>`         gated by downloaderValidate (regex blocks \n)
- `kojuto scan -f <file>`     gated by depfile validators (71d6d0a)
- `kojuto scan --local <path>` was NOT gated — pkg derived from
  filepath.Base(localPath) carried any byte the filesystem allowed,
  so a malware sample distributed with a crafted filename would have
  achieved root in the sandbox during WriteProbeScripts. The most
  damaging payload would `rm /usr/local/lib/python3.12/site-packages/
  sitecustomize.py`, neutralising kojuto's own audit hook before the
  import phase actually ran the malicious code — defeating detection
  with no surface visibility.

Two-layer fix:

1. dockerWriteFile() helper replaces all 9 heredoc sites (5 honeypot
   credentials + 4 probe scripts). Content is delivered via stdin to
   `cat`, so the body never appears on the shell command line. Path
   is single-quoted as defense-in-depth though all current callers
   pass constants.

2. runLocalScan validates the derived pkg with downloaderValidate
   before any sandbox interaction. Newlines and shell metacharacters
   surface as a clear error rather than silently producing a
   degenerate container command.

Tests:

- TestDockerWriteFile_StructureNoHeredoc pins that the new helper
  emits `docker exec -i ... sh -c "cat > '<path>'"` with no `<<` or
  `KOJUTO_EOF` token anywhere in the args.
- TestDockerWriteFile_QuotesPath confirms shQuote is applied to the
  path slot for future callers.
- TestRunLocalScan_RejectsUnsafeFilename creates a `.tgz` whose
  basename contains "\nKOJUTO_EOF\n" and asserts the scan errors
  out before reaching Docker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
getHostUsername read USER/USERNAME/LOGNAME and returned the raw value;
getHostHostname routed os.Hostname through sanitizeDockerArg before
returning. The two are used in symmetric slots —

  s.mountPoint = "/home/" + hostUser + "/projects"
  hostname     = sanitizeDockerArg(hostHostname)

— and mountPoint then flows into a `-v <packageDir>:<mountPoint>:ro`
docker volume spec and a `pip install --no-index --no-deps
--no-build-isolation <mountPoint>/*` shell glob. A USER value
containing `:` would break the volume separator, and one containing
shell metacharacters would derail the glob expansion.

Env vars are normally trusted, so this is not a concrete vulnerability
in any threat model kojuto currently supports — but the inconsistency
between the two helpers was load-bearing only because nobody set USER
to anything weird. Mirroring the hostname sanitizer removes the
implicit trust dependency at zero cost.

Side change: sanitizeDockerArg now takes an explicit fallback string
instead of hardcoding "localhost". The hostname caller passes
"localhost"; the new username caller passes "user". Removes the
hostname-specific bias from a function that is otherwise a generic
charset filter, and lets the test parameterise both fallback paths.

Tests: TestGetHostUsername_Sanitizes pins the new sanitization with
the same metacharacter cases the hostname test already covered, plus
non-ASCII fallback. TestSanitizeDockerArg expanded to exercise
parameterised fallback and the volume-spec / glob metacharacters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`--strict` defaults to false on the CLI but true in the GitHub Action.
Local interactive runs therefore silently honor `sensitive_paths.exclude`
in the cwd's kojuto.yml. An untrusted repo (or extracted tarball) can
plant a kojuto.yml that shrinks the detection surface without the user
noticing — a "clean" verdict from such a run is not actually clean for
the excluded categories.

`--strict` already exists as the right answer for that threat, but the
existing UX only surfaces exclusion activity when --strict is ON
(printing the paths it's overriding). The lenient path was silent.

Now preRunLoadConfig emits a stderr warning whenever exclusions are
honored, listing the count and the actual paths and pointing at
--strict. The honoring behavior is unchanged — this is purely a
visibility fix so an attacker-planted exclude appears in CI logs and
interactive output before the verdict is trusted.

Test: TestPreRunLoadConfig_LenientWarnsOnExclude captures stderr
across a temp config with two excluded paths and asserts the warning
contains the count, both paths, and the --strict pointer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A malicious package could embed ANSI escape sequences (or any C0
control byte) into source compiled via `compile()` or `exec()`. The
audit hook captured the snippet via repr() / str() and emitted it on
the wire; the Go side put it in `code_snippet` and json.Marshal
encoded it as `�...`. Structurally the JSON was fine, but a
common downstream consumer pattern is `jq -r .events[].code_snippet`,
which decodes the JSON string literally — handing a real ESC byte to
the user's terminal. The attacker could then clear the screen and
paint a fake "verdict: clean" line in green, hiding the real
suspicious verdict.

Same vector exists for `comm` (a malicious process can `prctl(PR_SET_
NAME, "evil\x1b[31m")` with control bytes) and any future attacker-
reachable string field.

Two-layer fix:

1. sitecustomize.py's _t() now escapes every C0 control byte and DEL
   (0x00-0x1f, 0x7f) to a printable `\xNN` form, with the conventional
   \n / \r / \t shorthand for the common cases. The wire-side _w()
   already strips line terminators for framing safety; this extends
   the same idea to the snippet field's content for raw-consumer
   safety. Closes the gap at the source.

2. report.WriteJSON now walks every event and runs each string field
   through sanitizeControl before encoding. Reflection picks up every
   string field of types.SyscallEvent so future fields are covered
   automatically. kojuto-generated strings (Syscall, Category, Phase,
   etc.) never contain control bytes, so the pass is a no-op for them.

Tests:

- TestSanitizeControl exercises the helper across plain ASCII, UTF-8
  multibyte, individual control bytes, and a mixed-byte vector.
- TestWriteJSON_StripsAnsiFromAttackerFields plants control bytes in
  CodeSnippet, Comm, Cmdline, and FilePath; round-trips through
  WriteJSON + json.Unmarshal; asserts no decoded field contains a raw
  byte below 0x20 or 0x7f, and that the on-the-wire JSON contains a
  printable backslash-escaped form for the ESC byte.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
gosec G306 flagged three output writers using 0o644 (or os.Create's
default 0o666 then umask-clipped). All three carry data that
shouldn't leak to other accounts on a multi-user host:

- openOutput / -o report.json: events embed attacker-supplied code
  snippets, dst addrs, file paths, and the full dependency tree of
  the scanned package — anything a malicious package touches at
  install/import time. Switched os.Create to os.OpenFile with an
  explicit 0o600 so the report is owner-only by default; users who
  want to share can chmod after the fact.

- writePinnedPyPI / writePinnedNpm: --pin output captures the
  resolved manifest from the last clean scan. Internal/private
  registry deps and version metadata can be load-bearing for
  inferring the user's stack, so default 0o600 too.

The other gosec G306 hits (sandbox/staging package.json files,
seccomp.json, the in-flight tarball copy) all stay at world-readable
intentionally — the docker daemon and the in-container `dev` user
need to read them, and the tests would silently miss every install-
phase event without the wider mode. Those sites are out of scope
here.

Tests: TestOpenOutput_File and a new TestOutputFiles_OwnerOnly
syscall.Umask(0) the test process and stat the resulting files,
asserting Mode().Perm() == 0o600.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Broad re-audit (post-PR-#18 sweep) surfaced three concrete bypasses
where a malicious package could escape kojuto's monitoring without
tripping any current detector. All three are now closed; each ships
with a regression test pinning the security invariant.

1. strace stderr scanner used bufio.Scanner with the default 64 KiB
   line cap and never checked scanner.Err(). The docker-exec stderr
   is a multiplex of strace trace lines AND the tracee's own writes
   to fd 2, so a single `process.stderr.write("A".repeat(70000))` in
   a preinstall script tripped ErrTooLong, the for-loop exited, and
   every subsequent rename / openat / connect / execve recorded by
   strace was silently dropped — yielding a CLEAN verdict on a
   package that did anything observable after the long write.

   Both call sites (container_strace.parseStraceOutput and
   fallback.StartWithPID) now `Buffer()` to a 16 MiB cap and treat
   any post-loop scanner.Err() as `dropped++`. The dropped counter
   already converts the verdict to inconclusive in cmd/root, so the
   bypass is closed without changing the verdict logic.

2. straceRenameRe / straceRenameatRe used `"([^"]+)"` to capture
   filenames, which fails on strace's C-style backslash-escaped
   quotes. A package that renames `/tmp/x"y` (a legal Unix filename
   containing `"`) over `/usr/local/bin/python3` produced an strace
   line `rename("/tmp/x\"y", "/usr/local/bin/python3") = 0` that the
   regex did not match — no event reached the analyzer, the binary
   hijack category never fired, and a subsequent execve of the
   hijacked path was filtered as benign by `isBenignExec`.

   The capture is now `((?:\\.|[^"\\])+)` for both `rename` and
   `renameat[2]` paths.

3. The eBPF vfs_rename probe writes `d_name` (a qstr basename) into
   evt.path / evt.path2, so eBPF mode delivers a rename event with
   `DstPath = "python3"` and no parent directory. `isBenignRename`
   computed `destDir = path.Dir("python3") + "/" = "./"` which never
   matched the allowed dirs (e.g. `/usr/local/bin/`) — the function
   fell through to `return true` and treated every rename as benign.
   Result: in `--probe-method=ebpf`, NO rename was ever flagged,
   regardless of its target.

   `isBenignRename` now fail-safes when DstPath has no leading "/":
   a basename matching a tracked binary stays suspicious. Strace
   mode emits absolute paths and is unaffected.

Tests:

- TestParseStraceOutput_LongLineFlipsToInconclusive feeds a
  straceMaxLine+1024-byte single line via a chunkReader and asserts
  cs.dropped > 0 after parseStraceOutput returns.
- TestParseStraceLine_RenameWithEscapedQuotes exercises rename and
  renameat lines whose src/dst contain `\"`, asserting the event is
  produced with the expected SrcPath/DstPath.
- TestAnalyze_RenameTrustedBinary gains three basename-only sub-cases
  covering python3/sh as suspicious and an unknown name as clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The hardening commits (71d6d0a..64788b2) collected nine linter
violations the local go test pass didn't surface. Local CI parity
run with golangci-lint v2.1.6 — the version pinned in .github/
workflows/ci.yml — turned them up; this commit clears them all.

- gocritic regexpSimplify in depfile.go: drop the unnecessary `\/`
  escape inside a non-character-class regex.
- gofmt in strace_parse_extra_test.go and sandbox.go: trailing
  blank line / shQuote comment alignment.
- govet shadow in cmd/root.go and cmd/root_test.go: rename the
  inner err binding so it doesn't shadow an outer one.
- intrange in report.go (×2) and report_test.go (×1): switch
  bounded for-loops to `for i := range N` (Go 1.22+).
- misspell in report_test.go: serialised → serialized.

Re-running `golangci-lint run` reports `0 issues.`. Cross-compile
across all 5 GOOS/GOARCH matrix entries, `go test -race ./...`,
`go mod verify`, and `govulncheck ./...` all pass locally too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 8, 2026

Codecov Report

❌ Patch coverage is 94.28571% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/probe/fallback.go 0.00% 4 Missing ⚠️
internal/report/report.go 93.75% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

CI on windows-latest failed to compile cmd/root_test.go because
syscall.Umask is undefined on Windows. The umask call exists in
TestOpenOutput_File and TestOutputFiles_OwnerOnly so the asserted
mode is the requested 0o600 rather than the user's umask-clipped
result — Windows has no umask and Mode().Perm() doesn't reflect
Unix-style bits there, so the assertion is meaningless on Windows
even if it built.

Move the perm-check tests to a new cmd/root_unix_test.go behind
`//go:build !windows`. TestOpenOutput_File stays in root_test.go
in its original cross-platform shape (verifies the file is created
and is not os.Stdout); the mode assertion now lives in the new
TestOpenOutput_FileMode in the unix file. TestOutputFiles_OwnerOnly
moves wholesale.

Verified locally:
- GOOS=windows GOARCH=amd64 go test -c ./cmd/ now compiles
- GOOS=linux   GOARCH=amd64 go test -c ./cmd/ still compiles
- go test -race ./cmd/ green at 72.6% coverage
- golangci-lint v2.1.6 reports 0 issues

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@RalianENG RalianENG merged commit bd7571b into main May 8, 2026
12 checks passed
@RalianENG RalianENG deleted the enhance/package-metadata-hardening branch May 8, 2026 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant