fix(sandbox): surface fingerprint-erasure and probe-script failures by RalianENG · Pull Request #22 · RalianENG/kojuto

RalianENG · 2026-05-08T16:47:53Z

Summary

Fix a silent-fail class of bug in the post-start sandbox preparation that could produce false-clean verdicts. When eraseFingerprints, plantHoneypotFiles, restoreLocalBin, or the probe-script writers failed at the docker exec layer, the error was discarded (_ = cmd.Run()). The scan would continue against a partially-prepared sandbox — /.dockerenv still present, honeypot files missing, or an import phase running without its OS-spoofing wrapper — and any sandbox-aware payload that simply detected the gap and stayed dormant would surface as clean. The detection contract ("clean means no malicious behavior was observed") was therefore not enforced.

Changes

dockerExecRoot / dockerWriteFile now return errors, wrapped with the command (or path) for diagnosis. Honeypot content is omitted from the error to avoid spilling generated credentials into logs.
restoreLocalBin, eraseFingerprints, plantHoneypotFiles, WriteProbeScripts, and WriteProbeScriptsMulti now fail-fast on the first underlying error.
Start / StartPaused route through a shared prepareSandboxState helper that propagates any setup failure.
The three call sites in cmd/root.go (batch screening, eBPF probe, container-strace probe) abort the scan on prep failure rather than continuing.
Fail-loud regression tests added (TestEraseFingerprints_FailLoud, TestPlantHoneypotFiles_FailLoud, TestRestoreLocalBin_FailLoud, TestWriteProbeScripts_FailLoud, TestWriteProbeScriptsMulti_FailLoud) — they pin the contract that a failing docker command must surface as an error. Existing happy-path tests updated to assert nil.
CHANGELOG entry added under [Unreleased] / Fixed.

Test Plan

Unit tests pass (go test ./... -short -race -count=1)
go vet ./... clean
gofmt -l . clean
New *_FailLoud regression tests cover every affected function
Manual end-to-end: scan a known-clean package and a known-malicious package; verify happy-path behavior is unchanged and that a deliberately-broken docker exec (e.g. by revoking the container ID mid-run) now produces an error instead of clean
Documentation updated: CHANGELOG [Unreleased] / Fixed entry added

Related Issues

N/A — surfaced during a v1.0 readiness review; not tracked in an existing issue.

eraseFingerprints, plantHoneypotFiles, restoreLocalBin, WriteProbeScripts, and WriteProbeScriptsMulti previously discarded the result of every docker-exec call (`_ = cmd.Run()`). A failure here — e.g. /.dockerenv removal rejected by a corrupted runtime, or a probe-script write that silently dropped — would let the scan continue against a partially prepared sandbox: /.dockerenv intact, honeypot files missing, or an import phase that runs without its OS-spoofing wrapper. Sandbox-aware malware that simply detected the gap and stayed dormant would surface as a "clean" verdict — breaking the detection contract that "clean means no malicious behavior was observed." This change makes the entire post-start setup fail loud: - dockerExecRoot / dockerWriteFile return errors wrapped with the command (or path) for diagnosis. Honeypot content is omitted from the error to avoid spilling generated credentials into logs. - The three setup helpers and both probe-script writers fail-fast on the first underlying error. - StartPaused and Start route through a shared prepareSandboxState helper that propagates any setup failure. Callers in cmd/root.go abort the scan rather than reporting clean. - TestEraseFingerprints_FailLoud / TestPlantHoneypotFiles_FailLoud / TestRestoreLocalBin_FailLoud / TestWriteProbeScripts*_FailLoud pin the contract: a failing docker command must surface as an error. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-05-08T16:52:01Z

Codecov Report

❌ Patch coverage is 73.52941% with 27 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/sandbox/sandbox.go	78.94%	10 Missing and 10 partials ⚠️
cmd/root.go	0.00%	7 Missing ⚠️

📢 Thoughts on this report? Let us know!

Real-world scan of `six` failed with the previous commit's fail-loud change: Error: starting sandbox: erasing sandbox fingerprints: \ docker exec rm -f /.dockerenv: exit status 1 Root cause: the rootfs is mounted --read-only, so `rm -f /.dockerenv` inside the container has never succeeded. The original silent-fail hid the bug; surfacing the error exposed it. Anti-fingerprinting was effectively never applied for /.dockerenv since --read-only was introduced. Fix: bind-mount an empty regular file from the host over /.dockerenv at container creation time. The mask file lives in the per-scan seccomp temp dir (already cleaned up by Cleanup), is created by writeSeccompProfile, and is wired in via `--mount=type=bind,src=...,dst=/.dockerenv,readonly`. Sandbox-aware payloads that read /.dockerenv now see empty content. A regular empty file is used (not /dev/null) so stat().S_ISREG() still returns true — a char device would itself be a fingerprint. eraseFingerprints had only this one operation, so the function and its two tests are removed; prepareSandboxState now calls only restoreLocalBin and plantHoneypotFiles. SECURITY.md is corrected to describe the actual mechanism. TestContainerArgs gains an assertion that the bind-mount flag and host-side mask file are present. Path-existence checks (`os.path.exists("/.dockerenv")`) still see something at the path; --runtime=runsc remains the recommendation for defeating those. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- gocritic appendCombine: combine the seccomp and dockerenv-mask appends into a single args = append(...) call (introduced when the bind-mount line was added in the previous commit). - goconst: extract "test-container" into testContainerID — the new *_FailLoud tests pushed the literal's count from 3 to 7. No behaviour change. Net lint count drops by one vs. the pre-series baseline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

RalianENG and others added 2 commits May 9, 2026 02:06

RalianENG merged commit e98021b into main May 9, 2026
11 of 12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(sandbox): surface fingerprint-erasure and probe-script failures#22

fix(sandbox): surface fingerprint-erasure and probe-script failures#22
RalianENG merged 3 commits into
mainfrom
fix/sandbox-fail-loud

RalianENG commented May 8, 2026

Uh oh!

codecov Bot commented May 8, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

RalianENG commented May 8, 2026

Summary

Changes

Test Plan

Related Issues

Uh oh!

codecov Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov Bot commented May 8, 2026 •

edited

Loading