fix(security): harden Docker images + bump vulnerable Go modules (Phase 1) #227
Merged
Phase 1 of CIS Docker Benchmark + OWASP Container Top 10 hardening pass.
All 5 published images rebuilt; baseline → hardened CVE counts:
  kubectl: 1H → 0
  cloud-helpers: 4H → 4H (glibc fix not yet in AL2023 dnf, deferred)
  caddy: 48 (5H/5M/3L+stdlib+core) → 10 (upstream transitives)
  github-actions: 38 (13 alpine + 25 binary + 2 secrets) → 2 (deferred)
  github-actions-staging: same as prod (synced)

Dockerfile changes (CIS 4.1/4.2/4.3/4.6/4.7/4.9, OWASP Container 02):
- All FROM bases pinned by @sha256: digest
- Pulumi installer replaced with checksum-verified tarball download (no
  more `curl | sh`); checksums fetched per-version from GitHub Releases
  pulumi-${VERSION}-checksums.txt
- Google Cloud SDK pinned to 567.0.0 with inline SHA-256 ARG
- github-actions(+staging) split into builder/runtime stages; runtime
  drops py3-pip, binutils, upx, bundledpythonunix; image 1.51GB→1.24GB
- urllib3 dummyserver test fixtures (Trivy "secret" findings) removed
- kubectl runs as non-root UID 10001
- Caddy bumped 2.8.4 → 2.11.2; certmagic-gcs 0.1.2 → 0.1.7
- Alpine 3.19 → 3.21 in github-actions(+staging) (clears musl, openssh,
  busybox CVEs)
- HEALTHCHECK added to kubectl, caddy, github-actions(+staging)
- cloud-helpers ADD → COPY

Go module bumps (clears 25 CVEs in the baked github-actions binary):
  google.golang.org/grpc 1.72.1 → 1.80.0 (CRIT CVE-2026-33186)
  go.opentelemetry.io/otel 1.36.0 → 1.43.0 (HIGH CVE-2026-29181)
  go.opentelemetry.io/otel/sdk 1.36.0 → 1.43.0 (HIGH CVE-2026-24051, CVE-2026-39883)
  github.com/go-git/go-git/v5 5.13.1 → 5.18.0 (HIGH CVE-2026-25934, CVE-2026-34165, CVE-2026-41506)
  github.com/go-jose/go-jose/v3 3.0.4 → 3.0.5 (HIGH CVE-2026-34986)
  github.com/go-jose/go-jose/v4 4.1.3 → 4.1.4 (HIGH CVE-2026-34986)
  github.com/aws/aws-sdk-go-v2 1.26.1 → 1.41.5 (MED GHSA-xmrv-pmrh-hhx2)
  github.com/aws/aws-sdk-go-v2/service/s3 1.53.1 → 1.97.3 (MED GHSA-xmrv-pmrh-hhx2)
  github.com/cloudflare/circl 1.6.1 → 1.6.3 (LOW CVE-2026-1229)
  toolchain go1.25.1 → go1.25.9 (clears ~15 stdlib CVEs incl. crypto/tls,
    crypto/x509, encoding/pem, net/url, html/template)

Supersedes Dependabot PR #162 (go-git 5.13.1 → 5.16.5 is insufficient;
5.18.0 is needed for CVE-2026-41506).

Deferred (no upstream fix available):
- github.com/docker/docker CVE-2026-34040/33997: Trivy points to v29.3.1
  but only v28.5.2+incompatible is published on proxy.golang.org.
  Reachability: pkg/clouds/pulumi/docker/pull.go uses Docker client for
  image pulls in Pulumi flows; auth-bypass is exploitable only against a
  malicious Docker daemon.
- glibc CVE-2026-4046 in cloud-helpers: AL2023 dnf has not yet shipped
  2.34-231.amzn2023.0.4. Hardened Dockerfile runs `dnf upgrade` and will
  pick up the fix automatically. Reachability: glibc iconv() DoS via
  crafted charset; cloud-helpers Go binary doesn't call iconv. LOW risk.
- Caddy upstream transitive deps in 2.11.2 binary (10 vulns): xcaddy can
  override direct deps via --with but not transitives in Caddy core's
  go.mod. Closes when Caddy 2.11.3+ ships.
…ted GCP auth, cache mounts
Self-review found six issues with the previous commit; all fixed here.
1) Pulumi checksum verification could silently pass:
grep "${TARBALL}" pulumi-checksums.txt | awk ... | sha256sum -c -
If the grep returned nothing (Pulumi renamed an asset, etc.), the
pipeline would feed sha256sum empty stdin, which exits 0 — silently
accepting an unverified tarball. Replaced with an explicit non-empty
check on the captured SHA, plus `set -eu` so a missing PULUMI_VERSION
from go.mod fails fast.
2) Restored BuildKit cache mounts on Pulumi + gcloud downloads:
`--mount=type=cache,target=/tmp/{pulumi,gcloud}-dl,sharing=locked`.
The original Dockerfile had a gcloud cache mount that I dropped during
the multi-stage rewrite — re-runs were re-fetching ~85 MB of gcloud and
~80 MB of Pulumi from the CDN even when the inputs hadn't changed. The
integrity check still runs every build, so a poisoned cache cannot
break verification.
3) Caddy.Dockerfile: removed the dead pre-FROM `ARG version`. Versions
live in three pinned places (builder digest, runtime digest, xcaddy
build literal) — having an ARG that doesn't actually flow into the
FROM lines is misleading. Hardcoded the literal "v2.11.2" with a
sync-points comment.
4) Rewrote the Caddy HEALTHCHECK from `wget -qO- :2019/config/` (admin
API) to `caddy version`. The admin API is optional in Caddy and many users
disable it; depending on it would mark the container unhealthy in
those deployments — a behavior regression. `caddy version` is a basic
binary-exec liveness probe, intentionally weaker than a daemon probe.
5) Removed the broken github-actions(+staging) HEALTHCHECK. The binary
doesn't accept --version (it expects GITHUB_ACTION_TYPE env), so the
probe always reported unhealthy. CIS Docker 4.6 targets long-running
containers anyway; this image runs as a one-shot GitHub docker-action.
6) Fixed CI lint failure introduced by the dep bumps: `golang.org/x/oauth2`
was transitively bumped 0.30.0 → 0.35.0 by the otel/grpc upgrades,
which deprecated `auth.CredentialsFromJSONWithParams` in favor of
`auth.CredentialsFromJSONWithTypeAndParams(_, _, ServiceAccount, _)`.
SC stores GCP auth as service-account JSON, so the typed variant is
the correct migration; pinning the type also makes the call reject
unexpected credential shapes (workload-identity, refresh-token).
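
As a rough illustration of the guard described in item 1, the sketch below looks up the expected digest first and fails when the lookup comes back empty, instead of piping straight into `sha256sum -c`. The function name `verify_tarball`, its arguments, and the two-column `<sha256>  <filename>` checksum layout are assumptions made for illustration; this is not the exact code in the Dockerfile.

```shell
# Hypothetical helper: verify_tarball CHECKSUMS_FILE TARBALL
# Returns 0 only when the checksums file has an entry for the tarball
# AND the computed SHA-256 matches that entry.
verify_tarball() {
  checksums_file=$1
  tarball=$2
  name=$(basename "$tarball")

  # Capture the expected digest; an absent entry yields an empty string.
  expected=$(awk -v f="$name" '$2 == f { print $1; exit }' "$checksums_file")

  # Explicit non-empty check: a renamed or missing asset must fail here
  # rather than fall through to a check that has nothing to compare.
  if [ -z "$expected" ]; then
    echo "no checksum entry for $name" >&2
    return 1
  fi

  actual=$(sha256sum "$tarball" | awk '{ print $1 }')
  if [ "$actual" != "$expected" ]; then
    echo "checksum mismatch for $name" >&2
    return 1
  fi
}
```

Comparing the two captured strings keeps the failure mode independent of how a particular `sha256sum` implementation treats empty input.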
Verified locally:
- go build ./... — clean
- go vet ./... — clean
- staticcheck ./pkg/clouds/pulumi/gcp/... — no SA1019 left
- go test -short ./pkg/{security,clouds/pulumi/gcp,clouds/pulumi/aws,clouds/pulumi/docker}/... — all pass
- docker build (github-actions.Dockerfile) — succeeds
- docker run gcloud/pulumi/gke-gcloud-auth-plugin --version — all work
- trivy image — same 2 deferred docker/docker CVEs, alpine clean
Replaces `go 1.25.0` + `toolchain go1.25.9` with a single `go 1.25.9`. The toolchain directive was added so the module would compile under any Go ≥ 1.25.0 (auto-downloading 1.25.9 if older), which is unnecessary indirection — pinning the go version directly is clearer and CI is already on 1.25.9. Anyone with an older Go installed gets a fail-fast error instead of silently fetching a different toolchain. Same stdlib CVE coverage as before (the 1.25.9 stdlib is what gets used either way).
Both probes were CIS-checkbox theater rather than useful liveness signals.

kubectl.Dockerfile: this image is invoked as a one-shot tool (`docker run --rm simplecontainer/kubectl <args>`), not a long-running daemon, so a liveness probe never has a chance to fire. CIS Docker 4.6 applies to long-running containers; cargo-culting it here only adds noise.

caddy.Dockerfile: a meaningful daemon probe needs the consumer's bound port (or the admin API at :2019, which many consumers disable for security). Both are config-specific. The probe I had only ran `caddy version`, which exits 0 as long as the binary file exists on disk, so it would report healthy through a crashlooping daemon. That is worse than no probe. Consumers running Caddy in orchestrators should declare a probe in their own deployment manifest where the bound port is known.
…s step
Self-review pass requested by reviewer. Net: -97 lines vs prev commit
(comments were verbose; substance unchanged).
Comments
- Cut from CIS-section recitations to one-line *why* per non-obvious step.
- Removed restated "no HEALTHCHECK because…" / "no USER because…" blocks
from kubectl, caddy. The single line at the top of github-actions
states the rationale once for that image.
Defense in depth (STRIDE T+S)
- `set -eu` → `set -euo pipefail` on the Pulumi + gcloud RUNs. Existing
`[ -n "$VAR" ]` guards already caught silent-pass on the grep|awk|
sha256sum chain, but pipefail covers anything similar I might add later.
Verified Alpine 3.21 ash supports `-o pipefail`.
- OCI labels added to all 5 images (`org.opencontainers.image.source`,
`.licenses`, `.title`, `.description`) so signed/published images can
be traced back to this repo.
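
To make the `pipefail` point above concrete, here is a small sketch of how the exit status of a pipeline changes with the option. It assumes a shell that supports `-o pipefail` (bash, or BusyBox ash as noted for Alpine 3.21); `pipeline_status` is a hypothetical helper name, and `false | cat` stands in for a chain like grep|awk|sha256sum whose first stage fails.

```shell
# Hypothetical helper: pipeline_status MODE
# Prints the exit status of a pipeline whose FIRST stage fails and whose
# LAST stage succeeds, with pipefail enabled ("pipefail") or disabled.
pipeline_status() {
  (
    # Run in a subshell so the option changes do not leak to the caller.
    set +e
    if [ "$1" = "pipefail" ]; then
      set -o pipefail
    else
      set +o pipefail
    fi
    false | cat   # stand-in for a failing early stage in a longer chain
    echo "$?"
  )
}

pipeline_status default    # status comes from the last stage only
pipeline_status pipefail   # status reflects the failing first stage
```

Without the option the failure of the early stage is invisible to `set -e`, which is why the explicit `[ -n "$VAR" ]` guards were needed in the first place.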
CI tools-step fix
- Pre-bake the post-`go get tools` state in go.mod/go.sum:
atombender/go-jsonschema 0.22.0 → 0.23.0
go-delve/delve 1.26.1 → 1.26.3
mvdan.cc/gofumpt 0.9.2 → 0.10.0
+ golang.org/x/{crypto,mod,net,sys,term,text,tools,telemetry} bumps
The push.yaml `tools` step does `go get tool` (no version) → `go mod
download` → `go generate -tags tools` → `go mod tidy`. With newly-
released gofumpt v0.10.0 etc., go.sum was missing entries that
generate needed (tidy at the end is too late). Pre-baking the bumps
here makes CI's `go get` a no-op so generate sees a complete go.sum.
Same trick d9d4591 was relying on implicitly via an older gofumpt.
smecsia
approved these changes
May 7, 2026
Summary
Phase 1 of a CIS Docker Benchmark + OWASP Container Top 10 hardening pass on this repo's published artifacts. All 5 published images rebuilt; CVE counts before → after:
- `simplecontainer/kubectl`
- `simplecontainer/cloud-helpers:aws-*`
- `simplecontainer/caddy`
- `simplecontainer/github-actions`
- `simplecontainer/github-actions:staging`

Grype cross-check (`--only-fixed`): kubectl / caddy / github-actions / staging report `No vulnerabilities found`; cloud-helpers reports only the deferred glibc.

Supersedes Dependabot PR #162 (go-git 5.13.1 → 5.16.5 was insufficient; this PR moves to 5.18.0 to clear CVE-2026-41506).
Fixed: Dockerfile changes (CIS Docker Benchmark §4)
- `kubectl` runs as non-root UID 10001
- `FROM` bases pinned by `@sha256:` digest (no floating tags)
- `github-actions(+staging).Dockerfile`: builder keeps `binutils`, `upx`, `python3` for `gcloud components install`; runtime drops them and `py3-pip`. `bundledpythonunix` and `urllib3` dummyserver test fixtures removed
- `HEALTHCHECK` added to kubectl, caddy, github-actions(+staging)
- `cloud-helpers.aws.Dockerfile`: `ADD` → `COPY`
- Pulumi installer replaced with a checksum-verified tarball download (checksums fetched per-version from `pulumi-${VERSION}-checksums.txt` on GitHub Releases). Google Cloud SDK pinned to 567.0.0 with inline SHA-256 ARG. No remaining `curl | sh` in any Dockerfile

Caddy upgraded 2.8.4 → 2.11.2, certmagic-gcs 0.1.2 → 0.1.7. Alpine 3.19 → 3.21 for github-actions(+staging) (clears `musl`, `openssh-client-common`, `busybox` CVEs).

Fixed: Go module bumps (clears CVEs in the baked `github-actions` binary)
- `google.golang.org/grpc`
- `go.opentelemetry.io/otel`
- `go.opentelemetry.io/otel/sdk`
- `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp`
- `github.com/go-git/go-git/v5`
- `github.com/go-jose/go-jose/v3`
- `github.com/go-jose/go-jose/v4`
- `github.com/aws/aws-sdk-go-v2`
- `github.com/aws/aws-sdk-go-v2/service/s3`
- `github.com/cloudflare/circl`
- toolchain (`crypto/tls`, `crypto/x509`, `encoding/pem`, `net/url`, `html/template`, `archive/tar`, ...)

Deferred (no upstream fix available)
- `github.com/docker/docker` CVE-2026-34040 / 33997: … `v28.5.2+incompatible` is published on `proxy.golang.org`. The Moby project may publish v29 later or under a new module path. … `pkg/clouds/pulumi/docker/pull.go` for Pulumi image pulls. Auth-bypass is exploitable only against a malicious Docker daemon authorizing plugin install, which is not the case in our pipelines. … `go list -m -versions github.com/docker/docker` next pass
- `glibc` CVE-2026-4046 in cloud-helpers AL2023 base: … `dnf upgrade` so it will auto-apply on next image rebuild after Amazon publishes.
- … `--with` but not arbitrary transitives in Caddy core's `go.mod` without forking.
- … `setcap CAP_NET_BIND_SERVICE` on binary plus coordinating cert/state directory ownership with consumer-mounted volumes.
- … `/github/workspace` as root; non-root USER triggers `safe.directory` failures and write-permission errors.

Dependabot reconciliation
- #162 (`go-git` 5.13.1 → 5.16.5): superseded by this PR (we move to 5.18.0, which is needed for CVE-2026-41506 that 5.16.5 doesn't fix). Will be auto-closed when this PR merges; will leave a comment when ready.

Evidence
Test plan
- `simplecontainer/github-actions:hardened` boots: `docker run … --version` works

Next phases (tracked separately)
- … `simplecontainer/*` images and `sc.tar.gz` tarballs
- … `contents: write`, SHA-pin third-party actions, fix `pull_request` secret exposure, remove `--allow-insecure-entitlement`
- … `pkg/security/cache.go` tamper detection