fix(security): harden Docker images + bump vulnerable Go modules (Phase 1) #227

Merged: Cre-eD merged 5 commits into main from fix/sca-dockerfile-hardening on May 7, 2026

Cre-eD commented May 6, 2026

Summary

Phase 1 of a CIS Docker Benchmark + OWASP Container Top 10 hardening pass on this repo's published artifacts. All 5 published images rebuilt; CVE counts before → after:

| Image | Before | After | Notes |
| --- | --- | --- | --- |
| simplecontainer/kubectl | 1 HIGH | 0 | clean |
| simplecontainer/cloud-helpers:aws-* | 4 HIGH | 4 HIGH (deferred) | glibc fix not yet in AL2023 dnf; auto-applies on next rebuild |
| simplecontainer/caddy | 48 (5H/5M/3L + Caddy core + Go stdlib) | 10 (upstream transitives) | Caddy 2.8.4 → 2.11.2 |
| simplecontainer/github-actions | 38 (13 alpine + 25 binary + 2 secrets) | 2 (deferred) | image 1.51GB → 1.24GB |
| simplecontainer/github-actions:staging | same as prod | same as prod | synced |

Grype cross-check (--only-fixed): kubectl / caddy / github-actions / staging — No vulnerabilities found. cloud-helpers — only the deferred glibc.

Supersedes Dependabot PR #162 (go-git 5.13.1 → 5.16.5 was insufficient; this PR moves to 5.18.0 to clear CVE-2026-41506).

Fixed — Dockerfile changes (CIS Docker Benchmark §4)

| CIS | What changed |
| --- | --- |
| 4.1 | kubectl runs as non-root UID 10001 |
| 4.2 / 4.7 | All FROM bases pinned by @sha256: digest (no floating tags) |
| 4.3 | Multi-stage rewrite of github-actions(+staging).Dockerfile: builder keeps binutils, upx, python3 for gcloud components install; runtime drops them and py3-pip. bundledpythonunix and urllib3 dummyserver test fixtures removed |
| 4.6 | HEALTHCHECK added to kubectl, caddy, github-actions(+staging) |
| 4.9 | cloud-helpers.aws.Dockerfile: ADD → COPY |
| SSCS §5 | Pulumi installer replaced with verified tarball download (per-version pulumi-${VERSION}-checksums.txt from GitHub Releases). Google Cloud SDK pinned to 567.0.0 with inline SHA-256 ARG. No remaining curl \| sh in any Dockerfile |
| OWASP Container 02 | Every third-party download verified before use |
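
The digest-pinning row (CIS 4.2 / 4.7) lends itself to a small lint. The sketch below is not part of this PR and the function name `unpinned_from_lines` is hypothetical. It assumes plain `FROM [--flag] <image> [AS <stage>]` lines, treats `scratch` and earlier stage aliases as exempt, and does not resolve `ARG`-substituted base images.

```shell
#!/bin/sh
# Sketch: print every FROM line whose base image is not pinned by digest.
# Hypothetical helper, not taken from the repository.
unpinned_from_lines() {
  awk '
    # Stage aliases are per-file, so forget them when a new file starts.
    FNR == 1 { split("", stages) }
    toupper($1) == "FROM" {
      # Skip optional flags such as --platform=... to find the image field.
      i = 2
      while (i <= NF && $i ~ /^--/) i++
      image = $i
      if (image !~ /@sha256:[0-9a-f]+$/ && image != "scratch" && !(image in stages))
        print FILENAME ":" FNR ": " $0
      # Remember "AS <alias>" so a later "FROM <alias>" is accepted.
      for (j = i + 1; j < NF; j++)
        if (toupper($j) == "AS") stages[$(j + 1)] = 1
    }
  ' "$@"
}
```

A CI step could fail the build when the function prints anything, for example `[ -z "$(unpinned_from_lines ./*.Dockerfile)" ]`.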

Caddy upgraded 2.8.4 → 2.11.2, certmagic-gcs 0.1.2 → 0.1.7. Alpine 3.19 → 3.21 for github-actions(+staging) (clears musl, openssh-client-common, busybox CVEs).

Fixed — Go module bumps (clears CVEs in the baked github-actions binary)

| Module | Before → After | Severity |
| --- | --- | --- |
| google.golang.org/grpc | 1.72.1 → 1.80.0 | CRITICAL (CVE-2026-33186) |
| go.opentelemetry.io/otel | 1.36.0 → 1.43.0 | HIGH (CVE-2026-29181) |
| go.opentelemetry.io/otel/sdk | 1.36.0 → 1.43.0 | HIGH (CVE-2026-24051, CVE-2026-39883) |
| go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp | 1.36.0 → 1.43.0 | MEDIUM (CVE-2026-39882) |
| github.com/go-git/go-git/v5 | 5.13.1 → 5.18.0 | HIGH (CVE-2026-25934, 34165, 41506) + LOW (CVE-2026-33762) |
| github.com/go-jose/go-jose/v3 | 3.0.4 → 3.0.5 | HIGH (CVE-2026-34986) |
| github.com/go-jose/go-jose/v4 | 4.1.3 → 4.1.4 | HIGH (CVE-2026-34986) |
| github.com/aws/aws-sdk-go-v2 | 1.26.1 → 1.41.5 | MEDIUM (GHSA-xmrv-pmrh-hhx2) |
| github.com/aws/aws-sdk-go-v2/service/s3 | 1.53.1 → 1.97.3 | MEDIUM (GHSA-xmrv-pmrh-hhx2) |
| github.com/cloudflare/circl | 1.6.1 → 1.6.3 | LOW (CVE-2026-1229) |
| toolchain | go1.25.1 → go1.25.9 | clears ~15 Go stdlib CVEs (crypto/tls, crypto/x509, encoding/pem, net/url, html/template, archive/tar, ...) |

Deferred (no upstream fix available)

| Finding | Severity | Why | Reachability | When to revisit |
| --- | --- | --- | --- | --- |
| github.com/docker/docker CVE-2026-34040 / 33997 | HIGH / MEDIUM | Trivy points to v29.3.1 but only v28.5.2+incompatible is published on proxy.golang.org. The Moby project may publish v29 later or under a new module path. | Used by pkg/clouds/pulumi/docker/pull.go for Pulumi image pulls. Auth-bypass is exploitable only against a malicious Docker daemon authorizing plugin install, which is not the case in our pipelines. | Re-check go list -m -versions github.com/docker/docker next pass |
| glibc CVE-2026-4046 in cloud-helpers AL2023 base | HIGH | AL2023 dnf has not yet shipped 2.34-231.amzn2023.0.4. Hardened Dockerfile already runs dnf upgrade so it will auto-apply on next image rebuild after Amazon publishes. | iconv() DoS via attacker-controlled charset; cloud-helpers Go Lambda runner doesn't call iconv. LOW risk in this image. | Next image rebuild after Amazon publishes (typically days/weeks) |
| Caddy 2.11.2 transitive deps (10 vulns) | 2C / 4H / 3M / 1L | xcaddy can override direct deps via --with but not arbitrary transitives in Caddy core's go.mod without forking. | Property of the upstream Caddy core build. | Caddy 2.11.3+ release |
| Caddy non-root USER | n/a | Requires setcap CAP_NET_BIND_SERVICE on binary plus coordinating cert/state directory ownership with consumer-mounted volumes. | n/a | Phase 2+ |
| github-actions non-root USER | n/a | GitHub docker-action runners mount /github/workspace as root; non-root USER triggers safe.directory failures and write-permission errors. | n/a | Track upstream GitHub guidance |

Dependabot reconciliation

Evidence

# Trivy summary — kubectl
Before: 1 (LOW: 0, MEDIUM: 0, HIGH: 1, CRITICAL: 0)
After:  0

# Trivy summary — github-actions
Before: 13 alpine + 25 binary + 2 secrets (incl. 2 CRIT, 10 HIGH in binary)
After:  0 alpine + 2 binary + 0 secrets (only deferred docker/docker)

# Grype --only-fixed cross-check
sca-test/kubectl:hardened                 No vulnerabilities found
sca-test/cloud-helpers:hardened           glibc … (deferred)
sca-test/caddy:hardened                   No vulnerabilities found
sca-test/github-actions:hardened          No vulnerabilities found
sca-test/github-actions:staging-hardened  No vulnerabilities found

# Image size
simplecontainer/github-actions:latest    1.51 GB
sca-test/github-actions:hardened         1.24 GB  (-280 MB)

Test plan

  • CI builds all 5 images successfully on this branch (push.yaml docker-build matrix)
  • simplecontainer/github-actions:hardened boots — docker run … --version works
  • Pulumi flows still execute against new gcloud 567.0.0 + Pulumi 3.184.0 (auto-extracted from go.mod)
  • Smoke test in push.yaml builds — schema-gen, golangci-lint, go test all pass against bumped go.mod
  • Branch preview run validates new images end-to-end before merge
  • After merge, monitor Dependabot alerts auto-close on default branch

Next phases (tracked separately)

  1. Self-attest own artifacts — sign + scan + SBOM + SLSA provenance for simplecontainer/* images and sc.tar.gz tarballs
  2. Workflow least-privilege & pinning — drop root contents: write, SHA-pin third-party actions, fix pull_request secret exposure, remove --allow-insecure-entitlement
  3. Repo controls — CODEOWNERS, SECURITY.md, expanded Dependabot, CodeQL, gosec, branch rulesets
  4. Code-level fixes — HMAC for pkg/security/cache.go tamper detection

Phase 1 of CIS Docker Benchmark + OWASP Container Top 10 hardening pass.
All 5 published images rebuilt; baseline → hardened CVE counts:

  kubectl:           1H → 0
  cloud-helpers:     4H → 4H (glibc fix not yet in AL2023 dnf, deferred)
  caddy:             48 (5H/5M/3L+stdlib+core) → 10 (upstream transitives)
  github-actions:    38 (13 alpine + 25 binary + 2 secrets) → 2 (deferred)
  github-actions-staging: same as prod (synced)

Dockerfile changes (CIS 4.1/4.2/4.3/4.6/4.7/4.9, OWASP Container 02):
  - All FROM bases pinned by @sha256: digest
  - Pulumi installer replaced with checksum-verified tarball download
    (no more `curl | sh`); checksums fetched per-version from GitHub
    Releases pulumi-${VERSION}-checksums.txt
  - Google Cloud SDK pinned to 567.0.0 with inline SHA-256 ARG
  - github-actions(+staging) split into builder/runtime stages; runtime
    drops py3-pip, binutils, upx, bundledpythonunix; image 1.51GB→1.24GB
  - urllib3 dummyserver test fixtures (Trivy "secret" findings) removed
  - kubectl runs as non-root UID 10001
  - Caddy bumped 2.8.4 → 2.11.2; certmagic-gcs 0.1.2 → 0.1.7
  - Alpine 3.19 → 3.21 in github-actions(+staging) (clears musl, openssh,
    busybox CVEs)
  - HEALTHCHECK added to kubectl, caddy, github-actions(+staging)
  - cloud-helpers ADD → COPY

Go module bumps (clears 25 CVEs in the baked github-actions binary):
  google.golang.org/grpc          1.72.1 → 1.80.0   (CRIT CVE-2026-33186)
  go.opentelemetry.io/otel        1.36.0 → 1.43.0   (HIGH CVE-2026-29181)
  go.opentelemetry.io/otel/sdk    1.36.0 → 1.43.0   (HIGH CVE-2026-24051,
                                                          CVE-2026-39883)
  github.com/go-git/go-git/v5     5.13.1 → 5.18.0   (HIGH CVE-2026-25934,
                                                          CVE-2026-34165,
                                                          CVE-2026-41506)
  github.com/go-jose/go-jose/v3   3.0.4  → 3.0.5    (HIGH CVE-2026-34986)
  github.com/go-jose/go-jose/v4   4.1.3  → 4.1.4    (HIGH CVE-2026-34986)
  github.com/aws/aws-sdk-go-v2    1.26.1 → 1.41.5   (MED  GHSA-xmrv-pmrh-hhx2)
  github.com/aws/aws-sdk-go-v2/service/s3
                                  1.53.1 → 1.97.3   (MED  GHSA-xmrv-pmrh-hhx2)
  github.com/cloudflare/circl     1.6.1  → 1.6.3    (LOW  CVE-2026-1229)
  toolchain                       go1.25.1 → go1.25.9 (clears ~15 stdlib
                                                       CVEs incl. crypto/tls,
                                                       crypto/x509,
                                                       encoding/pem,
                                                       net/url, html/template)

Supersedes Dependabot PR #162 (go-git 5.13.1 → 5.16.5 — insufficient,
needed 5.18.0 for CVE-2026-41506).

Deferred (no upstream fix available):
  - github.com/docker/docker CVE-2026-34040/33997: Trivy points to v29.3.1
    but only v28.5.2+incompatible is published on proxy.golang.org.
    Reachability: pkg/clouds/pulumi/docker/pull.go uses Docker client for
    image pulls in Pulumi flows; auth-bypass is exploitable only against a
    malicious Docker daemon.
  - glibc CVE-2026-4046 in cloud-helpers: AL2023 dnf has not yet shipped
    2.34-231.amzn2023.0.4. Hardened Dockerfile runs `dnf upgrade` and will
    pick up the fix automatically. Reachability: glibc iconv() DoS via
    crafted charset; cloud-helpers Go binary doesn't call iconv. LOW risk.
  - Caddy upstream transitive deps in 2.11.2 binary (10 vulns): xcaddy
    can override direct deps via --with but not transitives in Caddy
    core's go.mod. Closes when Caddy 2.11.3+ ships.
Cre-eD force-pushed the fix/sca-dockerfile-hardening branch from a743ea1 to f450a6d on May 6, 2026 13:49
Cre-eD added 4 commits May 6, 2026 18:31
…ted GCP auth, cache mounts

Self-review found six issues with the previous commit; all fixed here.

1) Pulumi checksum verification could silently pass:
       grep "${TARBALL}" pulumi-checksums.txt | awk ... | sha256sum -c -
   If the grep returned nothing (Pulumi renamed an asset, etc.), the
   pipeline would feed sha256sum empty stdin, which exits 0 — silently
   accepting an unverified tarball. Replaced with an explicit non-empty
   check on the captured SHA, plus `set -eu` so a missing PULUMI_VERSION
   from go.mod fails fast.

2) Restored BuildKit cache mounts on Pulumi + gcloud downloads:
   `--mount=type=cache,target=/tmp/{pulumi,gcloud}-dl,sharing=locked`.
   The original Dockerfile had a gcloud cache mount that I dropped during
   the multi-stage rewrite — re-runs were re-fetching ~85 MB of gcloud and
   ~80 MB of Pulumi from the CDN even when the inputs hadn't changed. The
   integrity check still runs every build, so a poisoned cache cannot
   break verification.

3) Caddy.Dockerfile: removed the dead pre-FROM `ARG version`. Versions
   live in three pinned places (builder digest, runtime digest, xcaddy
   build literal) — having an ARG that doesn't actually flow into the
   FROM lines is misleading. Hardcoded the literal "v2.11.2" with a
   sync-points comment.

4) Changed the Caddy HEALTHCHECK from `wget -qO- :2019/config/` (admin API)
   to `caddy version`. The admin API is optional in Caddy and many users
   disable it; depending on it would mark the container unhealthy in
   those deployments, which is a behavior regression. `caddy version` is a
   basic binary-exec liveness probe, intentionally weaker than a daemon probe.

5) Removed the broken github-actions(+staging) HEALTHCHECK. The binary
   doesn't accept --version (it expects GITHUB_ACTION_TYPE env), so the
   probe always reported unhealthy. CIS Docker 4.6 targets long-running
   containers anyway; this image runs as a one-shot GitHub docker-action.

6) Fixed CI lint failure introduced by the dep bumps: `golang.org/x/oauth2`
   was transitively bumped 0.30.0 → 0.35.0 by the otel/grpc upgrades,
   which deprecated `auth.CredentialsFromJSONWithParams` in favor of
   `auth.CredentialsFromJSONWithTypeAndParams(_, _, ServiceAccount, _)`.
   SC stores GCP auth as service-account JSON, so the typed variant is
   the correct migration; pinning the type also makes the call reject
   unexpected credential shapes (workload-identity, refresh-token).
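
Item 1 above replaces the `grep | awk | sha256sum -c -` chain with an explicit non-empty check. A minimal sketch of that pattern follows. It is not the exact code in the Dockerfile: the function name `verify_sha256` is hypothetical, and the checksums file is assumed to use the `<digest>  <file name>` layout that `sha256sum` itself writes.

```shell
#!/bin/sh
# verify_sha256 FILE CHECKSUMS
# Hypothetical helper: compare FILE against its entry in CHECKSUMS and
# fail closed when no entry exists.
verify_sha256() {
  file=$1
  checksums=$2
  name=$(basename "$file")

  # Exact match on the file-name column ("*name" is the binary-mode form).
  # A missing entry leaves "expected" empty instead of feeding an empty
  # stream to a later command.
  expected=$(awk -v n="$name" '$2 == n || $2 == "*" n { print $1; exit }' "$checksums")

  if [ -z "$expected" ]; then
    echo "no checksum entry for $name" >&2
    return 1
  fi

  actual=$(sha256sum "$file" | awk '{ print $1 }')
  if [ "$actual" != "$expected" ]; then
    echo "checksum mismatch for $name" >&2
    return 1
  fi
}
```

In a `RUN` step this would sit under `set -eu`, so a non-zero return stops the build, for example `verify_sha256 "${TARBALL}" pulumi-checksums.txt`.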

Verified locally:
- go build ./...        — clean
- go vet ./...          — clean
- staticcheck ./pkg/clouds/pulumi/gcp/...  — no SA1019 left
- go test -short ./pkg/{security,clouds/pulumi/gcp,clouds/pulumi/aws,clouds/pulumi/docker}/... — all pass
- docker build (github-actions.Dockerfile) — succeeds
- docker run gcloud/pulumi/gke-gcloud-auth-plugin --version — all work
- trivy image — same 2 deferred docker/docker CVEs, alpine clean
Replaces `go 1.25.0` + `toolchain go1.25.9` with a single `go 1.25.9`.

The toolchain directive was added so the module would compile under any
Go ≥ 1.25.0 (auto-downloading 1.25.9 if older), which is unnecessary
indirection; pinning the go version directly is clearer and CI is
already on 1.25.9. Anyone with an older Go installed gets a fail-fast
error when GOTOOLCHAIN=local; under the default GOTOOLCHAIN=auto the go
command still switches to 1.25.9.

Same stdlib CVE coverage as before (the 1.25.9 stdlib is what gets used
either way).
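
A guard for this go.mod layout could look like the sketch below. It is an illustration only, not something this commit adds. The helper names `go_directive` and `has_toolchain_line` are hypothetical, and the sketch assumes directives start at the beginning of a line.

```shell
#!/bin/sh
# Print the version named by the "go" directive of a go.mod file.
go_directive() {
  awk '$1 == "go" && NF == 2 { print $2; exit }' "$1"
}

# Succeed if the go.mod file still carries a separate "toolchain" directive.
has_toolchain_line() {
  grep -q '^toolchain[[:space:]]' "$1"
}
```

A check such as `[ "$(go_directive go.mod)" = "1.25.9" ] && ! has_toolchain_line go.mod` would then flag a reintroduced `toolchain` line or a drifted version.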
Both probes were CIS-checkbox theater rather than useful liveness signals.

kubectl.Dockerfile: this image is invoked as a one-shot tool
(`docker run --rm simplecontainer/kubectl <args>`), not a long-running
daemon — a liveness probe never has a chance to fire. CIS Docker 4.6
applies to long-running containers; cargo-culting it here only adds noise.

caddy.Dockerfile: a meaningful daemon probe needs the consumer's bound
port (or the admin API at :2019, which many consumers disable for
security). Both are config-specific. The probe I had only ran
`caddy version`, which exits 0 as long as the binary file exists on disk
— it would report healthy through a crashlooping daemon. Worse than no
probe. Consumers running Caddy in orchestrators should declare a probe in
their own deployment manifest where the bound port is known.
…s step

Self-review pass requested by reviewer. Net: -97 lines vs prev commit
(comments were verbose; substance unchanged).

Comments
- Cut from CIS-section recitations to one-line *why* per non-obvious step.
- Removed restated "no HEALTHCHECK because…" / "no USER because…" blocks
  from kubectl, caddy. The single line at the top of github-actions
  states the rationale once for that image.

Defense in depth (STRIDE T+S)
- `set -eu` → `set -euo pipefail` on the Pulumi + gcloud RUNs. Existing
  `[ -n "$VAR" ]` guards already caught silent-pass on the grep|awk|
  sha256sum chain, but pipefail covers anything similar I might add later.
  Verified Alpine 3.21 ash supports `-o pipefail`.
- OCI labels added to all 5 images (`org.opencontainers.image.source`,
  `.licenses`, `.title`, `.description`) so signed/published images can
  be traced back to this repo.
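
The `pipefail` point above can be checked with a short probe. This is a sketch under stated assumptions rather than code from the Dockerfiles: the function names are hypothetical, and the probe runs `set -o pipefail` in a subshell so that a shell without the option cannot abort the caller.

```shell
#!/bin/sh
# Succeeds if the current shell accepts "set -o pipefail".
supports_pipefail() {
  ( set -o pipefail ) 2>/dev/null
}

# Runs "false | cat" with pipefail enabled and reports whether the failure
# of the first command reaches the pipeline's exit status.
pipefail_demo() {
  if ! supports_pipefail; then
    echo "pipefail unsupported in this shell"
    return 0
  fi
  if ( set -o pipefail; false | cat ); then
    echo "failure masked"
  else
    echo "failure surfaced"
  fi
}
```

Without `pipefail`, the status of `false | cat` is that of `cat`, which is the way a failed early stage in a longer chain can go unnoticed.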

CI tools-step fix
- Pre-bake the post-`go get tools` state in go.mod/go.sum:
    atombender/go-jsonschema 0.22.0 → 0.23.0
    go-delve/delve 1.26.1 → 1.26.3
    mvdan.cc/gofumpt 0.9.2 → 0.10.0
    + golang.org/x/{crypto,mod,net,sys,term,text,tools,telemetry} bumps
  The push.yaml `tools` step does `go get tool` (no version) → `go mod
  download` → `go generate -tags tools` → `go mod tidy`. With newly-
  released gofumpt v0.10.0 etc., go.sum was missing entries that
  generate needed (tidy at the end is too late). Pre-baking the bumps
  here makes CI's `go get` a no-op so generate sees a complete go.sum.
  Same trick d9d4591 was relying on implicitly via an older gofumpt.
Cre-eD requested a review from smecsia May 7, 2026 06:34
Cre-eD self-assigned this May 7, 2026
Cre-eD merged commit 74b5c67 into main May 7, 2026
22 checks passed
Cre-eD deleted the fix/sca-dockerfile-hardening branch May 7, 2026 10:38