fix: symmetric SynthOrg/DHI image verification cache#1878
Conversation
After 'synthorg update', the next 'synthorg start' showed 'Verify SynthOrg Images (cached)' while DHI images re-verified. Three asymmetries combined: - hasSynthOrgDigests only checked key presence (no expected-source comparison), unlike hasDHIDigests which compares cached digest against the binary-baked Renovate pin. - verifyAndPinForUpdate verified SynthOrg images only, never DHI. - pullAndPersist replaced state.VerifiedDigests instead of merging, wiping all dhi:* keys on every update. Add VerifiedImageTag sentinel on State to record which tag the SynthOrg pins were verified against; hasSynthOrgDigests now rejects on mismatch, mirroring DHI's strictness. Extract verifyImagesWithCache helper that runs cache-aware verification of both groups and route start, update, and wipe through it. pullAndPersist merges instead of replaces and sets the sentinel. config set image_tag clears both fields together. Wipe's state-aliasing issue (pulls falling back to floating tag refs after a SynthOrg-only verify) is fixed by reloading wc.state after persist.
Pre-reviewed by 4 agents, 5 valid findings addressed. - cli/CLAUDE.md + docs/reference/cli-persistence-backends.md: describe the verified_digests map as shared by SynthOrg and DHI pins (not DHI-only) and document verified_image_tag as the SynthOrg-side cache sentinel. - cli/cmd/verify_pipeline_test.go: consolidate the five hasSynthOrgDigests unit tests into one table-driven test with subtests. - docs/reference/cli-config-subcommands.md: drop default_nats_url (env-only, no longer settable) from the tunables list and the compose-affecting list; correct the settable-key count from 37 to 40 (matches supportedConfigKeys).
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
There was a problem hiding this comment.
Code Review
This pull request refactors the image verification process into a unified, cache-aware pipeline shared across the start, update, and wipe commands. It introduces a VerifiedImageTag sentinel in the configuration state to ensure cached SynthOrg digests remain valid for the active tag, alongside updated documentation and expanded test coverage. Reviewer feedback recommends adopting Go 1.21 maps idioms for cleaner map deletions and more robust mutation checks in tests, while also cautioning against sharing mutable data across parallel subtests to avoid potential data races.
| for k := range merged { | ||
| if !strings.HasPrefix(k, "dhi:") { | ||
| delete(merged, k) | ||
| } | ||
| } |
There was a problem hiding this comment.
For improved readability and to leverage modern Go idioms, you could replace this loop with maps.DeleteFunc (available since Go 1.21). This would make the intent of deleting specific keys from the map more explicit.
maps.DeleteFunc(merged, func(k, _ string) bool { return !strings.HasPrefix(k, "dhi:") })| missingSidecar := fullSandboxedPins(false, "") | ||
| delete(missingSidecar, "sidecar") |
There was a problem hiding this comment.
Sharing mutable test data like missingSidecar across parallel subtests can be fragile. If a future test case were added that also uses and modifies this map, it could introduce a data race. To make the test more robust, consider creating a fresh copy of the map for each test case that needs it, for example by defining it inside the subtest's t.Run.
| if existing["backend"] != "sha256:0000000000000000000000000000000000000000000000000000000000000000" { | ||
| t.Error("mergeVerifiedDigests must not mutate the existing map") | ||
| } | ||
| if fresh["backend"] != "sha256:1111111111111111111111111111111111111111111111111111111111111111" { | ||
| t.Error("mergeVerifiedDigests must not mutate the fresh map") | ||
| } |
There was a problem hiding this comment.
This check for mutation is good, but it's a bit brittle as it only checks a single key. If other keys were added to the test maps, a mutation in them would go unnoticed. For a more robust check, you could clone the maps before calling mergeVerifiedDigests and then compare them for equality afterwards. For example:
originalExisting := maps.Clone(existing)
originalFresh := maps.Clone(fresh)
_ = mergeVerifiedDigests(existing, fresh)
if !maps.Equal(existing, originalExisting) {
t.Error("mergeVerifiedDigests must not mutate the existing map")
}
if !maps.Equal(fresh, originalFresh) {
t.Error("mergeVerifiedDigests must not mutate the fresh map")
}This requires importing the maps package (available since Go 1.21).
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cli/cmd/start.go`:
- Around line 189-199: The code currently reloads the stored config
unconditionally after setting state.VerifiedDigests when
result.SynthOrgReverified || result.DHIReverified, which can discard the
in-memory verified pins if config.Save fails; change the flow so that you only
call config.Load/GetGlobalOpts and replace state when config.Save succeeds
(i.e., check the error from config.Save and skip the reload on failure),
ensuring state.VerifiedDigests and state.VerifiedImageTag set in this invocation
are preserved when config.Save returns an error; reference the
result.SynthOrgReverified/result.DHIReverified branch,
state.VerifiedDigests/state.VerifiedImageTag, config.Save, and
config.Load/GetGlobalOpts to locate and update the logic.
In `@cli/cmd/verify_pipeline_test.go`:
- Around line 33-34: The test currently calls fullSandboxedPins(false, "") and
then mutates the returned map via delete(missingSidecar, "sidecar"), which risks
sharing mutated state; instead create a local modified copy or inline the
fixture for the subtest so you never mutate the original map returned by
fullSandboxedPins. Locate the usage of fullSandboxedPins and missingSidecar in
the test, clone the map into a new variable (or construct the modified map
inline) and remove the "sidecar" key from that copy before passing it to the
subtest, ensuring no package-scoped map is mutated prior to t.Parallel().
In `@docs/reference/cli-config-subcommands.md`:
- Around line 15-16: The docs table incorrectly states that `get <key>` exposes
40 keys; update that count to 42 to reflect the two additional read-only keys
`memory_backend` and `persistence_backend` so the line "`get <key>` | Get a
single config value (40 gettable keys)" becomes "(42 gettable keys)". Ensure the
corresponding `set <key>` line remains unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8b7763a0-b439-494a-8b7a-73d0c05b911e
📒 Files selected for processing (11)
cli/CLAUDE.mdcli/cmd/config.gocli/cmd/config_test.gocli/cmd/start.gocli/cmd/update.gocli/cmd/verify_pipeline.gocli/cmd/verify_pipeline_test.gocli/cmd/wipe.gocli/internal/config/state.godocs/reference/cli-config-subcommands.mddocs/reference/cli-persistence-backends.md
📜 Review details
🧰 Additional context used
📓 Path-based instructions (7)
cli/**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
For CLI work: use
go -C clicommand, nevercd cli; seecli/CLAUDE.mdfor CLI-specific rules
Files:
cli/cmd/config_test.gocli/internal/config/state.gocli/cmd/wipe.gocli/cmd/config.gocli/cmd/verify_pipeline_test.gocli/cmd/verify_pipeline.gocli/cmd/update.gocli/cmd/start.go
cli/cmd/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
All CLI commands accept persistent flags with precedence order: flag > env var > config > default; support
--data-dir,--skip-verify,--quiet,--verbose,--no-color,--plain,--json,--yes, and--help-allExit codes: 0=success, 1=runtime error, 2=usage error, 3=unhealthy (backend/containers), 4=unreachable (Docker), 10=updates available
Per-command flags must follow the documented table:
initrequires--backend-port,--web-port,--sandbox,--log-levelfor non-interactive mode;startsupports--no-wait,--timeout,--no-pull,--dry-run,--no-detach,--no-verify; and so on per the full Per-Command Flags table
Files:
cli/cmd/config_test.gocli/cmd/wipe.gocli/cmd/config.gocli/cmd/verify_pipeline_test.gocli/cmd/verify_pipeline.gocli/cmd/update.gocli/cmd/start.go
{README.md,docs/**/*.md}
📄 CodeRabbit inference engine (CLAUDE.md)
Source numerics in README and public docs from
data/runtime_stats.yamlvia<!--RS:NAME-->markers per data/README.md
Files:
docs/reference/cli-persistence-backends.mddocs/reference/cli-config-subcommands.md
docs/**/*.{md,d2}
📄 CodeRabbit inference engine (CLAUDE.md)
Use
d2for architecture and nested container diagrams; usemermaidfor flowcharts, sequence diagrams, and pipelines; use Markdown tables for tabular data. D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1.
Files:
docs/reference/cli-persistence-backends.mddocs/reference/cli-config-subcommands.md
cli/internal/{config,verify}/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Overriding any of
registry_host,image_repo_prefix,dhi_registry,postgres_image_tag, ornats_image_tagdisables image signature and SLSA verification for that invocation only and writes a stderr warning
Files:
cli/internal/config/state.go
cli/internal/config/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Backend auto-wire precedence: when both
SYNTHORG_DATABASE_URLandSYNTHORG_DB_PATHare present,SYNTHORG_DATABASE_URLwins and Postgres is initialized; malformed URLs must raise loudly at startup rather than silently falling back
Files:
cli/internal/config/state.go
cli/internal/{verify,config}/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Image verification cache uses a
verified_digestsmap shared by SynthOrg and DHI pins, withverified_image_tagas the SynthOrg-side cache sentinel; reference indocs/reference/cli-persistence-backends.md
Files:
cli/internal/config/state.go
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-12T17:43:45.275Z
Learning: Package structure for CLI: `cmd/` contains Cobra commands and global options; `internal/` contains version, config, docker, compose, health, diagnostics, images, selfupdate, completion, ui, verify, and backup modules
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-12T17:43:45.275Z
Learning: Environment variable namespace `SYNTHORG_*` covers backend/channel overrides, image/registry overrides, timeouts/retry tuning, byte caps, and ports; full reference in `docs/reference/cli-env-vars.md`
🔇 Additional comments (7)
cli/cmd/update.go (3)
616-628: LGTM! Clean merge implementation with proper nil handling.The
mergeVerifiedDigestsfunction correctly:
- Returns
nilwhen both inputs are empty (preserving compose's nil-pin fallback)- Creates a new map to avoid aliasing the original inputs
- Uses
maps.Copyfor clean overlay semantics (fresh wins on conflict)
630-670: LGTM! Well-structured separation of SynthOrg compose pins vs full cache.The refactored
verifyAndPinForUpdateproperly:
- Uses the cache-aware
verifyImagesWithCachefor consistent verification- Writes only SynthOrg pins to compose via
synthOrgPins(result.Pins)- Returns the full pin map (including DHI keys) for state persistence
This correctly addresses the asymmetric caching issue described in the PR objectives.
587-614: LGTM! Proper merge-before-pull and sentinel tracking.The updated
pullAndPersistflow correctly:
- Merges fresh pins with existing (preserving cached DHI keys)
- Uses merged pins for the pull operation
- Sets
VerifiedImageTagsentinel after successful pull to enable cache validationThis addresses the PR objective of preventing stale SynthOrg cache across ImageTag changes.
cli/cmd/wipe.go (2)
385-426: LGTM! Cache-aware verification with proper state reload.The
verifyAndPinmethod correctly:
- Honors
--skip-verifyconsistently with other commands- Conditionally writes compose only when SynthOrg was reverified
- Sets the
VerifiedImageTagsentinel alongsideVerifiedDigests- Reloads state from disk after persist (lines 420-424) to fix the state-aliasing bug mentioned in the PR objectives
The reload-after-persist pattern ensures
wc.statereflects the freshly-pinned references for subsequent pull operations, avoiding the stale reference issue.
376-383: LGTM!Clean integration of the cache-aware verification step before pulling images.
cli/cmd/verify_pipeline_test.go (2)
106-132: LGTM! Comprehensive DHI key filtering test.Good coverage of the
synthOrgPinsfiltering behavior, verifying that:
- Only bare-name SynthOrg keys are retained
- All
dhi:-prefixed keys are excluded- Various DHI key patterns (platform, attestation, signature) are handled
134-181: LGTM! Thorough merge behavior tests.The tests for
mergeVerifiedDigestsproperly verify:
- Fresh pins overlay existing ones for shared keys
- DHI keys are preserved from the existing map
nilinputs returnnil(preserving compose's fallback path)- Input maps are not mutated (important for avoiding aliasing bugs)
These tests directly validate the fix for the state-aliasing bug mentioned in the PR objectives.
…ing diagnostics) - cli/cmd/start.go: only reload config when Save succeeds; preserve in-memory VerifiedDigests/VerifiedImageTag when Save fails (CodeRabbit, MAJOR) - cli/cmd/verify_pipeline.go: replace SynthOrg + DHI delete-loops with maps.DeleteFunc Go-1.21 idiom (Gemini) - cli/cmd/verify_pipeline_test.go: inline the "missing sidecar" pin map in TestHasSynthOrgDigests instead of mutating a package-scope variable before parallel subtests run (Gemini + CodeRabbit) - cli/cmd/verify_pipeline_test.go: harden TestMergeVerifiedDigests_DoesNotMutateInputs with maps.Clone + maps.Equal so any future key mutation is caught (Gemini) - docs/reference/cli-config-subcommands.md: correct `config get` key count to 42 (memory_backend + persistence_backend are read-only but gettable) (CodeRabbit) - cli/cmd/update.go: use errors.AsType[*exec.ExitError] (Go 1.26 stdlib) - cli/cmd/config_test.go: use strings.SplitSeq for range iteration
ci(cli-bench): bump BENCH_COUNT 5 -> 10 to clear benchstat's n>=6 floor The `IsValidImageTag` benchmark tripped the +15% regression gate at +16.17% on this PR, but the function source is byte-identical between the merge-base and PR HEAD (the PR only adds a new field to the State struct, no impact on the IsValidImageTag hot path). benchstat itself warned `need >= 6 samples for confidence interval at level 0.95` on every comparison row. Bumping BENCH_COUNT 5 -> 10 in both the workflow env and the script default crosses benchstat's n>=6 threshold so confidence intervals actually compute, and tightens the standard error by ~sqrt(2). Real 15%+ regressions still surface; sub-100ns microbenchmark wobble on shared CI runners stops being a false positive. Threshold stays at 15%.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1878 +/- ##
==========================================
- Coverage 84.97% 84.96% -0.01%
==========================================
Files 1804 1804
Lines 105548 105548
Branches 9206 9206
==========================================
- Hits 89687 89683 -4
- Misses 13632 13634 +2
- Partials 2229 2231 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Why
After
synthorg updateto v0.8.3-dev.15, the nextsynthorg startshowedVerify SynthOrg Images (cached)while DHI images re-verified. The two image groups should always cache (or invalidate) symmetrically: either both(cached)or both fresh.Root cause (3 stacked bugs)
hasSynthOrgDigests(cli/cmd/start.go) only checked key presence by bare service name.hasDHIDigestswas much stricter: it compared the cached digest againstverify.DHIPinnedIndexDigest(the Renovate-baked binary pin). On a tag bump, SynthOrg cache would silently go stale acrossImageTagchanges (masked today only because the update path always overwrites the map; accidental safety).verifyAndPinForUpdate(cli/cmd/update.go) only verified SynthOrg images viaverify.BuildImageRefs, never DHI. DHI images were pulled bypullAllImagesbut skipped verification entirely during update.pullAndPersist(cli/cmd/update.go) didupdatedState.VerifiedDigests = digestPins, replacing the whole map. Everydhi:*key was wiped on every update. Combined with bug 2, the post-update cache contained only SynthOrg pins, so the nextstartfound DHI keys missing and re-verified.The wipe path (
cli/cmd/wipe.go) had the same Bug-1 shape and a latent state-aliasing issue (wc.statewas loaded once and never reloaded after the persist, so subsequent pulls fell back to floating tag refs for sandbox/sidecar/fine-tune).What changed
cli/internal/config/state.go: newVerifiedImageTag stringsentinel field (json:"verified_image_tag,omitempty"). Records whichImageTagthe SynthOrg pins were verified against. Existing on-diskstate.jsonfiles unmarshal it as"", which never matches a real tag, so the firststartafter upgrade triggers a one-time re-verify and the cache becomes self-healing.cli/cmd/verify_pipeline.go(new):imagesVerifyResultstruct +verifyImagesWithCachehelper +verifySynthOrgGroup+synthOrgPinsfilter. One cache-aware verification helper for both SynthOrg and DHI image groups, reused by start / update / wipe.cli/cmd/start.go:hasSynthOrgDigestsnow requiresstate.VerifiedImageTag == state.ImageTag(mirrorshasDHIDigests's strictness).verifyAndPullStartImagesrefactored to route through the new helper. Dead helpers (verifyAndPinImages,renderVerifyBox) deleted.cli/cmd/update.go:verifyAndPinForUpdateroutes through the new helper (now also verifies DHI during update).pullAndPersistmerges pins via a newmergeVerifiedDigestshelper instead of replacing, and setsVerifiedImageTag = tag.cli/cmd/wipe.go: extractedwipeContext.verifyAndPin, routes through the new helper, reloads state after persist (fixes the floating-tag-pull latent bug).cli/cmd/config.go:invalidatesVerifiedDigestspaths now also clearstate.VerifiedImageTagso the sentinel drops alongside the pin map.docs/reference/cli-persistence-backends.md+cli/CLAUDE.md: describe the sharedverified_digestsmap and the newverified_image_tagsentinel.Plus pre-existing docs drift surfaced by the pre-PR docs-consistency review:
default_nats_urlremoved from the settable/compose-affecting lists (env-only); settable-key count corrected from 37 to 40 to matchsupportedConfigKeys.Behaviour after this PR
startshows both panels as(cached). Symmetric.(cached). Symmetric.synthorg config set image_tag X.Y.Z: clearsverified_digestsANDverified_image_tag; next start re-verifies both. Symmetric.Test plan
cli/cmd/verify_pipeline_test.gowith a table-drivenTestHasSynthOrgDigestscovering sentinel-mismatch, missing-sentinel (legacy state shape), missing-pin, full cache hit, sandbox-disabled hit. PlusTestSynthOrgPins_ExcludesDHIKeys, threeTestMergeVerifiedDigests_*cases (preserves DHI, nil-in/nil-out, non-mutation).TestConfigSetImageTag_ClearsVerifiedDigestsAndImageTagincli/cmd/config_test.go.go -C cli vet ./...clean.golangci-lint run0 issues.go -C cli test ./...all packages pass.go -C cli build ./...succeeds.End-to-end manual verification suggested after merge:
synthorg updateto a new dev release; observe both panels verified DURING update.synthorg start; both panels should show(cached).synthorg config set image_tag 0.0.0-fake && synthorg start; both panels should re-verify (sentinel mismatch).synthorg start; both panels cached again.Review coverage
Pre-reviewed by 4 agents:
docs-consistency: surfaced 4 docs drift items; all 4 fixed (2 caused by this PR's code change, 2 pre-existing).comment-quality-rot: clean.go-reviewer: clean ("no critical findings, ready for merge").go-conventions-enforcer: surfaced 4 items; 1 valid (table-driven consolidation, applied), 3 misapplied (flagged in triage as NOT VALID with rationale).Total: 5 valid findings, all 5 implemented in this PR.