build: declarative build.patches via git worktree isolation (FR-026)#179
Open
barbatos2011 wants to merge 6 commits into
Open
build: declarative build.patches via git worktree isolation (FR-026)#179barbatos2011 wants to merge 6 commits into
barbatos2011 wants to merge 6 commits into
Conversation
Adds FR-026/027/028: intent's build: block accepts a `patches: []string`
list of unified-diff files. When non-empty, trond builds the source in
an isolated `git worktree` at the resolved revision with patches
applied, NOT in the user's primary source tree.
Use case (concrete): `replay` (tools/replay/) requires 4 java-tron
source patches to skip mainnet-time-baked checks (expiration, TaPos,
network-layer check). Previously these were manual `git apply` steps
the user had to run before trond build. Now they're declared once in
the intent and trond handles isolation + cache.
Why this is more than ergonomics:
- Cross-machine cache reuse. Today's working-tree-derived PatchHash
is sha256(`git diff` + `git status -uall`), which depends on the
operator's local .DS_Store / IDE temp files / WIP edits. Two
operators with the same intent get DIFFERENT cache keys. With
declarative patches, PatchHash = sha256(canonical concat of patch
contents in order) — a pure function of inputs. Same intent +
same patches → same cache_key on every machine.
- Reproducibility. Intent becomes the single source of truth for
what artifact a deploy needs.
- Agent-friendliness. An MCP agent can drive a patched-build deploy
without ever shelling out to git apply (which isn't on trond's
allowlist anyway).
- User source preserved. The worktree is `git worktree add --detach`
at $TROND_STATE_DIR/builds/worktrees/<cache-key>/; the user's
primary checkout's `git status` stays untouched. IDE windows don't
churn, in-flight WIP is safe.
Lifecycle:
resolveBuild (cache miss) →
setupWorktree (mkdir, git worktree add, git apply --check, git apply)
→ runner.RunBuild (gradle in worktree path)
→ defer cleanup (git worktree remove --force)
Cache hits skip setup/cleanup entirely — the JAR is already in out/.
Failed patches surface as PATCH_FAILED with the offending patch path
and `git apply`'s stderr.
Behavior when build.patches is absent/empty: unchanged. The build
runs in the user's source dir directly with today's dirty-tree
PatchHash semantics — preserves the iterative dev-loop where you
edit source, run trond apply, get the artifact with your edits.
Schema changes:
- intent.BuildSpec gains optional `patches []string` (FR-026).
- build.Request gains `Patches []string`.
- Manifest gains `patches []string` for inspect/list traceability.
- schemas/output/build-{inspect,list,prune}.schema.json declare
the patches field (optional / omitempty).
- SchemaVersion 1.4.0 → 1.5.0 (MINOR — additive).
Patches shipped: tools/replay/patches/{01..04}-*.patch covering
TransactionCapsule.checkExpiration, Manager validateTapos/Common,
TransactionsMsgHandler network-layer check, ProposalCapsule.hasExpired.
All 4 verified to apply cleanly against current java-tron HEAD
(d7a90ca21) via the worktree mechanism.
Tests (16 sub-cases, all passing):
- TestComputePatchHash_Deterministic (4 sub) — cross-machine hash
determinism; order matters; identical contents hash same
regardless of filename
- TestValidatePatchFile (7 sub) — header sniffing accepts git/posix/
svn formats, rejects yaml/binary/garbage
- TestSetupWorktree_AppliesPatchesAndCleansUp — full worktree
lifecycle with patch application + cleanup
- TestSetupWorktree_BadPatchSurfacesStructuredError — PATCH_FAILED
envelope with patch path + git stderr
- TestResolveBuild_PatchesOverridesPatchHash — confirms unrelated
dirty-tree noise doesn't shift the cache key when Patches is set
Documentation:
- specs/002-trond-build-pipeline/spec.md FR-026/027/028 + key entities
- specs/002-trond-build-pipeline/quickstart.md: full "Patching source
before build" section with declarative + manual paths
- AGENTS.md Workflow 5 gains a Building-with-patches subsection
- tools/replay/README.md recommends the declarative path; manual
git apply preserved as alternative
HIGH tronprotocol#1: validatePatchFile rejected git format-patch files Long commit messages push `diff --git` past the prior 4 KB sniff window. Now sniffs 64 KB AND recognizes the `From <40-hex-sha>` mbox prologue that `git format-patch` writes at byte 0. Test case added: git format-patch with a 6 KB body before the diff. HIGH tronprotocol#2: Manifest.Patches stored absolute filesystem paths Made the field stale when patches were moved/renamed/pulled from a shared TROND_STATE_DIR — undermined the cross-machine cache reuse story for `trond build inspect`. Replaced with []PatchRecord = {Name (basename), SHA256 (content hash)}. Schema updated. The cache-key-deriving PatchHash now flows through the same records (computePatchHashFromRecords), so the per-patch sha256 is the single fingerprint feeding both cache key + inspect traceability. One pass over each patch file; no double-reads. MED tronprotocol#3: No regression test for "second call hits cache" Added TestResolveBuild_PatchesProducesStableCacheKey: two separate copies of the same patch contents at unrelated filesystem paths → identical cache_key. Pins the headline cross-machine cache reuse benefit. MED tronprotocol#5: Manifest field comment was inaccurate Original comment said "absolute paths declared in intent.build.patches" — but the apply layer resolved them to absolute before recording. Comment now describes the new PatchRecord shape correctly. Refactor: dropped computePatchHash([]string), replaced with buildPatchRecords + computePatchHashFromRecords pair. resolved struct gains a patchRecords field so all three Manifest creation sites (builder.go, image.go, image_wrap.go) populate from the same canonical list. Tests added (10 new sub-cases): - TestComputePatchHash_Deterministic adapted (4 sub) for the records-based API - TestBuildPatchRecords (1) — Name/SHA256 shape, basename only - TestValidatePatchFile gained 2: git format-patch with long body (must accept); `From <not-a-sha>` (must reject) - TestResolveBuild_PatchesProducesStableCacheKey (1) Schemas updated: build-list / build-inspect / build-prune patches field changed from string[] to {name, sha256}[]. Baseline regenerated; SchemaVersion stays 1.5.0 (the PR's own bump — this review pass amends before merge).
LOW tronprotocol#1: TestResolveBuild_PatchesProducesStableCacheKey was one-sided Asserted keyA == keyB (same patches different paths) but didn't prove patches actually participate in the key — would have passed trivially if a refactor silently dropped them. Added the `keyA != keyWithoutPatches` assertion. Combined with the sibling PatchHash test, the patches→cache-key invariant is now pinned from both directions. LOW tronprotocol#2: No JSON round-trip test for Manifest.Patches []PatchRecord Added TestManifest_PatchRecordRoundTrip + TestManifest_NilPatchesOmitted in a new manifest_test.go. Pins: - The wire field names (`patches`, `name`, `sha256`) so a tag typo can't silently drift Manifest away from schemas/output/build-*.schema.json - Marshal → Unmarshal → DeepEqual round-trip for a 2-record list - omitempty behavior on nil Patches (no `"patches": null` or `"patches": []` noise emitted for the no-patches case) LOW tronprotocol#3 (backwards-compat): no fix — pre-merge. Old cache entries from the in-development 52abc0f shape would fail to deserialize into the new []PatchRecord shape. Acceptable since PR tronprotocol#179 hasn't shipped; failed-load entries just trigger a rebuild via Lookup's standard cache-miss path.
Review pass 6 changed Manifest.Patches from []string (absolute paths)
to []PatchRecord{Name, SHA256}, and computePatchHash from "sha256 of
concatenated patch contents" to "sha256 of per-patch SHA256s folded
in declared order". The functional invariant (cross-machine
deterministic patch_hash) is preserved, but the docs still described
the old formula and didn't mention the new PatchRecord shape that
`trond build inspect` now surfaces.
Three docs updated:
spec.md
- FR-026 patch_hash formula reworded to describe the per-patch-
sha256 fold; clarified that the exact byte format is an impl
detail and the operator-facing invariant is "same patch CONTENTS
→ same cache_key on every machine".
- Key Entities gained a new PatchRecord entry describing what
Manifest.Patches now carries.
- Build entity's patches property updated to "list of
{name, sha256} records" (was "list of paths").
quickstart.md
- Step 2 of the patches walkthrough now describes the per-patch
record flow and explains why the manifest stores portable
fingerprints (basename + sha256) rather than absolute paths.
- Inspect section gained a worked example showing the patches
array in JSON output, and a tip on how a teammate can verify
their local patches match a cached artifact (compare sha256s).
AGENTS.md
- Workflow 5 "Building with patches" section: same patch_hash
reword; added a bullet describing the manifest's patches
array shape so agents driving `build_inspect` know the field
is `[{name, sha256}]` not absolute paths.
No code changes. `go test -tags e2e ./cmd -run TestE2E_AgentsMD`
passes — every command shown in AGENTS.md still resolves.
…tocol#179 HIGH H1: no test for jar-wrap + patches recursion Impl was correct (jarReqForWrap copies req whole so Patches flows into the inner JAR build), but a future refactor could silently drop Patches alongside the cleared ImageTag/ImageStrategy and produce a wrapped image with un-patched code — the worst regression class because builds succeed and images run. Added TestJarReqForWrap/patches_flow_through_to_inner_JAR_build. HIGH H2: orphan worktree dirs leaked forever setupWorktree creates <state>/builds/worktrees/<key>/ but SIGKILL between defer-registration and execution leaves an orphan that prune never reclaimed. 100-200 MB per leak. Fixed by: - EnsureCacheDirs pre-creates worktrees/ alongside the existing out/images/manifest/locks dirs - Prune gained a final gcOrphanWorktrees pass: walks worktrees/, drops dirs whose cache key has no live manifest AND no active flock. The flock check prevents ripping a worktree out from under a concurrent in-flight build. - Removed worktrees show up in result.Removed with artifact_kind="worktree" + orphaned=true so JSON consumers can distinguish from manifest removals. - Two new tests: TestPrune_GCsOrphanWorktrees, TestPrune_SkipsLockedOrphanWorktree. MED M1: validatePatchFile 64 KB sniff too small Release-note commit bodies + signed-off-by chains in git format-patch can push `diff --git` past 64 KB. Bumped to 256 KB. Error message + comment updated accordingly. MED M4: quickstart told users to `git checkout -- .` after `git diff` Destructive — wipes local work. Replaced with a callout warning + correct guidance: the diff file is enough, leave the working tree alone OR use `git stash` (recoverable) instead of `git checkout` (destructive). MED M5: no test for patches against a pinned non-HEAD revision Added TestSetupWorktree_AgainstNonHeadRevision: repo with two commits, patch matches FIRST commit content, HEAD drifted to different content. Worktree pinned to first revision; apply must succeed. Pins FR-026's "checks out the resolved revision" semantics. MED M6: no test for binary-file patches Added TestSetupWorktree_BinaryPatch: generates a real `git diff --binary` patch via the test repo, validates it through validatePatchFile (must accept the `diff --git` header), then setupWorktree → assert worktree's binary content matches the post-patch bytes. LOW L3: isGitFormatPatchHeader accepted only lowercase hex git format-patch always emits lowercase, but a downstream signer / cosign attest pipeline may normalize to uppercase. Defensive: now accepts [0-9a-fA-F]. Spec note: trust model of build.patches path access validatePatchFile reads any file the trond user can read for its 256 KB sniff + content hashing. Same trust level as build.source / build.gradle_args / build.env. FR-028 gained an explicit trust-model paragraph stating: intent files are trusted input; trond does not defend against a malicious intent author. Test count change: +6 new sub-cases (H1 + H2 ×2 + M5 + M6, TestValidatePatchFile gained no new cases — the 256 KB bump is backward-compatible with the existing 6 KB fixture).
HIGH H-1 (BLOCKER): orphan worktree GC produced schema-invalid Entries
Review pass 8's gcOrphanWorktrees synthesized Entry records with
ArtifactKind="worktree" + zero CreatedAt and appended them to
result.Removed. build-prune.schema.json requires created_at AND
constrains artifact_kind to enum [jar, image] — so any actual
worktree reclaim emitted JSON that failed validation against the
PR's own schema. Caught by holistic re-review.
Fix: introduce a distinct surface for orphan worktrees rather
than overloading Entry.
- New type OrphanWorktree { CacheKey, SizeBytes } in cache_ops.go
- New PruneResult.OrphanWorktrees []OrphanWorktree (omitempty)
- Schema declares the new field + a sibling $defs/orphan_worktree
- Test asserts on result.OrphanWorktrees (not result.Removed)
Entry's invariant ("always a Manifest-tied build cache entry") is
preserved; agents that only consume Removed see only kind=jar|image.
MED M-1: dry-run skipped orphan worktree discovery
Operators using --dry-run to size their cleanup got a number
smaller than the real --confirm run because gcOrphanWorktrees
ran ONLY in the non-dry-run branch. Now Prune calls
gcOrphanWorktrees(..., dryRun) in both branches; the GC function
inspects + records on dry-run (populates OrphanWorktrees +
credits FreedBytes) and inspects + removes on confirm. New test
TestPrune_DryRunSurfacesOrphanWorktrees pins the projection,
including verifying the worktree dir is STILL on disk after a
dry-run.
MED M-3: stale validatePatchFile docstring
Doc-comment still said "64 KB sniff window" after the 256 KB
bump landed in pass 8. The inline comment + error message were
updated then; the docstring missed. Now consistent.
LOW L-2: bookkeeping leak hint
gcOrphanWorktrees does os.RemoveAll on the worktree dir (we
don't know the source repo path to run proper `git worktree
remove`). This leaves stale .git/worktrees/<key>/ ledger
entries on the user's source repo. Now, after the GC pass
reclaims >=1 orphan, we emit a stderr "info:" line telling
the operator to `git worktree prune` in their checkout. Cheap
+ actionable + non-blocking.
Skipped findings:
- M-2 (dirSizeBytes walks full tree): acceptable. Walking
50k files of a populated java-tron worktree is <1s on a
modern SSD; NFS-pessimistic estimates can be re-litigated if
a real operator complains.
- L-1 (naming overload, four "patches" types): subjective.
- L-3 (example path assumes sibling repo): correct as-is for
the documented use case (replay tool's patches live in
tron-deployment alongside their consumer).
Schema baseline regenerated; SchemaVersion stays 1.5.0 (PR's own
bump, additive optional field within the unreleased version).
3 new test scenarios pass; `make test` + `make lint` clean.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds
build.patches: []stringto intent — a list of unified-diff files trond applies to a freshgit worktreeof the resolved revision before running gradle. Closes FR-026/027/028 added in this PR's spec.md update.Motivating use case:
tools/replay/(PR #172) needs 4 java-tron source patches to skip mainnet-time-baked checks. Today the operator runsgit applymanually beforetrond build. This PR makes the patches a declarative property of the intent so the build pipeline handles them.Beyond ergonomics:
.DS_Store/ IDE temp files. Two operators with the same intent get different cache keys. With declarative patches,patch_hash = sha256(ordered patch contents)— pure function of inputs, identical on every machine.git apply.git worktree add --detachunder$TROND_STATE_DIR/builds/worktrees/<cache-key>/. The user's primary checkout'sgit statusstays clean — IDE windows don't churn, in-flight WIP is safe.Lifecycle
Cache hits skip setup/cleanup entirely. Failed patches surface as
PATCH_FAILEDwith the offending patch path andgit apply's stderr.Behavior when
build.patchesis absent/empty: unchanged. The build runs in the user's source dir directly with today's dirty-treePatchHashsemantics — preserves the iterative dev-loop where you edit source, runtrond apply, get the artifact with your edits.Schema / contract changes
intent.BuildSpecgains optionalpatches []string(FR-026)Manifest.patches(additive optional field, omitempty)schemas/output/build-{inspect,list,prune}.schema.jsondeclare the new fieldSchemaVersionbump 1.4.0 → 1.5.0 (MINOR — additive only)Patches shipped
tools/replay/patches/{01..04}-*.patch— the 4 java-tron source patches replay needs:01-skip-tx-expiration.patch—TransactionCapsule.checkExpiration→ no-op02-skip-tapos-validation.patch—Manager.processTransactionskipvalidateTapos+validateCommon03-skip-network-expiration.patch—TransactionsMsgHandlerskip network-layer expiration + its catch arm04-fast-proposals.patch—ProposalCapsule.hasExpired→ always true (optional, for faster proposal iteration)All 4 verified to apply cleanly to current java-tron HEAD (d7a90ca21) via the worktree mechanism.
Test plan
make test— cleanmake lint— cleango test ./internal/build/ -run 'TestComputePatchHash|TestValidatePatchFile|TestSetupWorktree|TestResolveBuild_Patches'— 16 sub-cases passing:TestComputePatchHash_Deterministic(4 sub) — cross-machine hash determinism; order matters; identical contents hash same regardless of filename; empty listTestValidatePatchFile(7 sub) — header sniffing accepts git/posix/svn formats, rejects yaml/binary/garbage/missingTestSetupWorktree_AppliesPatchesAndCleansUp— full lifecycle with patch application + cleanupTestSetupWorktree_BadPatchSurfacesStructuredError—PATCH_FAILEDenvelope with patch path + git stderrTestResolveBuild_PatchesOverridesPatchHash— unrelated dirty-tree noise doesn't shift the cache key whenPatchesis set (the cross-machine cache reuse invariant)trond planagainst an intent with 2 patches + java-tron source — plan resolves cleanlygit apply --checkagainst fresh java-tron worktree