From b002e4dbea22fcfd7db96b4143f384a1fbe49311 Mon Sep 17 00:00:00 2001 From: Steven Obiajulu Date: Fri, 22 May 2026 02:35:42 -0400 Subject: [PATCH] refactor(verification): weaken inv_field_001 residual axiom to document-level preservation friendliness MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes peer-review concern D (predicate strength) from PR #208. Codex review confirmed the current engine satisfies the stronger `∀ ctx` predicate, but the ECMA-376 Part 4 deep-research report showed that a conformant emitter MUST fragment fields across wrapper boundaries when modifying a field — fragmented ``/`` wrappers containing only `w:instrText`/`w:delInstrText` are NOT `fieldContextNeutral` under `∀ ctx`. When safe-docx becomes ECMA-376-conformant (tracked in #217), the stronger predicate will break. This commit weakens the residual axiom to be future-compatible without touching the engine. ## Predicate weakening Adds `Tier2.AcceptReject.preservationFriendly` — a document-level property asserting only that accept and reject preserve the *composed* field walk and begin/end balance. Strictly weaker than the previous per-subtree `Tier2.FieldStructure.recursivelyWellformed`: * Old: `validateFieldStructure d ∧ ∀ wrapper-subtree, fieldContextNeutral subtree` where `fieldContextNeutral bs := ∀ ctx, walkBlocks (.ok ctx) bs = .ok ctx`. * New: `validateFieldStructure d ∧ walkBlocks (.ok []) (acceptBlocks d.blocks) = walkBlocks (.ok []) d.blocks ∧ ...` (analogous for reject + balance). The old predicate is retained in `Tier2.FieldStructure` and the legacy `field_structure_preserved` lemma still compiles — they remain available for audit traceability. Only `inv_field_001` is rewired to consume the new predicate via the new `Tier2.InvFieldOne.field_structure_preserved_doc` lemma. ## Axiom replacement `Spec.lean` replaces `compareDocumentXml_output_recursivelyWellformed` with `compareDocumentXml_output_preservation_friendly`. The `inv_field_001` proof body changes to compose the new axiom with `field_structure_preserved_doc`. `#print axioms LeanSpike.inv_field_001` becomes: [propext, compareDocumentXml, compareDocumentXml_output_preservation_friendly, Quot.sound] ## TS bridge — explanatory comment The existing `isFieldContextNeutral` / `assertRecursivelyWellformed` checks in `lean-spec-bridge.test.ts` enforce the OLD stricter property. With the axiom now weaker, these become an audit OVER-check: the engine currently satisfies them (whole-field wrapping), but when #217 lands and the engine starts fragmenting, these checks will need to be removed or relaxed. The comment block above the field-bearing fixtures is updated to reflect this layering. Ref: #208, #217 --- .../src/integration/lean-spec-bridge.test.ts | 26 ++++-- verification/lean/LeanSpike/Spec.lean | 91 +++++++++---------- verification/lean/Tier2/AcceptReject.lean | 31 ++++++- verification/lean/Tier2/InvFieldOne.lean | 32 +++++++ 4 files changed, 121 insertions(+), 59 deletions(-) diff --git a/packages/docx-core/src/integration/lean-spec-bridge.test.ts b/packages/docx-core/src/integration/lean-spec-bridge.test.ts index bc9e181..b633d00 100644 --- a/packages/docx-core/src/integration/lean-spec-bridge.test.ts +++ b/packages/docx-core/src/integration/lean-spec-bridge.test.ts @@ -677,18 +677,24 @@ async function assertRoundTripInvariant( } // ============================================================================= -// Field-bearing falsifiability layer for `compareDocumentXml_output_recursivelyWellformed` +// Field-bearing falsifiability layer for the Tier 2 residual axiom // -// The new Tier 2 axiom in `verification/lean/LeanSpike/Spec.lean` asserts that -// inplace comparison output satisfies `recursivelyWellformed`: the whole -// document passes `validateFieldStructure`, AND every `w:ins` / `w:del` / -// `w:moveFrom` / `w:moveTo` wrapper subtree is `fieldContextNeutral`. +// The current (post-PR-B) axiom in `verification/lean/LeanSpike/Spec.lean` is +// `compareDocumentXml_output_preservation_friendly`: it asserts only that the +// inplace combined output is *preservation-friendly* — its document-level walk +// and begin/end balance are unchanged by accept/reject. That weaker shape is +// what `assertFieldInvariant` already checks (via `validateFieldStructure` on +// the accepted and rejected outputs). // -// The fast-check generators above are field-free and only check the *consequence* -// of the axiom (validateFieldStructure post-accept/reject). The single fixture -// case below exercises a TS-side analogue of the *precondition* itself against -// the live engine — a falsifiability layer, NOT empirical grounding for a -// universal claim. +// `assertRecursivelyWellformed` (below) additionally checks the STRICTER +// `fieldContextNeutral ∀ ctx` property per wrapper subtree. The current engine +// satisfies this stronger property because it emits whole field sequences as +// single track-change wrappers (`inPlaceModifier.ts:717, 938, 1505, 1671, +// 1957, 2300`). When ECMA-376 fragmentation conformance lands (#217), +// fragmented wrapper subtrees will NOT satisfy `∀ ctx` neutrality and this +// over-check will need to be removed or relaxed. Until then it serves as an +// audit gate that the engine has not regressed into emitting partial-wrapper +// fragments unexpectedly. // ============================================================================= async function buildFieldDocx(bodyXml: string): Promise { diff --git a/verification/lean/LeanSpike/Spec.lean b/verification/lean/LeanSpike/Spec.lean index 6330f0b..bac1ed1 100644 --- a/verification/lean/LeanSpike/Spec.lean +++ b/verification/lean/LeanSpike/Spec.lean @@ -58,56 +58,51 @@ axiom extractTextWithParagraphs : OoxmlDoc → String Remains axiomatic — owned by the `inv_rt_001` successor change. -/ axiom normalizeText : String → String -/-- **Residual obligation.** This repo's inplace atomizer output (`compareDocumentXml` - in inplace mode, `pipeline.ts:635-650` then the inplace path at `pipeline.ts:669`) - is recursively well-formed under the stack-valued field context — i.e. it - satisfies `Tier2.FieldStructure.recursivelyWellformed`: the whole document - passes `validateFieldStructure`, and every `w:ins` / `w:del` / `w:moveFrom` / - `w:moveTo` wrapper subtree (transitively) is `fieldContextNeutral`. +/-- **Residual obligation.** This repo's inplace atomizer output + (`compareDocumentXml` in inplace mode, `pipeline.ts:635-650` then the + inplace path at `pipeline.ts:669`) is *preservation-friendly* under the + stack-valued field context — i.e. it satisfies + `Tier2.AcceptReject.preservationFriendly`: the whole document passes + `validateFieldStructure`, AND the document-level walk and begin/end balance + are unchanged by `accept` and `reject`. This axiom is the single load-bearing assumption behind the `inv_field_001` closure. Tier 3 will discharge it by modeling `compareDocumentXml` definitionally. - **Why `fieldContextNeutral` (the `∀ ctx` strength) holds for THIS engine.** - safe-docx's current inplace atomizer emits whole field sequences as a single - track-change wrapper (`` or `` around a complete - begin/instrText/separate/result/end run sequence), never splitting a field - across wrapper boundaries. See: - * `inPlaceModifier.ts:717` — `getAtomRuns` returns all field runs as one unit. - * `inPlaceModifier.ts:938` — deleted field replay wraps cloned field runs in a - single ``. - * `inPlaceModifier.ts:1505, 1671` — split logic explicitly skips collapsed - fields and field characters. - * `inPlaceModifier.ts:1957, 2300` — insert/move-destination handlers wrap - the whole atom-run set in a single wrapper. - * `collapsed-field-inplace.test.ts:211` — tests assert multi-run complete - field wrappers, not partial wrappers. - - **Why this is engine-specific, not spec-conformant.** Under ECMA-376 Part 4 - (DeletedFieldCode and fldChar topics), a fully conformant track-change - emitter MUST fragment fields across wrapper boundaries when a field is - modified: `w:fldChar` is strictly barred from ``, so a modified field - has its `w:fldChar begin/separate/end` markers unwrapped at the run-sibling - level while ``/`` wrap only the changed `w:instrText` / - `w:delInstrText` payloads. Such fragmented wrapper subtrees are NOT - `fieldContextNeutral` under `∀ ctx`. If/when the safe-docx engine becomes - ECMA-376-conformant for field modifications (tracked in #217), this axiom - and the per-subtree neutrality predicate will need to be replaced with a - document-level preservation property; the Lean proof in - `Tier2.InvFieldOne` will be refactored accordingly. - - Evidence as of this change is the existing field-free fast-check bridge - cases (`packages/docx-core/src/integration/lean-spec-bridge.test.ts` - explicitly excludes field-bearing inputs and only checks the *consequence* — - `validateFieldStructure` post-accept/reject — not the recursive precondition - itself) plus dedicated field-bearing fixtures added by this change as a - falsifiability layer (NUMPAGES insertion, NUMPAGES deletion). The axiom is - engine-specific to this repo's inplace atomizer, universal in `(a, b)`, and - load-bearing — NOT empirically grounded across the full ECMA-376 surface. -/ -axiom compareDocumentXml_output_recursivelyWellformed : + **Predicate strength choice — document-level, NOT per-subtree.** A previous + iteration of this axiom asserted the *strictly stronger* + `Tier2.FieldStructure.recursivelyWellformed` (per-subtree + `fieldContextNeutral ∀ ctx`). That stronger property happens to hold for the + current safe-docx engine, which emits whole field sequences as single + track-change wrappers (`inPlaceModifier.ts:717, 938, 1505, 1671, 1957, + 2300`; `collapsed-field-inplace.test.ts:211`). But ECMA-376 Part 4 requires + a conformant emitter to *fragment* fields across wrapper boundaries when a + field is modified — `w:fldChar` is strictly barred from ``, so a + modified field has its `w:fldChar begin/separate/end` markers unwrapped at + the run-sibling level while ``/`` wrap only the changed + `w:instrText` / `w:delInstrText` payloads. Such fragmented wrapper subtrees + are NOT `fieldContextNeutral` under `∀ ctx`. Engine fragmentation + conformance is tracked in #217. + + To make this axiom future-compatible with that work, we weaken the + precondition here to `preservationFriendly` (asserts only the *composed* + walk and balance equalities, not pointwise neutrality of each wrapper). + The legacy `Tier2.InvFieldOne.field_structure_preserved` lemma remains + proved (and the underlying `fieldContextNeutral` predicate still exists in + `Tier2.FieldStructure`) for audit traceability, but is no longer on the + path to `inv_field_001`. + + Evidence: the field-bearing bridge fixtures + (`packages/docx-core/src/integration/lean-spec-bridge.test.ts` — NUMPAGES + insertion, NUMPAGES deletion) check the *consequence* of + `preservationFriendly` (`validateFieldStructure` post-accept/reject), which + is what the engine actually emits today. The axiom is engine-specific to + this repo's inplace atomizer, universal in `(a, b)`, and load-bearing — + NOT empirically grounded across the full ECMA-376 surface. -/ +axiom compareDocumentXml_output_preservation_friendly : ∀ a b combined, compareDocumentXml a b = some combined → - Tier2.FieldStructure.recursivelyWellformed combined + Tier2.AcceptReject.preservationFriendly combined /-- INV-FIELD-001: field-structure preservation across accept-all and reject-all, scoped to the successful inplace-mode comparison output @@ -124,16 +119,16 @@ axiom compareDocumentXml_output_recursivelyWellformed : As of Tier 2 this theorem is **closed** with a complete machine-checked proof: it composes the named residual axiom - `compareDocumentXml_output_recursivelyWellformed` with the preservation - lemma `Tier2.InvFieldOne.field_structure_preserved`. -/ + `compareDocumentXml_output_preservation_friendly` with the document-level + preservation lemma `Tier2.InvFieldOne.field_structure_preserved_doc`. -/ theorem inv_field_001 : ∀ (a b combined : OoxmlDoc), compareDocumentXml a b = some combined → validateFieldStructure (acceptAllChanges combined) = true ∧ validateFieldStructure (rejectAllChanges combined) = true := by intro a b combined h - have hRW := compareDocumentXml_output_recursivelyWellformed a b combined h - exact Tier2.InvFieldOne.field_structure_preserved combined hRW + have hPF := compareDocumentXml_output_preservation_friendly a b combined h + exact Tier2.InvFieldOne.field_structure_preserved_doc combined hPF /-- INV-RT-001: paired round-trip text equality under normalization, with accept-all recovering `b` and reject-all recovering `a`. Scoped to the diff --git a/verification/lean/Tier2/AcceptReject.lean b/verification/lean/Tier2/AcceptReject.lean index 54cffbb..606f124 100644 --- a/verification/lean/Tier2/AcceptReject.lean +++ b/verification/lean/Tier2/AcceptReject.lean @@ -20,7 +20,7 @@ import Tier2.FieldStructure namespace Tier2.AcceptReject -open Tier2.OoxmlModel +open Tier2.OoxmlModel Tier2.FieldStructure /-! ### accept -/ @@ -84,4 +84,33 @@ def reject : Doc → Doc | [] => [] | p :: ps => ⟨p.pPr, renameBlocks (rejectBlocks p.body)⟩ :: reject ps +/-! ### Document-level preservation predicate + +`preservationFriendly` is the precondition needed to prove that `accept` and +`reject` both produce output that satisfies `validateFieldStructure`. It is +**strictly weaker** than the per-subtree `Tier2.FieldStructure.recursivelyWellformed` +(which required `∀ ctx` neutrality of every wrapper subtree): we only require +that the *composed walk* over `acceptBlocks`/`renameBlocks ∘ rejectBlocks` ends +in the same state as the walk over the original block sequence, and that +begin/end counts balance in the outputs. + +This shape is future-compatible with ECMA-376-conformant field fragmentation +(see #217): a `` wrapping only `[w:instrText]` is not neutral under +`∀ ctx`, but a document containing such a wrapper can still be +`preservationFriendly` if its overall walk is preserved by accept/reject. -/ + +/-- A document is *preservation-friendly* if accept and reject leave the + document-level field walk and begin/end balance unchanged. Replaces + `Tier2.FieldStructure.recursivelyWellformed`. -/ +def preservationFriendly (d : Doc) : Prop := + validateFieldStructure d = true ∧ + walkBlocks (.ok []) (acceptBlocks d.blocks) + = walkBlocks (.ok []) d.blocks ∧ + walkBlocks (.ok []) (renameBlocks (rejectBlocks d.blocks)) + = walkBlocks (.ok []) d.blocks ∧ + countBlocks Atom.isBegin (acceptBlocks d.blocks) + = countBlocks Atom.isEnd (acceptBlocks d.blocks) ∧ + countBlocks Atom.isBegin (renameBlocks (rejectBlocks d.blocks)) + = countBlocks Atom.isEnd (renameBlocks (rejectBlocks d.blocks)) + end Tier2.AcceptReject diff --git a/verification/lean/Tier2/InvFieldOne.lean b/verification/lean/Tier2/InvFieldOne.lean index d29888d..f1a66fd 100644 --- a/verification/lean/Tier2/InvFieldOne.lean +++ b/verification/lean/Tier2/InvFieldOne.lean @@ -420,4 +420,36 @@ theorem field_structure_preserved (d : Doc) (h : recursivelyWellformed d) : walkBlocks_rejectBlocks d.blocks hn] exact hvwalk +/-! ### Document-level preservation lemma (load-bearing as of PR-B) -/ + +/-- **`field_structure_preserved_doc`.** Document-level variant of the + preservation lemma: consumes the weaker `preservationFriendly` precondition + (which only asserts the *composed* walk and balance equalities, not + per-subtree `∀ ctx` neutrality) and concludes `validateFieldStructure` on + both `accept` and `reject` outputs. + + This is now the theorem that closes `inv_field_001` (composed with the + weakened residual axiom `compareDocumentXml_output_preservation_friendly` + in `Spec.lean`). The legacy `field_structure_preserved` above is retained as + a stronger lemma — `recursivelyWellformed d → preservationFriendly d` + follows from the lemmas above, so any caller able to discharge the stronger + precondition can also use the legacy theorem. We keep the legacy theorem + for audit traceability (`#217` engine conformance work would let us + re-establish it). -/ +theorem field_structure_preserved_doc (d : Doc) (h : preservationFriendly d) : + validateFieldStructure (accept d) = true ∧ + validateFieldStructure (reject d) = true := by + obtain ⟨hv, hAcceptWalk, hRejectWalk, hAcceptBal, hRejectBal⟩ := h + rw [validateFieldStructure, Bool.and_eq_true] at hv + obtain ⟨_, hvwalk⟩ := hv + refine ⟨?_, ?_⟩ + · rw [validateFieldStructure, Bool.and_eq_true] + refine ⟨?_, ?_⟩ + · rw [fldCharBalanced, accept_blocks, beq_iff_eq]; exact hAcceptBal + · rw [accept_blocks, hAcceptWalk]; exact hvwalk + · rw [validateFieldStructure, Bool.and_eq_true] + refine ⟨?_, ?_⟩ + · rw [fldCharBalanced, reject_blocks, beq_iff_eq]; exact hRejectBal + · rw [reject_blocks, hRejectWalk]; exact hvwalk + end Tier2.InvFieldOne