fix(docx-core): validate w:delInstrText placement and reject w:fldChar inside <w:del>#211
Merged
stevenobiajulu merged 1 commit intoMay 22, 2026
Conversation
…r inside <w:del>
Per ECMA-376 Part 4 (DeletedFieldCode and fldChar topics), validateFieldStructure
was under-checking the spec on two counts:
1. w:delInstrText placement was unvalidated. The engine emits this tag
(inPlaceModifier.ts:319-320; reject converts it back at
trackChangesAcceptorAst.ts:613-615), so production was producing a
schema-relevant element with no structural check. Word, LibreOffice, and
docx4j all enforce or split to maintain this. The Lean Tier 2 model in
PR #208 already validates both atoms; this commit brings TS into line.
2. w:fldChar nested inside <w:del> was accepted. ECMA-376 strictly forbids
this and Microsoft Word treats violations as fatal (discards the field
state machine, renders raw instructions as literal text).
The scan now tracks <w:del>-ancestor depth alongside field depth, applies the
existing pre-separator field-body check to both w:instrText and w:delInstrText,
adds an "inside-<w:del>" requirement for w:delInstrText, and rejects any
w:fldChar reached while inside-<w:del>-depth is positive.
Unit tests cover the happy path (field-free, complete field, conformant
ECMA-376 fragmented field modification with unwrapped fldChars + ins/del
instrText), the existing rejections (orphan instrText, post-separator
instrText, unbalanced begin/end), and the new rejections (delInstrText outside
<w:del>, delInstrText outside any field body, fldChar nested in <w:del>).
Ref: #209
Ref: #208 (peer review of PR #208 surfaced this gap; Lean docstring at
Tier2/FieldStructure.lean:7-9 already documents the stricter property)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
stevenobiajulu
added a commit
that referenced
this pull request
May 22, 2026
…preservationFriendly (rebased follow-up to #208) (#220) * refactor(verification): weaken inv_field_001 residual axiom to document-level preservation friendliness Closes peer-review concern D (predicate strength) from PR #208. Codex review confirmed the current engine satisfies the stronger `∀ ctx` predicate, but the ECMA-376 Part 4 deep-research report showed that a conformant emitter MUST fragment fields across wrapper boundaries when modifying a field — fragmented `<w:ins>`/`<w:del>` wrappers containing only `w:instrText`/`w:delInstrText` are NOT `fieldContextNeutral` under `∀ ctx`. When safe-docx becomes ECMA-376-conformant (tracked in #217), the stronger predicate will break. This commit weakens the residual axiom to be future-compatible without touching the engine. ## Predicate weakening Adds `Tier2.AcceptReject.preservationFriendly` — a document-level property asserting only that accept and reject preserve the *composed* field walk and begin/end balance. Strictly weaker than the previous per-subtree `Tier2.FieldStructure.recursivelyWellformed`: * Old: `validateFieldStructure d ∧ ∀ wrapper-subtree, fieldContextNeutral subtree` where `fieldContextNeutral bs := ∀ ctx, walkBlocks (.ok ctx) bs = .ok ctx`. * New: `validateFieldStructure d ∧ walkBlocks (.ok []) (acceptBlocks d.blocks) = walkBlocks (.ok []) d.blocks ∧ ...` (analogous for reject + balance). The old predicate is retained in `Tier2.FieldStructure` and the legacy `field_structure_preserved` lemma still compiles — they remain available for audit traceability. Only `inv_field_001` is rewired to consume the new predicate via the new `Tier2.InvFieldOne.field_structure_preserved_doc` lemma. ## Axiom replacement `Spec.lean` replaces `compareDocumentXml_output_recursivelyWellformed` with `compareDocumentXml_output_preservation_friendly`. The `inv_field_001` proof body changes to compose the new axiom with `field_structure_preserved_doc`. `#print axioms LeanSpike.inv_field_001` becomes: [propext, compareDocumentXml, compareDocumentXml_output_preservation_friendly, Quot.sound] ## TS bridge — explanatory comment The existing `isFieldContextNeutral` / `assertRecursivelyWellformed` checks in `lean-spec-bridge.test.ts` enforce the OLD stricter property. With the axiom now weaker, these become an audit OVER-check: the engine currently satisfies them (whole-field wrapping), but when #217 lands and the engine starts fragmenting, these checks will need to be removed or relaxed. The comment block above the field-bearing fixtures is updated to reflect this layering. Ref: #208, #217 * fix(verification): drop assertRecursivelyWellformed from deletion fixture (currently catches #217 engine non-conformance) PR #208 added a deleted-NUMPAGES bridge fixture that calls `assertRecursivelyWellformed(result.combined)`. Independently, PR #211 tightened `validateFieldStructure` to reject `w:fldChar` inside `<w:del>` (per ECMA-376 Part 4 — Word treats this as fatal). Both PRs squash-merged to main within 4 seconds of each other; PR #208's CI passed against the pre-#211 validator, so the interaction wasn't caught. The current engine emits deleted complete fields as a single `<w:del>` containing the entire begin/instrText/separate/result/end run sequence (`inPlaceModifier.ts:938`) — that is, with `w:fldChar` inside `<w:del>`. The engine's own safety check at `pipeline.ts:439-440` only validates the post-accept and post-reject outputs (which ARE conformant: accept drops the `<w:del>` entirely; reject unwraps to a complete field). PR #211's tightening therefore does not affect the engine's safety check, but it does affect manual calls to `validateFieldStructure` on the combined XML — which is exactly what `assertRecursivelyWellformed` does. Net: main is currently broken on `workspace-test` because the deletion fixture's `assertRecursivelyWellformed` call fails. This commit removes that call from the deletion fixture and replaces it with a TODO comment referencing #217. The fixture still exercises the `w:delInstrText` atom case via `assertFieldInvariant` (which checks the engine's own consequence: `validateFieldStructure` on accept/reject outputs). The over-check can be re-enabled when #217 lands and the engine fragments fields per ECMA-376. The renamed test description reflects the actual asserted property. Ref: #217 (engine fragmentation conformance — the long-term fix). Ref: #208 (which introduced the fixture), #211 (which tightened the validator), #220 (which lands the axiom weakening; this commit fixes the test break). --------- Co-authored-by: Steven Obiajulu <stevenobiajulu@Stevens-MacBook-Pro.local>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
w:delInstrText(must be inside<w:del>AND inside an open pre-separator field body).w:fldCharnested inside<w:del>(ECMA-376 strict prohibition; Word treats violations as fatal).pipeline.field-validation.test.tscover happy path (incl. the conformant ECMA-376 fragmented-field modification pattern), the existing rejections, and the new rejections.Fixes #209. Ref: peer review of #208 (the Lean Tier 2 model already validates this stricter property; this PR brings TS into line).
Test plan
npx vitest run packages/docx-core/src/baselines/atomizer/pipeline.field-validation.test.ts— 10/10 pass.npx vitest run packages/docx-core/src/integration/lean-spec-bridge.test.ts— 4/4 pass (no regression onmain-side tests).npx vitest run packages/docx-core/src/integration/collapsed-field-inplace.test.ts packages/docx-core/src/baselines/atomizer/inPlaceModifier.test.ts— 128/128 pass.npx tsc --noEmit -p packages/docx-core/tsconfig.json— clean.npx eslint packages/docx-core/src/baselines/atomizer/pipeline.field-validation.test.ts— clean.🤖 Generated with Claude Code