feat: diffing extension for comparing documents (SD-1324 and SD-89)#2306
feat: diffing extension for comparing documents (SD-1324 and SD-89)#2306luccas-harbour wants to merge 122 commits intomainfrom
Conversation
This function can then be reused when diffing paragraphs and runs. It helps identifying modifications instead of delete/insert pairs
Always maps starting/ending positions to the old document instead of the new one.
…rtedId - seed tracked-change existingIds with both runtime commentId and stable importedId - prevent duplicate tracked-change thread creation when grouped mark id matches an existing imported id - add created sync IDs (id, params.changeId, params.importedId) back into dedupe set during sync pass - add regression test for mixed-ID replay/sync scenario where commentId diverges but importedId remains live
- make useComment store docxCommentJSON as reactive state instead of a construction-time constant - update getValues() to return the current docxCommentJSON value - ensure replay-updated imported comment structure is reflected in translateCommentsForExport output - add unit test verifying getValues() returns updated docxCommentJSON after mutation
- add replay payload normalization for comment model creation (text -> commentText,
elements -> docxCommentJSON)
- apply normalization in replay ADD path before useComment(...)
- reuse the same normalization in replay UPDATE fallback when creating missing comments
- ensure replay-added imported comments keep DOCX-native body structure for export/
round-trip
- add regression test verifying replay ADD maps elements into docxCommentJSON
- map replay isDone updates to resolvedTime/resolvedBy* when payload resolved fields are null/missing - apply the same fallback during replay payload normalization and model updates - refactor shared isDone resolution fallback logic to avoid duplicated code - add regression test covering replay update payloads with isDone: true and null resolved fields, ensuring resolved state is persisted and can be exported
…ent thread - return the matched comment’s concrete id (prefer commentId) from replay update matching - avoid cross-document active-thread misselection when importedId overlaps across open documents - update replay regression coverage to assert setActiveComment receives the active document’s thread id
… reselection
- only sync active comment when activeCommentId is explicitly present in the event
payload
- avoid inferring active selection from replay add/update events to prevent repeated
focus/unfocus churn
- preserve explicit active clear behavior on replay deletions
- update replay update test expectation to reflect non-selecting replay events
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9b926bfd1d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
caio-pizzol
left a comment
There was a problem hiding this comment.
@luccas-harbour — solid approach overall, and the test coverage for the new diffing extension is impressive (co-located tests for every algorithm and replay module, plus end-to-end coverage).
Correctness: two things to look at — heading text changes can be missed by the diff, and tracked-change comment IDs can collide across documents. one question about the run properties skip when marks change.
DX: small cleanup opportunity with duplicated helpers.
Tests: coverage is strong. only minor gaps around the belongsToDocument legacy fallback in the comments store.
left a few inline comments with details.
| } | ||
| return JSON.stringify(oldNodeInfo.node.attrs) !== JSON.stringify(newNodeInfo.node.attrs); | ||
| } | ||
|
|
There was a problem hiding this comment.
if a heading's text changes but nothing else does, the diff misses it entirely. probably rare since most headings come in as styled paragraphs, but it can happen. worth a TODO?
There was a problem hiding this comment.
We don't actually use heading nodes anymore, we only use styled paragraphs so this doesn't really happen
| // Build a Set of existing tracked-change IDs for O(1) lookup. | ||
| // Include both runtime and imported IDs to avoid duplicate threads when | ||
| // replay/import flows remap commentId but marks still reference importedId. | ||
| const existingIds = new Set(); |
There was a problem hiding this comment.
this collects IDs from all documents, but Word numbers tracked changes starting from 1 in each file. two files can share the same ID, and the second file's comments get skipped. filtering by document here would fix it.
There was a problem hiding this comment.
sounds good, I implemented this
|
|
||
| // runProperties can overlap with mark-derived formatting. Apply these paths | ||
| // only when marks are unchanged to avoid double-applying style deltas. | ||
| if (!diff.marksDiff) { |
There was a problem hiding this comment.
when marks change, all run properties are skipped to avoid applying things twice. but properties like styleId aren't marks — they get lost too. is that an accepted tradeoff for now?
There was a problem hiding this comment.
yes, that's an accepted tradeoff for now. we'll need to move away from using marks for formatting (and use runProperties directly) in order to implement this.
| * @param right Second attrs object. | ||
| * @returns True when both attrs payloads serialize identically. | ||
| */ | ||
| const deepEquals = (left: unknown, right: unknown): boolean => { |
There was a problem hiding this comment.
there are three deepEquals implementations across the diffing code (here, replay-inline.ts:361, and attributes-diffing.ts:306) — two use JSON.stringify, one does a proper comparison. lodash.isEqual is already installed. worth picking one and reusing it?
There was a problem hiding this comment.
good call. I unified those
Summary
This PR delivers an end-to-end document compare + replay workflow, including comment
diff/replay and tracked-changes integration.
What’s Included
@superdoc/super-editorwithcompareDocumentsandreplayDifferencescommands.nodes, attrs, marks, comments, styles and numbering properties.
applyTrackedChangessupport in replay.explicitly requested.
documentId/fileId) in replay comment payloads.docxCommentJSON) on add/update.docxCommentJSONis reflected bygetValues()for export.isDonefallback to resolved fields when replay payload omits explicitresolved metadata.
Tests
computeDiff).
comments, replay-attrs, marks-from-diff).
comment.test.js).
including additional cases up to 11).
replayDiffs.test.js (passing).
Notes
mark run attrs on some inline text additions. This can be addressed once we stop using marks for formatting and use run properties directly.