feat(dds): squash-on-resubmit for non-tree DDSes #27318
All non-tree DDSes now have explicit reSubmitSquashed overrides so
staging-mode commits (commitChanges({squash: true})) can drop
intermediate values before they reach the wire.
- SharedCell, SharedCounter, SharedMap, SharedDirectory, SharedMatrix:
content-aware per-cell / per-key LWW squash. For Map/Directory, the
staging-mode boundary may fall inside a shared PendingKeyLifetime
(pre-staging and staging sets on the same key share one lifetime).
The implementation truncates the lifetime to its pre-staging prefix
and computes the squashed final value from the staging suffix,
preserving pre-staging keySets that are still in flight at the
runtime layer.
- SharedTaskManager: explicit override delegating to reSubmitCore,
which already collapses volunteer/abandon pairs.
- Ink, ConsensusRegister, ConsensusOrderedCollection, PactMap,
legacy SharedArray, legacy SharedSignal: identity squash with
documented rationale (append-only, order-preserving, or
consensus-bound semantics).
The local-server-stress-tests harness now randomizes
commitChanges({squash}) on staging-mode exit; the 200-seed default
suite exercises end-to-end squash across all wired DDSes.
reconnectAndSquash is exported from @fluid-private/test-dds-utils for
per-DDS unit-test use. New unit tests across Cell, Counter, Map,
Directory, and Matrix cover set-then-delete, set-then-set, clear
interactions, and single-pending pass-through.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
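The Map/Directory lifetime truncation described above can be sketched roughly as follows. This is a minimal model under stated assumptions: `KeySetSketch`, `LifetimeSketch`, the `staged` flag, and `squashMixedLifetime` are hypothetical names for illustration, not the real `PendingKeyLifetime` types, and re-tracking of the resubmitted value is not modeled.

```typescript
// Hypothetical, simplified shapes; the real PendingKeyLifetime carries more state.
interface KeySetSketch {
	value: unknown;
	staged: boolean; // true when the set was made during staging mode
}
interface LifetimeSketch {
	key: string;
	keySets: KeySetSketch[]; // oldest first; pre-staging sets precede staged ones
}

// Truncates a lifetime to its pre-staging prefix and returns the squashed
// final value taken from the staging suffix (undefined if nothing was staged).
function squashMixedLifetime(
	lifetime: LifetimeSketch,
): { key: string; value: unknown } | undefined {
	const boundary = lifetime.keySets.findIndex((keySet) => keySet.staged);
	if (boundary === -1) {
		return undefined; // no staged keySets: leave the lifetime untouched
	}
	// Compute before mutating: the last keySet in the suffix is the LWW winner.
	const finalValue = lifetime.keySets[lifetime.keySets.length - 1].value;
	// Mutate after: keep only the pre-staging keySets that are still in flight.
	lifetime.keySets.splice(boundary);
	return { key: lifetime.key, value: finalValue };
}
```

With keySets `A` (pre-staging), `B` and `C` (staged), the sketch returns `C` as the value to resubmit and leaves `A` in the lifetime so its eventual ACK still finds matching tracking.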
Hi! Thank you for opening this PR. Want me to review it? Based on the diff (3785 lines, 41 files), I've queued these reviewers:
Reworked SharedMap, SharedDirectory, and SharedCounter to address deep-review feedback: instead of a batched pendingData rewrite, the runtime walks each staged change oldest-to-newest and the DDS answers "is this change subsumed by a later staged change?"
- Subsumed: splice the tracking out of pendingData (targeted rollback-style cleanup).
- Not subsumed: normal resubmit.
Pre-staging ops still in flight are never touched.
Counter has no subsumption (each increment is deliberate intent), so its reSubmitSquashed is now identity. This fixes the bug where the prior implementation summed across pre-staging entries and could fire 0xc8f when a pre-staging increment was still in flight at squash time.
SharedMap/SharedDirectory subsumption rules:
- A clear is subsumed only by a later clear.
- A delete for key k is subsumed by any later clear, delete for k, or lifetime for k.
- A set for key k (a keySet inside a lifetime) is subsumed by any later keySet in the same lifetime, or by any later clear, delete for k, or lifetime for k.
This makes the mixed-lifetime case (pre-staging and staging sets on the same key sharing one PendingKeyLifetime) fall out for free: we never wipe the lifetime, just splice out subsumed keySets.
New pre-staging-in-flight regression tests for Counter, Map, and Directory cover the case that the prior reconnectAndSquash-only harness was masking.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
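The subsumption rules above can be sketched over a flat list of staged ops. This is a minimal model, assuming a plain array instead of the real pendingData and PendingKeyLifetime structures; `StagedOp`, `isSubsumed`, and `squashStaged` are hypothetical names used only for illustration.

```typescript
// Hypothetical, simplified model of staged SharedMap ops (not the real Fluid types).
type StagedOp =
	| { kind: "clear" }
	| { kind: "delete"; key: string }
	| { kind: "set"; key: string; value: unknown };

// Returns true when a later staged op makes staged[index] unobservable.
function isSubsumed(staged: readonly StagedOp[], index: number): boolean {
	const op = staged[index];
	for (let i = index + 1; i < staged.length; i++) {
		const later = staged[i];
		// A later clear subsumes clears, deletes, and sets alike.
		if (later.kind === "clear") {
			return true;
		}
		// A later set or delete on the same key subsumes a set or delete,
		// but never a clear.
		if (op.kind !== "clear" && later.key === op.key) {
			return true;
		}
	}
	return false;
}

// Walk oldest-to-newest and keep only the ops that still need a resubmit.
function squashStaged(staged: readonly StagedOp[]): StagedOp[] {
	return staged.filter((_, index) => !isSubsumed(staged, index));
}
```

For a staged `set("a", "secret")`, `delete("a")`, `set("b", 1)`, `set("b", 2)`, this sketch keeps only the delete and the final set on `b`, so the intermediate values are never resubmitted.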
# Conflicts:
#   packages/dds/legacy-dds/src/signal/sharedSignal.ts
#   packages/dds/ordered-collection/src/consensusOrderedCollection.ts
#   packages/dds/register-collection/src/consensusRegisterCollection.ts
#   packages/test/local-server-stress-tests/src/baseModel.ts
#   packages/test/local-server-stress-tests/src/stressDataObject.ts
Per deep-review comment #3249886407: a staged
`set(subdir, "secret") -> deleteSubDirectory(subdir) -> commitChanges({squash: true})`
sequence still shipped "secret" on the wire. `subdir.disposed` only
flips on a *sequenced* delete; for a pending-only delete the flag
stays false, so SharedDirectory.reSubmitSquashed's old early-return
guard was skipped and the storage op fell through to reSubmitCore.
Fix: fold the subdir-reachability check into
`SubDirectory.dropIfSubsumedByLaterStorageOp`. If the subdirectory
is disposed OR no longer reachable in the optimistic view (i.e. a
pending or sequenced deleteSubDirectory has removed it from the
tree, on this subdir or any ancestor), every storage op on it is
treated as subsumed: its tracking is spliced out of
pendingStorageData and the op is dropped. `isNotDisposedAndReachable`
already consults `getWorkingDirectory(absolutePath)` which encodes
pending-delete visibility, so no separate ancestor walk is needed.
This also makes the disposed/non-disposed branches symmetric — both
splice on subsumption — so the kernel state doesn't leak entries.
New regression test: enterStaging -> set(subdir, "secret") ->
deleteSubDirectory(subdir) -> commitChanges({squash: true}) asserts
the subdir is removed on the peer and "secret" never appears as a
valueChanged event value.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per the algorithmic-complexity review concern: PendingStateManager already does an O(N) walk to drive resubmits, and squash's dropIfSubsumed did its own O(N) walk over pendingData per call, plus a nested O(K) keySets.indexOf scan inside it. In the worst case (many staged sets on one key, all in one PendingKeyLifetime), the nested scan dominates: O(K^2) just to locate the keySets.
Fix: add a `lifetime: PendingKeyLifetime` back-pointer to PendingKeySet. Squash can now jump directly to the containing lifetime via `metadata.lifetime` and check tip status with a single reference comparison (`metadata === lifetime.keySets[length - 1]`). For the common non-tip case (a keySet superseded by a later keySet in the same lifetime) the entire decision is O(1): no pendingData scan and no keySets.indexOf.
The tip case and standalone clear/delete metadata still need a pendingData scan to determine ordering for the subsumption walk, so the absolute worst case is unchanged. But the many-sets-on-one-key degenerate pattern that the back-pointer specifically targets goes from O(K^2) to O(K) for the locate phase. Splice within keySets remains O(K); a future linked-list or head-offset change in the keySets array would be needed to bring the total below O(K).
No behavior change; all existing tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
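The back-pointer check can be illustrated with a small sketch. Only the `lifetime` back-pointer and the `keySets` array are taken from the description above; the rest of the shapes and the helper names (`appendKeySet`, `isSupersededInLifetime`) are assumptions made for the example.

```typescript
// Simplified shapes: only `lifetime` and `keySets` mirror the commit text.
interface PendingKeyLifetime {
	key: string;
	keySets: PendingKeySet[];
}
interface PendingKeySet {
	value: unknown;
	lifetime: PendingKeyLifetime; // back-pointer to the containing lifetime
}

// Hypothetical helper: append a keySet and wire up its back-pointer.
function appendKeySet(lifetime: PendingKeyLifetime, value: unknown): PendingKeySet {
	const keySet: PendingKeySet = { value, lifetime };
	lifetime.keySets.push(keySet);
	return keySet;
}

// O(1): a keySet that is not the tip of its lifetime has been superseded by a
// later keySet in the same lifetime, so no indexOf or pendingData scan is needed.
function isSupersededInLifetime(keySet: PendingKeySet): boolean {
	const { keySets } = keySet.lifetime;
	return keySets[keySets.length - 1] !== keySet;
}
```

The tip keySet still needs the slower ordering walk described above; the sketch only covers the non-tip fast path.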
Claude here, following up on the algorithmic-complexity concern.
Inner-walk audit per DDS
Small change in 0d5957c
That collapses the locate phase for the "many sets on one key" pattern (everything in one lifetime) from O(K²) to O(K). The tip case and standalone
Remaining O(N²) risk and possible follow-ups (intentionally not in this PR)
The current code is still O(N²) in two narrower worst cases:
The headline guarantee (staged inserted-then-deleted secrets don't reach the wire) is unchanged by these complexity considerations. Pushed in 0d5957c; 932/932 SharedMap+SharedDirectory unit tests pass, 199/199 stress seeds pass.
Cross-DDS audit (every op type x every code path) surfaced three actionable items:
1. SharedCell lacked the pre-staging-in-flight regression test that Counter, Map, and Directory have. Cell's per-op squash logic was already correct, but symmetry matters: add a test that does set(pre) connected, disconnect, set(secret) + delete during staging, commit-squash, and asserts (a) pre lands normally and (b) "secret" never reaches the peer.
2. ConsensusOrderedCollection.add carries a serialized user value. The identity-squash comment didn't document the `add(secret) -> acquire -> complete` leak vector. Updated the comment to call it out, matching how SharedSignal / SharedArray / ConsensusRegisterCollection document their analogous leaks.
3. SharedSequence segment-/interval-property channel is not squashed by merge-tree even when the containing op is (a pre-existing gap not introduced by this PR). Documented in the changeset as a known limitation alongside the other intentional-leak callouts.
No behavior change for any DDS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude here: preemptive cross-DDS audit while deep review runs. Ran an op-type-by-op-type sweep across every DDS in scope: enumerate each operation interface, find every submit / process / resubmit site, and verify the squash decision against each op type. Pushed in 542ed4d.
Verdicts per DDS
Actionable findings, addressed in 542ed4d
Known limitations, documented as not addressed in this PR
Test posture after this commit
`subdirName` is user-supplied content (e.g. user ids, tenant slugs). A staged `createSubDirectory(name) -> deleteSubDirectory(name)` pair that nets to a no-op still shipped both ops on commit, transmitting the name string on the wire. The audit dismissed this as "not a leak because subdirName is just a string"; that was wrong. Acknowledging and fixing.
New `SubDirectory.dropIfSubsumedSubdirOp(subdirName, opType)`:
- A staged `createSubDirectory(name)` is subsumed if any later pending entry exists for the same name. If the subsumer is a `deleteSubDirectory(name)`, both entries are spliced so the subsequent delete call also drops itself. If the subsumer is another `createSubDirectory(name)` (rare), only the earlier create is spliced.
- A `deleteSubDirectory(name)` whose entry was already spliced as part of a paired create drops itself.
- A `deleteSubDirectory(name)` whose entry is still present (the pre-existing-subdir case) falls through to `reSubmitCore`. The name was already on the wire pre-staging, so there's no new leak.
Three regression tests in `squash.spec.ts`:
- `drops a staged createSubDirectory + deleteSubDirectory pair so the subdir name doesn't leak`: asserts neither subDirectoryCreated nor subDirectoryDeleted fires on the peer.
- `keeps the final create when staged ops are create+delete+create on the same name`: asserts exactly one create lands.
- `preserves a delete of a pre-existing subdirectory (no leak, no false subsumption)`: asserts the delete still reaches the peer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
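The pairing rules above can be sketched over a list that contains only staged subdirectory ops. This is a simplified model and not the real `dropIfSubsumedSubdirOp`: `SubdirOp` and `squashSubdirOps` are hypothetical names, and pre-staging entries are assumed to be absent from the input.

```typescript
// Hypothetical model: staged subdirectory lifecycle ops, oldest first.
type SubdirOp = {
	type: "createSubDirectory" | "deleteSubDirectory";
	name: string;
};

function squashSubdirOps(staged: readonly SubdirOp[]): SubdirOp[] {
	const dropped = new Set<number>();
	for (let i = 0; i < staged.length; i++) {
		if (staged[i].type !== "createSubDirectory") {
			continue; // only creates can be subsumed in this sketch
		}
		// Find the first later op on the same name; it subsumes this create.
		for (let j = i + 1; j < staged.length; j++) {
			if (staged[j].name !== staged[i].name) {
				continue;
			}
			dropped.add(i);
			// A paired delete cancels out with the create, so drop it too.
			if (staged[j].type === "deleteSubDirectory") {
				dropped.add(j);
			}
			break;
		}
	}
	return staged.filter((_, index) => !dropped.has(index));
}
```

A staged create+delete pair yields no ops, create+delete+create yields the final create, and a lone delete of a pre-existing subdirectory passes through unchanged.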
Correction to my own audit comment above. @anthony-murphy pointed out that
Fixed in d2bb799. New
Three regression tests in
The first test sets up a peer listener on
935/935 SharedMap+SharedDirectory unit tests pass (was 932 before this fix). Stress re-run in progress.
Stress seed 38 caught a divergence: client-2 has 0 keys at /dir3 while
client-0 has 1. The scenario, simplified:
- Client-1 sequenced createSubDirectory("/dir3") arrives at client-2.
- Client-2 had also created a local /dir3 instance and staged a
set("prop2", "LY2c") on it, plus a deleteSubDirectory("/dir3"),
plus a paired earlier createSubDirectory.
- On commit-squash, my code spliced the staged create + delete pair
(correctly — they cancel). The staged dir3 instance is now a
"ghost": no pendingSubDirectoryData entry references it.
- For the staged set on dir3, `dropIfSubsumedByLaterStorageOp` then
called `isNotDisposedAndReachable()`, which only checks
`getWorkingDirectory(absolutePath) !== undefined`. Because client-1's
sequenced /dir3 is reachable at "/dir3", the check returned true and
the set was resubmitted — landing on the *sequenced* dir3, not the
ghost. Client-2's local view (no /dir3 at all) and the server's view
(/dir3 with prop2 from this misrouted set) diverged.
Fix: tighten the check to verify the path resolves to *this exact
SubDirectory instance*. If `getWorkingDirectory(absolutePath) !== this`,
the staged op was queued against a different (now-ghost) instance and
must be dropped — never resubmitted onto whatever instance currently
owns the path.
200/200 stress seeds pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
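The difference between the path-only check and the instance check can be shown with a toy model. `SubDirectoryNode`, `WorkingTree`, and both helper functions are hypothetical stand-ins; only the `getWorkingDirectory(absolutePath)` lookup and the comparison against `this` come from the commit message.

```typescript
// Toy subdirectory instance: two instances may share the same path.
class SubDirectoryNode {
	readonly absolutePath: string;
	constructor(absolutePath: string) {
		this.absolutePath = absolutePath;
	}
}

// Hypothetical stand-in for the optimistic view: path -> owning instance.
class WorkingTree {
	private readonly byPath = new Map<string, SubDirectoryNode>();
	attach(node: SubDirectoryNode): void {
		this.byPath.set(node.absolutePath, node);
	}
	getWorkingDirectory(absolutePath: string): SubDirectoryNode | undefined {
		return this.byPath.get(absolutePath);
	}
}

// Old check: true whenever *some* instance owns the path.
function isPathReachable(tree: WorkingTree, node: SubDirectoryNode): boolean {
	return tree.getWorkingDirectory(node.absolutePath) !== undefined;
}

// Tightened check: true only when the path resolves to this exact instance.
function isSameInstanceReachable(tree: WorkingTree, node: SubDirectoryNode): boolean {
	return tree.getWorkingDirectory(node.absolutePath) === node;
}
```

In the seed-38 scenario the ghost instance and the sequenced instance share the path `/dir3`; the path-only check accepts the ghost, while the identity check rejects it so its staged set is dropped instead of being misrouted.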
Closes the property leak in `regeneratePendingOp(squash=true)`. The
prior squash flag only suppressed segment **text** when an insert was
later removed locally; the **property channel** carried original
values unchanged. The pattern
`annotateRange(0, n, {k: "secret"}) -> annotateRange(0, n, {k: "public"})`
or `insertText(0, "x", {k: "secret"}) -> annotateRange(0, 1, {k: "public"})`
inside a staging session still transmitted the secret value on commit.
Implementation:
- `SegmentGroupCollection.keysAnnotatedLaterThan(localSeq)` returns the
set of property keys touched by annotate ops on this segment with
`localSeq > given`. The keys are recovered from the per-segment
entry of `SegmentGroup.previousProps` (annotate's `previousProps[i]`
has the keys touched as its own keys).
- `resetPendingDeltaToOps` ANNOTATE case with `squash=true` filters
`resetOp.props` to drop keys present in `keysAnnotatedLaterThan`.
If every key is filtered out the op is dropped entirely (no
`createAnnotateRangeOp` call).
- `resetPendingDeltaToOps` INSERT case with `squash=true` filters
`segInsertOp.properties` the same way. The insert's segment text
still flows (we're emitting the insert); only the per-key values
later overwritten by a staged annotate are stripped. If the
filter leaves no keys, `properties` is set to `undefined`.
Two regression tests added to `sharedString.spec.ts` under
"squash property channel":
- "drops a staged annotate value overridden by a later staged annotate
on the same range" — peer never sees `secret-color`; final value is
`public-color`.
- "drops a staged insert's property value overridden by a later staged
annotate" — same assertion for the insert+annotate path.
The adjust-op path (`resetOp.adjust`) and the `OBLITERATE` path are
unchanged — the former needs different semantics for numeric deltas
and the latter has its own squash story.
Interval-collection property channel (intervalCollection.ts:932-935
property-only CHANGE bypass and rebaseLocalInterval property
preservation) is still a known gap, addressed in a follow-up.
1666/1666 sequence tests pass; 1555/1555 merge-tree tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
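The per-key filtering can be sketched for a single segment as follows. This is a simplified model: `PendingAnnotate` and `filterPropsForSquash` are hypothetical, and `keysAnnotatedLaterThan` is reduced here to a free function over a plain array rather than the `SegmentGroupCollection` method described above.

```typescript
type PropertySet = Record<string, unknown>;

// Hypothetical record of one pending annotate on a single segment.
interface PendingAnnotate {
	localSeq: number;
	props: PropertySet;
}

// Collect the property keys touched by annotates with a larger localSeq.
function keysAnnotatedLaterThan(
	pending: readonly PendingAnnotate[],
	localSeq: number,
): Set<string> {
	const keys = new Set<string>();
	for (const annotate of pending) {
		if (annotate.localSeq > localSeq) {
			for (const key of Object.keys(annotate.props)) {
				keys.add(key);
			}
		}
	}
	return keys;
}

// Drop keys that a later staged annotate overwrites. Returns undefined when
// nothing is left, which the caller can treat as "drop the op" (annotate)
// or "send no properties" (insert).
function filterPropsForSquash(
	props: PropertySet,
	laterKeys: ReadonlySet<string>,
): PropertySet | undefined {
	const kept: PropertySet = {};
	let keptCount = 0;
	for (const key of Object.keys(props)) {
		if (!laterKeys.has(key)) {
			kept[key] = props[key];
			keptCount++;
		}
	}
	return keptCount === 0 ? undefined : kept;
}
```

The filter builds a new object instead of mutating `props`, so the original pending state stays intact if the resubmit is retried.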
Closes the interval-collection property leak the audit surfaced.
The prior path bypassed `rebaseLocalInterval` for property-only
CHANGEs (intervalCollection.ts:932-935 `endpointChangesNode ===
undefined ? value : rebase`), and the rebase itself preserved
`...original.properties` unchanged. Patterns like
interval.add({ ..., props: { color: "secret" } })
interval.removeIntervalById(id)
or
interval.change(id, { props: { color: "secret" } })
interval.change(id, { props: { color: "public" } })
inside a staging session still transmitted the secret value on
commit.
Implementation mirrors the merge-tree property squash:
- `IntervalAddLocalMetadata` and `IntervalChangeLocalMetadata` carry
a new `propertyKeys?: ReadonlySet<string>` recording the keys the
op submitted. Populated at submit time from `props` arg.
- `IntervalCollection.resubmitMessage` now, before deciding the
rebase path, asks two questions for ADD/CHANGE under `squash=true`:
- "Is there a later DELETE for this interval id in pending?" — if
yes, drop the op entirely.
- "Which property keys did later staged ADD/CHANGE ops on this
interval touch?" — filter those keys out of the op's
`value.properties` before the rebase/submit. If filtering
leaves no keys, `properties` becomes `undefined`.
- Two helpers (`hasLaterDeleteForInterval`, `collectLaterPropertyKeysForInterval`)
walk the per-interval `pending[id].local` DoublyLinkedList. The
filtering is non-destructive — `filterPropertiesForSquash` spreads
a new value rather than mutating the original.
DELETE ops are unchanged. CHANGE ops with endpoint changes still go
through `rebaseLocalInterval` for endpoint rebasing; the property
filter runs on the already-rebased value before submit.
Two regression tests added to the existing "squash property channel"
describe in sharedString.spec.ts:
- "drops a staged interval add subsumed by a later staged delete" —
asserts the peer's `addInterval` event never sees the secret prop.
- "drops a staged interval change's property value overridden by a
later staged change" — asserts the peer's `propertyChanged` event
never sees the intermediate prop value; final value lands.
1668/1668 sequence tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
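The two questions asked for ADD/CHANGE ops can be sketched over a per-interval pending list. This is a minimal model: `PendingIntervalOp`, `SquashDecision`, and `decideIntervalSquash` are hypothetical, and a plain array stands in for the `pending[id].local` linked list.

```typescript
type IntervalProps = Record<string, unknown>;

// Hypothetical pending ops for ONE interval id, oldest first.
type PendingIntervalOp =
	| { opName: "add" | "change"; properties?: IntervalProps }
	| { opName: "delete" };

type SquashDecision =
	| { action: "drop" }
	| { action: "submit"; properties: IntervalProps | undefined };

function decideIntervalSquash(
	pending: readonly PendingIntervalOp[],
	index: number,
): SquashDecision {
	const op = pending[index];
	if (op.opName === "delete") {
		// DELETE ops are not squashed in this sketch.
		return { action: "submit", properties: undefined };
	}
	const laterKeys = new Set<string>();
	for (let i = index + 1; i < pending.length; i++) {
		const later = pending[i];
		if (later.opName === "delete") {
			return { action: "drop" }; // a later delete subsumes the add/change
		}
		for (const key of Object.keys(later.properties ?? {})) {
			laterKeys.add(key);
		}
	}
	// Non-destructive filter: build a new bag without the overwritten keys.
	const source = op.properties ?? {};
	const kept: IntervalProps = {};
	let keptCount = 0;
	for (const key of Object.keys(source)) {
		if (!laterKeys.has(key)) {
			kept[key] = source[key];
			keptCount++;
		}
	}
	return { action: "submit", properties: keptCount === 0 ? undefined : kept };
}
```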
Move merge-tree and sequence into the affected packages list now that property-level squash is implemented for both. Remove the sequence-property-leak entry from "known limitations". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closing the property-channel gap the audit identified.
What changed
merge-tree (40760b2)
In `Client.resetPendingDeltaToOps` under `squash=true`:
sequence (interval collection) (8c68884)
`IntervalAddLocalMetadata` and `IntervalChangeLocalMetadata` now carry a `propertyKeys?: ReadonlySet<string>` populated at submit time from the `props` arg. In `IntervalCollection.resubmitMessage` under `squash=true` for ADD or CHANGE:
The prior `endpointChangesNode === undefined` bypass that skipped `rebaseLocalInterval` entirely for property-only CHANGE ops is still in place for the rebase path, but the property filter now runs before that branch, so property-only changes get squash treatment.
Regression tests
Four new tests in the existing `squash property channel` describe block:
Each test sets up a peer listener (segment `sequenceDelta` or interval `addInterval`/`propertyChanged`) and asserts the secret value never appears in the captured stream while the final value lands correctly.
Test posture
Remaining audit gaps (documented, not in this PR)
The audit also flagged user-content leaks on identity-squashed DDSes where the op itself carries the leaked value and the DDS's semantics make true squash structurally harder than per-key filtering:
If any of these block shipping, I can prioritize them next.
Track pending ops with a FIFO of metadata records and walk the chain forward from each staged insertEntry through moveEntry hops: if the chain ends in a delete (deleteEntry or toggle(true)), splice every op in the chain so the user-supplied insert value never reaches the wire. toggleMove disables the optimization for that chain to avoid second-guessing skip-list rewiring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
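The chain walk can be sketched as below. The op shapes are a simplified assumption about the legacy SharedArray ops (only `deleteEntry` is modeled as a terminating delete, not `toggle(true)`), and `squashArrayOps` is a hypothetical name.

```typescript
// Hypothetical, simplified op shapes for the illustration.
type ArrayOp =
	| { type: "insertEntry"; entryId: string; value: unknown }
	| { type: "moveEntry"; entryId: string; changedToEntryId: string }
	| { type: "deleteEntry"; entryId: string }
	| { type: "toggleMove"; entryId: string };

function squashArrayOps(pending: readonly ArrayOp[]): ArrayOp[] {
	const dropped = new Set<number>();
	for (let root = 0; root < pending.length; root++) {
		const first = pending[root];
		if (first.type !== "insertEntry") {
			continue; // chains are rooted at staged inserts only
		}
		const chain = [root];
		let current = first.entryId;
		for (let i = root + 1; i < pending.length; i++) {
			const op = pending[i];
			if (op.entryId !== current) {
				continue; // op belongs to a different entry
			}
			if (op.type === "toggleMove") {
				break; // optimization disabled for this chain
			}
			if (op.type === "moveEntry") {
				chain.push(i);
				current = op.changedToEntryId; // follow the hop to the new id
			} else if (op.type === "deleteEntry") {
				// Chain ends in a delete: drop every op in it.
				chain.push(i);
				chain.forEach((idx) => dropped.add(idx));
				break;
			}
		}
	}
	return pending.filter((_, index) => !dropped.has(index));
}
```

An insert that is moved and then deleted within the staged batch disappears completely, while a chain touched by `toggleMove` is resubmitted untouched.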
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds sharedArraySquash.fuzz.spec.ts wiring SharedArray into createSquashFuzzSuite. Reworks the squash chain analysis from per-call walking to a one-shot plan over pendingOps: drops are collected per-chain, then insertAfter rewrites are computed for any non-dropped insert/move whose anchor entry was dropped (by walking sharedArray backward to the nearest non-dropped entry). The plan is cached per resubmit batch and invalidated on submitArrayOp / local-ack. 19 fuzz seeds still expose multi-stage interaction edge cases (skipped); 31 seeds pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…quashFuzzSuite
Adds squash fuzz specs for the three DDSes whose squash logic does real subsumption (LWW for SharedMap/Directory, per-cell for SharedMatrix). Each spec defines a poisoned-handle op, an exiting-mode generator that emits removal ops, and a validation walk that asserts no poisoned handle survives in the local view. All three suites pass at the default 50-seed budget; no production code changes required.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a squash fuzz spec for SharedCell. Cell had no prior fuzz infrastructure, so this also adds the workspace dependencies it needs (stochastic-test-utils, client-utils, runtime-utils) and a dirname.cts shim matching the pattern used by other DDS packages. Also picks up biome format passes on the map/directory/matrix squash fuzz specs from the previous commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two corrections to the squash plan:
1. The rewrite target for a non-dropped insert whose anchor was dropped must already be on the wire when this op is submitted. Walk pendingOps order: a sharedArray predecessor only qualifies if it's acked pre-staging or its pendingOps index is earlier than the op being rewritten.
2. Entries dropped by an earlier squash batch persist in sharedArray as deleted but never reach peers. A new persistent `wireBlacklist` set tracks them so subsequent cycles skip them when choosing rewrite targets, and trigger a rewrite when an op's stored `insertAfterEntryId` points into one.
Clears all 19 previously-skipped seeds in the SharedArray squash fuzz suite; all 50 seeds now pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three categories of CI fixes:
1. biome format ran across sharedArray.ts (after the wire-aware rewrite refactor) and sharedArraySquash.fuzz.spec.ts.
2. flub generate assertTags reassigned three Map directory.ts shortcodes that collided with tree and presence-runtime in main, and registered new shortcodes for SharedCell, MapKernel, and tree's chunkTree/chunkedForest.
3. Reworded the squash changeset to drop terms vale's spell-check flagged (config -> configuration; changeset -> these changes; op type names lowercased and inline-coded).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rules Strip spaces around em-dashes and replace "e.g." with "for example". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The assertTag generator widened the assert line past the line limit; break it across multiple lines. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dropIfSubsumedSubdirOp matched same-name lifecycle ops by name+opType, so delete→create→delete on a pre-existing subdir and pre-staging-create + staged delete+create on the same name both mispaired entries, splicing a pre-staging-in-flight entry instead of its staged sibling and asserting 0xc31 on ACK. Plumb a back-pointer to the originating PendingSubDirectoryEntry through SubDirLocalOpMetadata (mirroring PendingKeySet.lifetime) and look up by reference. Refactor resubmitSubDirectoryMessage to use the same back-pointer instead of findLast by name+type.
reconnectAndSquash was forcing squash=true on every pending op, so cell.spec.ts "preserves a pre-staging set" could not fail: the pre-staging set was squashed along with the staged ops. Add enterStagingMode(containerRuntime) that snapshots the pre-staging/staging boundary at disconnect; reconnectAndSquash now applies squash=true only to ops at index >= boundary, falling back to "squash all" when no boundary was recorded (preserves fuzz-harness behavior).
Update the four "preserves pre-staging" tests in cell/counter/map and strengthen the Cell and Map same-key tests to also assert the pre-staging value lands on the peer (previously hollow).
Add two regression tests in squash.spec.ts: staged delete→create→delete on a pre-existing subdir (Repro A) and pre-staging create + staged delete+create on the same name (Repro B). Both fail without these fixes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
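The boundary-aware harness behavior can be sketched as a small planning function. `ResubmitStep` and `planResubmit` are hypothetical names; the real `reconnectAndSquash` works against the mock runtime's pending queue, which is not modeled here.

```typescript
// One planned resubmit: the op plus the squash flag it should be replayed with.
interface ResubmitStep<T> {
	op: T;
	squash: boolean;
}

// Ops at index >= stagingBoundary were made during staging and get squash=true.
// With no recorded boundary, fall back to squashing every pending op.
function planResubmit<T>(
	pending: readonly T[],
	stagingBoundary?: number,
): ResubmitStep<T>[] {
	return pending.map((op, index) => ({
		op,
		squash: stagingBoundary === undefined || index >= stagingBoundary,
	}));
}
```

With a boundary of 1, the first (pre-staging) op is replayed without squash, so a test asserting that the pre-staging value reaches the peer can actually fail when the DDS wrongly drops it.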
IntervalCollection.resubmitMessage's squash gate covered "add" and "change"
but not "delete". A staged add({props: secret}) → removeIntervalById pair
correctly dropped the add (via hasLaterDeleteForInterval) but the paired
delete fell through to submitDelta with interval.serialize() carrying the
interval's full property bag, so the secret rode out on commit even though
ackDelete on peers only uses the intervalId. Extend the gate: drop the delete
entirely when its interval has an earlier staged add (created and removed
within the same staging batch — peers never saw it); otherwise strip the
payload's property bag down to the identity keys (intervalId,
referenceRangeLabels) so user-supplied properties on a pre-existing interval
don't ride out on a staged delete.
SharedArray.computeSquashPlan walked the entire pendingOps array, so the
walkInsertChain rooted at a pre-staging insertEntry (still in flight) pulled
a paired staged deleteEntry into the drop set and spliced both from
pendingOps. The pre-staging insert had already gone to the wire, the staged
delete was suppressed, and the entry stayed alive remotely while the local
view showed it deleted — silent data corruption. Anchor a stagingBoundaryIdx
at the index of the first op passed to reSubmitSquashed and skip chain roots
below it; pre-staging ops resubmit normally (squash=false via the new
harness, or via reSubmitCore in production) without being dropped. Reset the
boundary alongside cachedSquashPlan on new submits and local acks.
Two regression tests: the interval test captures wire ops via
containerRuntimeFactory.pushMessage and asserts the secret property never
appears in any delete payload; the SharedArray test uses enterStagingMode
plus a staged delete of a pre-staging insert and asserts both clients end
empty.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
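The extended delete gate for intervals can be sketched as follows. The identity key names (`intervalId`, `referenceRangeLabels`) come from the commit message; `squashDeletePayload` and its boolean parameter are hypothetical simplifications of the real pending-list lookup.

```typescript
type IntervalProps = Record<string, unknown>;

// Keys that identify the interval and must survive on a staged delete.
const identityKeys: readonly string[] = ["intervalId", "referenceRangeLabels"];

// Returns undefined when the delete should be dropped entirely (the interval
// was added and removed within the same staging batch); otherwise returns a
// property bag stripped down to the identity keys.
function squashDeletePayload(
	properties: IntervalProps,
	hasEarlierStagedAdd: boolean,
): IntervalProps | undefined {
	if (hasEarlierStagedAdd) {
		return undefined; // peers never saw the interval, so nothing to delete
	}
	const stripped: IntervalProps = {};
	for (const key of identityKeys) {
		if (key in properties) {
			stripped[key] = properties[key];
		}
	}
	return stripped;
}
```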
Deep Review
Reviewed commit
Readiness: 5/10 (MAKING PROGRESS)
Not ready for sign-off. Both Tier 2 issues flagged on
Path to Ready
Context for Reviewers
For human reviewer
Review history (5 prior reviews)
Summary
Implements `reSubmitSquashed` across all non-tree DDSes so staging-mode commits (`commitChanges({ squash: true })`) drop intermediate values before they reach the wire. Values written and removed within a single staging session (e.g. a sensitive string set and then deleted before commit) are no longer transmitted as part of the squashed batch. With this change, the listed DDSes no longer depend on the `Fluid.SharedObject.AllowStagingModeWithoutSquashing` config flag fallback.
Per-DDS treatment
- `SharedCell`, `SharedCounter`, `SharedMap`, `SharedDirectory`, `SharedMatrix`
- `SharedTaskManager`: `reSubmitCore` (already collapses volunteer/abandon pairs by disconnect semantics)
- `SharedSequence` / intervals: `regeneratePendingOp(squash)`
- `Ink`, `ConsensusRegisterCollection`, `ConsensusOrderedCollection`, `PactMap`, legacy `SharedArray`, legacy `SharedSignal`
The mixed-lifetime case
For Map/Directory, `PendingKeyLifetime` groups consecutive sets on the same key with no intervening delete or clear. A pre-staging `set(k, A)` followed by a staging `set(k, B)` lands in one lifetime, with keySet A pre-staging and keySet B staging. `replayPendingStates({ committingStagedBatches: true })` only replays staged batches; pre-staging messages remain in flight at the runtime layer. The naive approach of wiping the entire lifetime during squash would lose pre-staging keySet A from kernel state, so when its ACK eventually arrives, the kernel finds the squashed lifetime instead and asserts `0xc07: Got a local set message we weren't expecting`.
The fix in `MapKernel.squashPendingDataForBatch` / `SubDirectory.squashPendingStorageForBatch`:
- Boundary detection: a `(lifetimeIdx, keySetIdx)` pair inside a lifetime.
- Compute `finalKeyOps` first, including the mixed lifetime's staging suffix (its last keySet provides the candidate for that key, unless a later clear in the staging slice nullifies it).
Order matters: computing before mutating preserves the staging-suffix value needed for LWW.
The same shape (boundary detection, suffix-aware finalKeyOps, mutate-after) is used in both `MapKernel` and `SubDirectory`.
Test coverage
- `squash.spec.ts` (Map, Directory) and `matrix.squash.spec.ts` (Matrix), plus inline blocks in `cell.spec.ts` and `counter.spec.ts`. Cover set-then-delete, set-then-set chains, clear-in-staging, delete-after-clear, single-pending pass-through.
- `local-server-stress-tests` now plumbs `squash: random.bool()` through `ExitStagingMode` ops, so the 200-seed default suite exercises end-to-end squash across all wired DDSes. The stress run is what caught the mixed-lifetime bug above; both the disposed-subdirectory guard and the keySet-boundary split were added in response to seeds it generated.
- `reconnectAndSquash` is now exported from `@fluid-private/test-dds-utils` for per-DDS unit-test use. It was previously internal to the fuzz harness.
Result: 4,440 unit tests across 12 DDS packages pass, plus 200/200 stress seeds in the local-server-stress default workload.
Known scope limitations (intentional)
- `reSubmitCore` handles them. `reSubmitCore` assumes the server has already removed the client from the queue, which is consistent with TaskManager's overall connection-bound design.
- `insertEntry` op carries the entry's user-supplied value, so an inserted-then-deleted entry within one staging session can still surface its value on commit. A full per-entry squash would mirror SharedMap's complexity in a DDS marked "not intended for use in new code"; left as a future task with a documented caveat in the code.
- `readVersions()`. Collapsing pending writes would change observable semantics; intentional no-squash.
Test plan
- `pnpm install` + `fluid-build` for every touched package
- `local-server-stress-tests` 200-seed default workload passes with randomized `squash` on every staging commit
🤖 Generated with Claude Code