perf(cell): use DoublyLinkedList for pendingMessageIds with list node on metadata#27415
perf(cell): use DoublyLinkedList for pendingMessageIds with list node on metadata#27415anthony-murphy wants to merge 3 commits into
Conversation
…t node on metadata
|
Hi! Thank you for opening this PR. Want me to review it? Based on the diff (104 lines, 2 files), I've queued these reviewers:
How this works
|
| /** | ||
| * The node in the pending message list corresponding to this operation (op). | ||
| * Holding a direct reference to the node enables O(1) removal from arbitrary | ||
| * positions in the pending list, which is required for future squash support. |
There was a problem hiding this comment.
Deep Review: ListNode<number> leaks onto the exported ICellLocalOpMetadata here, and the new import type { ListNode } from "@fluidframework/core-utils/internal" at the top of this file couples the cell package's interfaces module to a linked-list node type that is purely a runtime implementation detail.
Containment check: ICellLocalOpMetadata is not re-exported from packages/dds/cell/src/index.ts and the only literal construction is updated in cell.ts:306-317, so this is not a public-API break today. The cost is encapsulation and maintainability — embedding ListNode<number> on a package-shaped interface makes the metadata shape harder to mock or re-implement out of tree, and your own PR self-summary flagged this same risk. PR #12273's reviewer (scarlettjlee) historically pushed back on widening this exact interface with implementation details.
Suggested fix: declare a private ICellPendingLocalOpMetadata extends ICellLocalOpMetadata inside cell.ts and use that as the concrete metadata type at the three creation/consumption sites — interfaces.ts then doesn't need the core-utils/internal import. If pendingNode must remain on the exported type for a reason not visible in the diff, mark it optional and tag it @internal. The O(1) ACK and the future O(1) arbitrary-removal capability are preserved either way.
| ): ICellLocalOpMetadata { | ||
| const pendingMessageId = ++this.messageId; | ||
| this.pendingMessageIds.push(pendingMessageId); | ||
| const { first: pendingNode } = this.pendingMessageIds.push(pendingMessageId); |
There was a problem hiding this comment.
Deep Review: DoublyLinkedList.push(...items) returns a ListNodeRange spanning all newly appended nodes (packages/common/core-utils/src/list.ts:196-205). For a single-item push, range.first === range.last, so this is correct today — but push appends at the tail, so last reads correctly as "the node I just appended." If a future caller ever batches multiple ids in one push, first would silently grab the head of the inserted range instead of the most recently appended node.
| const { first: pendingNode } = this.pendingMessageIds.push(pendingMessageId); | |
| const { last: pendingNode } = this.pendingMessageIds.push(pendingMessageId); |
…tests Address deep-review findings on PR microsoft#27415: - Stop leaking `ListNode<number>` onto `ICellLocalOpMetadata`. Move `pendingNode` to a private `ICellPendingLocalOpMetadata` subtype declared inside cell.ts. The public/internal interfaces module no longer imports from `@fluidframework/core-utils/internal`, decoupling it from a runtime implementation detail. - Use `last` instead of `first` when destructuring the single-node range returned by `DoublyLinkedList.push(...)`. They are equivalent today (single-item push), but `first` would be incorrect if a future change batched multiple pending ids into one push call. - Add two regression tests in cell.spec.ts: - drain-many-ACKs: queue several local sets and ACK them incrementally via processSomeMessages(1); asserts no assert-tag fires (0x471) and final value converges across clients in order. - rollback-with-multiple-pending-ops: enqueue three pending sets under TurnBased flush mode and run containerRuntime.rollback(), confirming the LIFO pop against the expected pending id does not throw.
Deep ReviewReviewed commit Readiness: 8/10 — ALMOST READY The Path to Ready
Context for Reviewers
For human reviewer
Review history (2 prior reviews)
|
| const { last: pendingNode } = this.pendingMessageIds.push(pendingMessageId); | ||
| const localMetadata: ICellPendingLocalOpMetadata<T> = { | ||
| pendingMessageId, | ||
| pendingNode, |
There was a problem hiding this comment.
Deep Review: PR #12273 wired applyStashedOp through createLocalOpMetadata, so stashed ops push onto pendingMessageIds and emit ICellPendingLocalOpMetadata carrying a pendingNode. The new Pending op bookkeeping describe covers direct cell.set(...) submits and LIFO rollback but doesn't exercise stash → rehydrate → ACK.
This is narrower than it might look: applyStashedOp isn't an independent producer — it delegates to set()/delete() → submitCellMessage() → createLocalOpMetadata(), the same path the new tests already cover, so the "different list instance" failure mode is ruled out by construction. The residual risk is a future regression in stash-rehydrate sequencing through processCore's ICellPendingLocalOpMetadata cast surfacing only in production.
One applyStashedOp round-trip test in the new Pending op bookkeeping describe (drive a stashed set/delete op, then ACK, assert no 0x471 and ordering preserved) closes that gap cheaply. If you prefer to defer to the squash follow-up that first reads pendingNode, note that in the PR description so the gap is intentional.
Is there an existing stash → rehydrate → ACK regression elsewhere in the DDS test corpus that exercises applyStashedOp end-to-end through the pending-id queue? If yes, this is fully addressed by reference.
| cell.get(), | ||
| undefined, | ||
| "cell should be empty after rolling back all pending sets", | ||
| ); |
There was a problem hiding this comment.
Deep Review: MockContainerRuntime.rollback is declared optional (public rollback?(): void in packages/runtime/test-runtime-utils/src/mocks.ts:449-454), so containerRuntime.rollback?.() short-circuits to undefined if the method is ever removed/renamed. assert.doesNotThrow then succeeds vacuously, and the test's whole purpose — exercising the new pop()?.data rollback path — is silently lost; the subsequent cell.get() === undefined check would fail for the wrong reason.
Guard the invocation so the test fails loudly if the mock loses the method.
| ); | |
| assert( | |
| typeof containerRuntime.rollback === "function", | |
| "MockContainerRuntime.rollback must exist for this test", | |
| ); | |
| assert.doesNotThrow( | |
| () => containerRuntime.rollback!(), | |
| "rollback should pop pending ids in LIFO order", | |
| ); |
| attribution?: AttributionKey; | ||
| } | ||
|
|
||
| /** |
There was a problem hiding this comment.
Deep Review: ICellPendingLocalOpMetadata is documented as "Kept private to this module," and its only producer is createLocalOpMetadata on SharedCell<T> (cell.ts:319-329), which threads a concrete T. The <T = any> default has no caller and forces the // eslint-disable-next-line @typescript-eslint/no-explicit-any directly above it. The pattern on SharedCell<T = any> / ISharedCell<T = any> / ICellLocalOpMetadata<T = any> exists because those are public-shaped types with external callers; this new private extension has none, and wes-carlson explicitly pushed back on any in this file during PR #10776.
Drop the = any default and the eslint suppression:
interface ICellPendingLocalOpMetadata<T> extends ICellLocalOpMetadata<T> {
pendingNode: ListNode<number>;
}
Description
Replaces
SharedCell'spendingMessageIds: number[]with aDoublyLinkedList<number>from@fluidframework/core-utils/internal, and extendsICellLocalOpMetadatato carry the inserted list node.push()returns aListNodeRange; the metadata records the inserted node aspendingNode.pendingMessageIds.first?.datahead check, thenshift()— both O(1) instead ofArray.shift()'s O(n) re-index.pop()returns the trailing node; comparenode.datato the rolled-back message id.Why
Two wins, both independent of squash:
Array.shift()is O(n) — every ACK inSharedCellre-indexes the entire pending tail today.DoublyLinkedList.shift()is O(1).localOpMetadataenables any future per-op operation (rollback, arbitrary-position drop, ...) to be O(1) vialist.remove(node)instead of anindexOf+splicescan.Notes
Pure structural prep — no
reSubmitSquashed, no squash logic, no tests touched, no api-report changes.