feat(inkless): KC-72 Reconcile stale records after diskless switch#612
Open
viktorsomogyi wants to merge 1 commit into
Open
feat(inkless): KC-72 Reconcile stale records after diskless switch#612viktorsomogyi wants to merge 1 commit into
viktorsomogyi wants to merge 1 commit into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR introduces a “consolidation reconciliation” step to prevent consolidating diskless fetchers from appending onto stale local log data that may exist above the high watermark after a classic→diskless switch.
Changes:
- Added
ConsolidationReconcilerto reconcile/truncate local logs (when needed) before starting consolidation fetchers on leader/follower transitions. - Introduced a per-partition “safe pruning floor” used to gate diskless-log pruning and avoid unsafe prune requests.
- Expanded/adjusted unit tests around consolidating diskless fetch behavior and pruning eligibility.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| core/src/main/scala/kafka/server/ReplicaManager.scala | Wires the new reconciler into leader/follower transitions before starting consolidation fetchers (and includes an unrelated FileMerger disablement). |
| core/src/main/scala/io/aiven/inkless/consolidation/ConsolidationReconciler.scala | New reconciliation logic to determine fetch start offset and optionally truncate local logs based on control-plane log info. |
| core/src/main/scala/io/aiven/inkless/consolidation/ConsolidatedDisklessLogPruner.scala | Tightens prune eligibility by filtering switch-pending partitions and requiring a “safe prune offset” from Partition. |
| core/src/main/scala/kafka/cluster/Partition.scala | Adds safeConsolidationPruningFloor with getters/setters to gate pruning. |
| core/src/main/scala/io/aiven/inkless/consolidation/ReconciliationException.scala | Adds a reconciler-specific exception type. |
| core/src/main/scala/io/aiven/inkless/consolidation/DisklessLeaderEndPoint.scala | Removes an outdated TODO comment. |
| core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala | Adds/adjusts tests to ensure unified-log reads vs diskless path selection for consolidating diskless fetches. |
| core/src/test/scala/io/aiven/inkless/consolidation/ConsolidationReconcilerTest.scala | New unit tests for reconciliation/truncation decisions and failure handling. |
| core/src/test/scala/io/aiven/inkless/consolidation/ConsolidatedDisklessLogPrunerTest.scala | Updates tests for the new prune “safe floor” behavior and switch-pending filtering. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
30d1d15 to
82d0fa7
Compare
331715d to
5f47cc2
Compare
Base automatically changed from
svv/ts-unification-delete-tiered-diskless
to
main
May 27, 2026 19:20
d09af83 to
51dadb3
Compare
Comment on lines
+117
to
+131
| // LEO >= seal. This covers both the initial switch (LEO == seal, nothing consolidated | ||
| // yet) and resuming an already-progressed partition after a restart, leadership | ||
| // failover, or reassignment (the local log either kept its consolidated frontier or | ||
| // was rehydrated from tiered storage). In every case we resume from the current local | ||
| // LEO so we never re-consolidate or skip data the local log already holds. | ||
| // | ||
| // The prune floor is the higher of the seal and the current log start offset: | ||
| // - at first switch logStartOffset is still the classic prefix start, so the floor is | ||
| // the seal, which blocks pruning the diskless region until consolidation has tiered | ||
| // past the boundary; | ||
| // - on resume logStartOffset has advanced past the seal as consolidated segments were | ||
| // tiered and deleted, so it reflects real pruning progress. | ||
| val pruneFloor = math.max(seal, log.logStartOffset) | ||
| partition.maybeAdvanceConsolidationPruneFloor(pruneFloor) | ||
| ConsolidationStartState.Ready(log.logEndOffset) |
Contributor
Author
There was a problem hiding this comment.
This one is valid, but would require us to track epoch possibly or persist the consolidation frontier. Will defer to a follow-up PR.
51dadb3 to
58dfd23
Compare
When a partition switches to diskless, a replica can still hold uncommitted records above the high watermark. If the switch is quick and consolidation is enabled, the consolidation fetchers would start appending on top of that stale tail. This commit implements the following: - Truncate a newly switched partition's local log down to the just-committed seal (classic-to-diskless start offset). This runs after makeLeader/makeFollower and before any fetcher starts, so catch-up and consolidation always initialize from the truncated LEO. - Add ConsolidationReconciler to decide when a consolidating partition joins the consolidation fetcher, covering both the initial switch and resume after failover, restart, or reassignment. ReplicaManager now delegates consolidation fetcher startup to it. - Hand off from the classic ReplicaFetcher to consolidation once a follower reaches the seal, via the fetcher's self-eviction path. - Gate pruning of switched diskless partitions behind a prune floor and skip partitions whose switch is still pending; born-consolidated partitions keep pruning directly. - Transform the per-partition diskless watermark into a single monotonic safeConsolidationPruningFloor (gate + progress tracker), replacing lastAppliedDisklessLogStartOffset.
58dfd23 to
a366d0d
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When a partition switches to diskless, a replica can still hold uncommitted records above the high watermark. If the switch is quick and consolidation is enabled, the consolidation fetchers would start appending on top of that stale tail. This commit implements the following: