Batch MPP claims into single ChannelMonitorUpdate #4552

Open
Abeeujah wants to merge 1 commit into lightningdevkit:main from Abeeujah:parallelize-mppclaims

Conversation

@Abeeujah (Contributor) commented Apr 11, 2026

When multiple parts of an MPP payment arrive on the same channel, claim them via the holding cell so they are released as one combined ChannelMonitorUpdate (every PaymentPreimage step plus a single commitment_signed) instead of one round-trip per part.

claim_payment_internal now forces the holding-cell path whenever there is more than one source, records each touched channel, and flushes them after queueing. The single-source case keeps the existing inline fast path.

FundedChannel::get_update_fulfill_htlc gains a force_holding_cell flag that suppresses the per-claim ChannelMonitorUpdate so the flush can emit one combined update. When the holding cell cannot be freed immediately (awaiting RAA, monitor update in progress, disconnected, quiescent, ...), build_preimage_only_monitor_update writes a one-step preimage update so the preimage stays durable across restarts; the commitment_signed follows when the holding cell is naturally flushed.
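
In other words, the flow is queue-then-flush. A self-contained toy of that shape (HoldingCell and batch_claim are illustrative stand-ins, not LDK APIs):

use std::collections::BTreeMap;

// Toy holding cell; a stand-in for illustration, not an LDK type.
struct HoldingCell {
    queued_preimages: Vec<[u8; 32]>,
}

// Queue every MPP part for its channel, then flush each touched channel
// once, so all preimage steps land in a single combined update per channel.
fn batch_claim(parts: Vec<(u64, [u8; 32])>, channels: &mut BTreeMap<u64, HoldingCell>) {
    let mut touched: Vec<u64> = Vec::new();
    for (chan_id, preimage) in parts {
        // Queue only -- no per-claim monitor update is built at this point.
        channels
            .entry(chan_id)
            .or_insert(HoldingCell { queued_preimages: Vec::new() })
            .queued_preimages
            .push(preimage);
        if !touched.contains(&chan_id) {
            touched.push(chan_id);
        }
    }
    for chan_id in touched {
        // One flush per channel: every queued preimage step plus a single
        // commitment_signed equivalent.
        let cell = channels.get_mut(&chan_id).unwrap();
        println!("channel {}: {} claims in one combined update", chan_id, cell.queued_preimages.len());
        cell.queued_preimages.clear();
    }
}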

Update test_single_channel_multiple_mpp, test_simple_mpp, and auto_retry_partial_failure to reflect the new single-round-trip behavior, and drop the threaded reproducer that targeted the old per-claim path.

Tests have been updated to reflect this new batching of MPP claims:

  • test_single_channel_multiple_mpp
  • auto_retry_partial_failure
  • test_keysend_dup_hash_partial_mpp

closes #3986

@ldk-reviews-bot commented Apr 11, 2026

👋 I see @jkczyz was un-assigned.
If you'd like another reviewer assignment, please click here.

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz April 11, 2026 17:52
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +10006 to +10049
// Register a completion action and RAA blocker for each
// successfully claimed (non-duplicate) HTLC.
for (idx, value_opt) in htlc_value_msat.iter().enumerate() {
    let this_mpp_claim =
        pending_mpp_claim_ptr_opt.as_ref().map(|pending_mpp_claim| {
            let claim_ptr =
                PendingMPPClaimPointer(Arc::clone(pending_mpp_claim));
            (counterparty_node_id, chan_id, claim_ptr)
        });
    let raa_blocker =
        pending_mpp_claim_ptr_opt.as_ref().map(|pending_claim| {
            RAAMonitorUpdateBlockingAction::ClaimedMPPPayment {
                pending_claim: PendingMPPClaimPointer(Arc::clone(
                    pending_claim,
                )),
            }
        });
    let definitely_duplicate = value_opt.is_none();
    debug_assert!(
        !definitely_duplicate || idx > 0,
        "First HTLC in batch should not be a duplicate"
    );
    if !definitely_duplicate {
        let action = MonitorUpdateCompletionAction::PaymentClaimed {
            payment_hash,
            pending_mpp_claim: this_mpp_claim,
        };
        log_trace!(logger, "Tracking monitor update completion action for batch claim: {:?}", action);
        peer_state
            .monitor_update_blocked_actions
            .entry(chan_id)
            .or_insert(Vec::new())
            .push(action);
    }
    if let Some(raa_blocker) = raa_blocker {
        if !definitely_duplicate {
            peer_state
                .actions_blocking_raa_monitor_updates
                .entry(chan_id)
                .or_insert_with(Vec::new)
                .push(raa_blocker);
        }
    }
}
Collaborator

Bug: This loop registers one PaymentClaimed completion action and one ClaimedMPPPayment RAA blocker per non-duplicate HTLC, but they all map to a single combined ChannelMonitorUpdate. When that monitor update completes, handle_monitor_update_completion_actions processes all N actions simultaneously:

  1. The first PaymentClaimed action's retain loop finds all N RAA blockers (they share the same PendingMPPClaimPointer via Arc::ptr_eq). Each blocker is processed in sequence—after the first one transitions the channel in channels_without_preimage → channels_with_preimage, the remaining N-1 blockers each re-check channels_without_preimage.is_empty() and push duplicates into freed_channels.

  2. Result: N entries in freed_channels, causing handle_monitor_update_release to be called N times for the same channel. This is redundant in the happy path but could prematurely unblock other channels' blocked monitor updates in edge cases (multi-channel MPP where only this channel has batched parts).

  3. The remaining N-1 PaymentClaimed actions find no blockers (already drained by the first action) and no pending_claiming_payments entry (already removed by the first action), so they're no-ops—but they still churn through the completion handler.

Fix: Register exactly ONE PaymentClaimed action and ONE RAA blocker for the entire batch, matching the single ChannelMonitorUpdate. Something like:

// One action + one blocker for the entire batch
let this_mpp_claim = pending_mpp_claim_ptr_opt.as_ref().map(|pending_mpp_claim| {
    (counterparty_node_id, chan_id, PendingMPPClaimPointer(Arc::clone(pending_mpp_claim)))
});
let raa_blocker = pending_mpp_claim_ptr_opt.as_ref().map(|pending_claim| {
    RAAMonitorUpdateBlockingAction::ClaimedMPPPayment {
        pending_claim: PendingMPPClaimPointer(Arc::clone(pending_claim)),
    }
});
let action = MonitorUpdateCompletionAction::PaymentClaimed {
    payment_hash,
    pending_mpp_claim: this_mpp_claim,
};
peer_state.monitor_update_blocked_actions.entry(chan_id).or_insert(Vec::new()).push(action);
if let Some(raa_blocker) = raa_blocker {
    peer_state.actions_blocking_raa_monitor_updates.entry(chan_id).or_insert_with(Vec::new).push(raa_blocker);
}

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +10065 to +10067
UpdateFulfillsCommitFetch::AllDuplicateClaims {} => {
debug_assert!(false, "We shouldn't claim duplicatively from a payment");
},
Collaborator

Bug: The AllDuplicateClaims case only fires a debug_assert but doesn't handle startup replay. Compare with the single-HTLC path in claim_mpp_part (the UpdateFulfillCommitFetch::DuplicateClaim branch, ~lines 9768-9860), which does significant work during startup:

  1. Re-registers RAA blockers that may not yet be present if the monitor update hadn't completed before shutdown (lines 9784-9789).
  2. Queues or immediately processes the PaymentClaimed completion action so the PaymentClaimed event can be generated during replay (lines 9791-9855).

During startup replay, all claims in a batch can be duplicates if the ChannelMonitorUpdate with the preimages had already been applied before shutdown but the PaymentClaimed event wasn't generated yet. In that scenario, this code path silently drops the completion action, potentially causing:

  • The PaymentClaimed event to never be emitted.
  • RAA blockers to not be re-registered, allowing premature RAA processing.

Comment thread lightning/src/ln/channel.rs Outdated
Comment on lines +7753 to +7754
if combined_monitor_update.update_id == 0 {
    combined_monitor_update.update_id = monitor_update.update_id;
Collaborator

Nit: Using update_id == 0 as a sentinel to detect the first NewClaim works because valid monitor update IDs start at 1, but this is an implicit invariant. Consider using an Option<u64> or a dedicated first_update_id variable to make this more robust and self-documenting.
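
For illustration, a minimal self-contained version of that suggestion (MonitorUpdate here is a toy stand-in, not LDK's ChannelMonitorUpdate):

struct MonitorUpdate {
    update_id: u64,
}

// Track the first ID seen with an Option rather than a `0` sentinel, so the
// "no update seen yet" state is explicit in the type.
fn first_update_id(updates: &[MonitorUpdate]) -> Option<u64> {
    let mut first_update_id: Option<u64> = None;
    for upd in updates {
        // get_or_insert records the first ID once and leaves it untouched on
        // later iterations.
        first_update_id.get_or_insert(upd.update_id);
    }
    first_update_id
}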

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +9440 to +9498
for (_, _, group) in grouped_sources {
    if group.len() == 1 {
        // Single HTLC on this channel, use existing path.
        let htlc = group.into_iter().next().unwrap();
        let this_mpp_claim = pending_mpp_claim_ptr_opt.as_ref().map(|pending_mpp_claim| {
            let counterparty_id = htlc.prev_hop.counterparty_node_id.expect("Prior to upgrading to LDK 0.1, all pending HTLCs forwarded by LDK 0.0.123 or before must be resolved. It appears at least one claimable payment was not resolved. Please downgrade to LDK 0.0.125 and resolve the HTLC by claiming the payment prior to upgrading.");
            let claim_ptr = PendingMPPClaimPointer(Arc::clone(pending_mpp_claim));
            (counterparty_id, htlc.prev_hop.channel_id, claim_ptr)
        });
        let raa_blocker = pending_mpp_claim_ptr_opt.as_ref().map(|pending_claim| {
            RAAMonitorUpdateBlockingAction::ClaimedMPPPayment {
                pending_claim: PendingMPPClaimPointer(Arc::clone(pending_claim)),
            }
        });

        // Create new attribution data as the final hop. Always report a zero hold time, because reporting a
        // non-zero value will not make a difference in the penalty that may be applied by the sender. If there
        // is a phantom hop, we need to double-process.
        let attribution_data =
            if let Some(phantom_secret) = htlc.prev_hop.phantom_shared_secret {
                let attribution_data =
                    process_fulfill_attribution_data(None, &phantom_secret, 0);
                Some(attribution_data)
            } else {
                None
            };

        let attribution_data = process_fulfill_attribution_data(
            attribution_data,
            &htlc.prev_hop.incoming_packet_shared_secret,
            0,
        );

        self.claim_funds_from_hop(
            htlc.prev_hop,
            payment_preimage,
            payment_info.clone(),
            Some(attribution_data),
            |_, definitely_duplicate| {
                debug_assert!(
                    !definitely_duplicate,
                    "We shouldn't claim duplicatively from a payment"
                );
                (
                    Some(MonitorUpdateCompletionAction::PaymentClaimed {
                        payment_hash,
                        pending_mpp_claim: this_mpp_claim,
                    }),
                    raa_blocker,
                )
            },
        );
    } else {
        // Multiple HTLCs on the same channel, batch into a single commitment.
        self.claim_batch_funds_from_channel(
            group,
            payment_preimage,
            payment_hash,
            payment_info.clone(),
            &pending_mpp_claim_ptr_opt,
        );
    }
Collaborator

The grouping logic uses group.len() == 1 to decide between the single-HTLC path and the batch path. For a single-source payment (sources.len() == 1), pending_mpp_claim_ptr_opt is None (set at line 9407). In that case the single HTLC takes the group.len() == 1 branch, which is correct.

However, for a multi-source MPP where each source happens to be on a different channel (no channel has >1 HTLC), every group has len==1, so ALL HTLCs go through the single-HTLC path. This means the batch path is only exercised when ≥2 parts of the same MPP arrive on the same channel—which is likely the intent, but worth documenting explicitly in a comment for future readers, since the grouping loop above suggests it handles all sources.

Comment thread lightning/src/ln/channel.rs Outdated
Comment on lines +7779 to +7801
} else {
    let blocked_upd = self.context.blocked_monitor_updates.get(0);
    let new_mon_id = blocked_upd
        .map(|upd| upd.update.update_id)
        .unwrap_or(combined_monitor_update.update_id);
    combined_monitor_update.update_id = new_mon_id;
    for held_update in self.context.blocked_monitor_updates.iter_mut() {
        held_update.update.update_id += 1;
    }

    // Reset latest_monitor_update_id before building a new commitment so its ID is consecutive.
    self.context.latest_monitor_update_id = self
        .context
        .blocked_monitor_updates
        .last()
        .map(|upd| upd.update.update_id)
        .unwrap_or(combined_monitor_update.update_id);

    if any_not_blocked {
        debug_assert!(false, "If there is a pending blocked monitor we should have MonitorUpdateInProgress set");
        let update = self.build_commitment_no_status_check(logger);
        self.context.blocked_monitor_updates.push(PendingChannelMonitorUpdate { update });
    }
Collaborator

The else branch handles two distinct cases conflated together:

  1. !release_cs_monitor (blocked monitor updates exist) — needs to shift IDs
  2. release_cs_monitor && !any_not_blocked (no blocked updates, but all claims were put in holding cell) — no shifts needed

For case 2, blocked_monitor_updates is empty, so the for-loop and the last() call are no-ops, and latest_monitor_update_id is set to combined_monitor_update.update_id. This works correctly but is non-obvious. A comment or separate handling would make the two cases clearer.

Also: self.context.blocked_monitor_updates.get(0) should use .first() per idiomatic Rust.
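
For example, a sketch of that separation with stand-in types (not channel.rs's actual context):

struct Update {
    update_id: u64,
}
struct Held {
    update: Update,
}

// Handle the two cases explicitly: shift IDs only when blocked updates
// exist, and use `.first()` instead of `.get(0)`.
fn renumber(blocked: &mut [Held], combined: &mut Update, latest_id: &mut u64) {
    if let Some(head_id) = blocked.first().map(|h| h.update.update_id) {
        // Case 1: the combined update takes the head ID and every held
        // update shifts up by one so IDs stay consecutive.
        combined.update_id = head_id;
        for held in blocked.iter_mut() {
            held.update.update_id += 1;
        }
        *latest_id = blocked.last().unwrap().update.update_id;
    } else {
        // Case 2: nothing blocked; no shifts are needed and the combined
        // update keeps its own ID.
        *latest_id = combined.update_id;
    }
}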

Comment on lines +4726 to +4729
nodes[7].node.peer_disconnected(nodes[2].node.get_our_node_id());
nodes[7].node.peer_disconnected(nodes[3].node.get_our_node_id());
nodes[7].node.peer_disconnected(nodes[4].node.get_our_node_id());
nodes[7].node.peer_disconnected(nodes[5].node.get_our_node_id());
Collaborator

After delivering all 6 update_fulfill_htlc messages, the test asserts 6 PaymentForwarded events but doesn't validate their contents (e.g. which upstream channel each forward corresponds to, or fee amounts). The old test checked expect_payment_forwarded! with specific source nodes and fee values (Some(1000)). Consider adding at least a fee check to ensure the batch claim correctly reports forwarding fees.
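
A hedged sketch of such a check (the total_fee_earned_msat field name matches recent LDK Event::PaymentForwarded variants; adjust to the version under test):

// Assumes `forwarded_events` holds the six PaymentForwarded events
// collected above.
for event in forwarded_events {
    if let Event::PaymentForwarded { total_fee_earned_msat, .. } = event {
        // Each forward should report the expected 1000 msat routing fee.
        assert_eq!(total_fee_earned_msat, Some(1000));
    } else {
        panic!("Expected PaymentForwarded event");
    }
}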

@ldk-claude-review-bot (Collaborator) commented Apr 11, 2026

Review Summary

Issues Found

Bug (medium severity):

  • channelmanager.rs:9901-9929 — Each queued HTLC in a batch registers its own PaymentClaimed completion action and ClaimedMPPPayment RAA blocker, but they all map to a single combined ChannelMonitorUpdate. This causes N redundant handle_monitor_update_release calls when the update completes and could prematurely unblock downstream channels in multi-channel MPP edge cases. Should be exactly ONE action + ONE blocker per combined update.

Missing test coverage:

  • channel.rs:7875-7917 — build_preimage_only_monitor_update (the deferred flush path when the channel can't generate a new commitment) has zero test coverage. All existing tests exercise only the happy path where the holding cell is immediately flushed.

Nit:

  • chanmon_update_fail_tests.rs:4661 — Typo: "ariving" → "arriving"

Prior Review Issues Status

All previously flagged critical issues (missing monitor_updating_paused panic, multiple actions per batch, all-duplicates startup replay) appear to have been addressed in this revision, except for the N-actions-per-batch issue which remains.

Cross-cutting Observations

  • The PR description claims test_keysend_dup_hash_partial_mpp was updated, but no changes to that test appear in the diff.
  • The core batching logic (claim loop → holding cell → flush) is correct for the main code paths. The ChannelMonitor preimage idempotency guarantees make the approach safe even when redundant preimage steps are produced.
  • Lock ordering is correct: per_peer_state read lock and peer_state_lock are properly acquired and released between the claim loop and the flush loop, avoiding deadlocks.

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +10025 to +10026
// Register a completion action and RAA blocker for each
// successfully claimed (non-duplicate) HTLC.
Collaborator

Nit: This comment says "for each successfully claimed (non-duplicate) HTLC" but the code registers exactly one completion action and one RAA blocker for the entire batch (i.e., per combined ChannelMonitorUpdate). This is actually the correct behavior — one action per monitor update — but the comment is misleading and could confuse future readers.

Suggested change
// Register a completion action and RAA blocker for each
// successfully claimed (non-duplicate) HTLC.
// Register a single completion action and RAA blocker for the
// combined ChannelMonitorUpdate covering all claims in this batch.

Comment thread lightning/src/ln/channel.rs Outdated
Comment on lines +7774 to +7777
let mut additional_update = self.build_commitment_no_status_check(logger);
// build_commitment_no_status_check may bump latest_monitor_id but we want them
// to be strictly increasing by one, so reset it here.
self.context.latest_monitor_update_id = combined_monitor_update.update_id;
Collaborator

Nit: This comment was copied from the single-HTLC version and only mentions build_commitment_no_status_check, but in the batch case the N individual get_update_fulfill_htlc calls also bumped latest_monitor_update_id. The reset here discards N+1 ID increments (N preimage bumps + 1 commitment bump), not just the commitment bump. Consider updating the comment for accuracy:

Suggested change
let mut additional_update = self.build_commitment_no_status_check(logger);
// build_commitment_no_status_check may bump latest_monitor_id but we want them
// to be strictly increasing by one, so reset it here.
self.context.latest_monitor_update_id = combined_monitor_update.update_id;
// The N calls to get_update_fulfill_htlc and build_commitment_no_status_check
// each bumped latest_monitor_update_id, but we merged everything into a single
// ChannelMonitorUpdate with the first ID, so reset to that ID.
self.context.latest_monitor_update_id = combined_monitor_update.update_id;

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +10102 to +10148
// Channel is closed, fall back to per-HTLC claiming against the closed channel monitor.
mem::drop(peer_state_lock);
mem::drop(per_peer_state);
for htlc in htlcs {
    let this_mpp_claim = pending_mpp_claim_ptr_opt.as_ref().map(|pending_mpp_claim| {
        let claim_ptr = PendingMPPClaimPointer(Arc::clone(pending_mpp_claim));
        (counterparty_node_id, htlc.prev_hop.channel_id, claim_ptr)
    });
    let raa_blocker = pending_mpp_claim_ptr_opt.as_ref().map(|pending_claim| {
        RAAMonitorUpdateBlockingAction::ClaimedMPPPayment {
            pending_claim: PendingMPPClaimPointer(Arc::clone(pending_claim)),
        }
    });

    let attribution_data = if let Some(phantom_secret) = htlc.prev_hop.phantom_shared_secret
    {
        Some(process_fulfill_attribution_data(None, &phantom_secret, 0))
    } else {
        None
    };

    let attribution_data = process_fulfill_attribution_data(
        attribution_data,
        &htlc.prev_hop.incoming_packet_shared_secret,
        0,
    );

    self.claim_funds_from_hop(
        htlc.prev_hop,
        payment_preimage,
        payment_info.clone(),
        Some(attribution_data),
        |_, definitely_duplicate| {
            debug_assert!(
                !definitely_duplicate,
                "We shouldn't claim duplicatively from a payment"
            );
            (
                Some(MonitorUpdateCompletionAction::PaymentClaimed {
                    payment_hash,
                    pending_mpp_claim: this_mpp_claim,
                }),
                raa_blocker,
            )
        },
    );
}
Collaborator

This closed-channel fallback calls claim_funds_from_hop per HTLC, which registers a separate PaymentClaimed action + ClaimedMPPPayment RAA blocker for each HTLC. Since all N HTLCs are on the same channel, PendingMPPClaim.channels_without_preimage has only ONE entry for this channel. When the first monitor update completes, its PaymentClaimed action's retain loop in handle_monitor_update_completion_actions processes ALL N blockers (they share the same PendingMPPClaimPointer via Arc::ptr_eq). The first blocker empties channels_without_preimage, then all N blockers push duplicate entries to freed_channels (since is_empty() is true for each).

This causes handle_monitor_update_release to be called N times redundantly for the same channel. This is a pre-existing behavior (the original per-HTLC path had the same issue), but since you're explicitly handling the batch case for open channels, you could apply the same "single action + single blocker" pattern here for consistency and correctness:

// Channel is closed. Register one combined action + blocker, then claim each HTLC individually.
let this_mpp_claim = pending_mpp_claim_ptr_opt.as_ref().map(|pending_mpp_claim| {
    (counterparty_node_id, chan_id, PendingMPPClaimPointer(Arc::clone(pending_mpp_claim)))
});
let raa_blocker = pending_mpp_claim_ptr_opt.as_ref().map(|pending_claim| {
    RAAMonitorUpdateBlockingAction::ClaimedMPPPayment {
        pending_claim: PendingMPPClaimPointer(Arc::clone(pending_claim)),
    }
});
// claim_funds_from_hop for each, but only the first carries the action/blocker

Not blocking, but worth considering for a follow-up.

@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 88d62c4 to 5880633 Compare April 12, 2026 11:08
@ldk-reviews-bot

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@jkczyz jkczyz requested review from TheBlueMatt and removed request for jkczyz April 14, 2026 22:09
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 5880633 to d801ec9 Compare April 16, 2026 15:08
@Abeeujah (Contributor, Author)

let _ = htlc_value_msat

It's intentionally left unused right now, pending a design decision on the msat values that field holds. In the single-claim case there's a completion_action closure that computes with the value; here it's a batch claim, so how batch claims should report values is still open ended:

  • Do we sum all the values, since they were claimed in a single swoop, and pass the total to completion_action (see the sketch below)?
  • Should a batch-claim event be implemented, as there is for a single claim?

This is why the field exists; the absence of a decision on which direction to take is why it is intentionally left unused.
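
If the first option were taken, a minimal sketch (assuming the field's shape is one Option<u64> per part, with None marking a duplicate):

// Sum the per-part values of a batch claim; duplicate parts (None)
// contribute nothing to the total handed to the completion action.
fn batch_total_msat(htlc_value_msat: &[Option<u64>]) -> u64 {
    htlc_value_msat.iter().flatten().sum()
}

fn main() {
    let parts = [Some(40_000), Some(60_000), None];
    assert_eq!(batch_total_msat(&parts), 100_000);
}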

codecov bot commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 88.33333% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.13%. Comparing base (44828f7) to head (dddd924).
⚠️ Report is 6 commits behind head on main.

Files with missing lines             Patch %   Lines
lightning/src/ln/channel.rs          79.68%    12 Missing and 1 partial ⚠️
lightning/src/ln/channelmanager.rs   93.10%    5 Missing and 3 partials ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main    #4552    +/-   ##
========================================
  Coverage   86.12%   86.13%            
========================================
  Files         157      157            
  Lines      108922   109117   +195     
  Branches   108922   109117   +195     
========================================
+ Hits        93812    93985   +173     
- Misses      12495    12510    +15     
- Partials     2615     2622     +7     
Flag Coverage Δ
tests 86.13% <88.33%> (+<0.01%) ⬆️

@ldk-reviews-bot

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

🔔 2nd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

🔔 3rd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.


@TheBlueMatt TheBlueMatt left a comment


Hmm, would this not be simpler by using the holding cell instead? When we're failing, we call queue_fail_htlc which just shoves the failure into the holding cell then call check_free_peer_holding_cells to release them all at once. Rather than rebuilding that logic for claims, we should be able to do something similar. We'd have to handle the monitor updates with the preimages a bit differently, but not drastically so.

@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from d801ec9 to 5992824 Compare April 27, 2026 14:00
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +10262 to +10267
} else {
    // Deferred flush: build a preimage-only ChannelMonitorUpdate so the
    // preimage is durable now. The commitment_signed will be sent later
    // when the holding cell is naturally flushed.
    chan.build_preimage_only_monitor_update(payment_preimage, payment_info)
};
Collaborator

Bug: Panic in production — missing monitor_updating_paused call in the deferred-flush path.

When maybe_free_holding_cell_htlcs returns None (channel can't generate a new commitment due to AWAITING_REMOTE_REVOKE, PEER_DISCONNECTED, LOCAL_STFU_SENT, or QUIESCENT), build_preimage_only_monitor_update creates the update but does NOT call chan.monitor_updating_paused(...), so MONITOR_UPDATE_IN_PROGRESS is never set on the channel.

The subsequent handle_new_monitor_update (line 10297) submits the update to the chain monitor. If persistence is synchronous (common in simple setups) and there are no other in-flight updates for this channel, all_updates_complete becomes true, triggering try_resume_channel_post_monitor_update → monitor_updating_restored, which hits a hard assert! at channel.rs:9839:

assert!(self.context.channel_state.is_monitor_update_in_progress()); // PANICS

Compare with the single-HTLC path in get_update_fulfill_htlc_and_commit (line 7690) which always calls monitor_updating_paused(false, !update_blocked, false, ...) regardless of whether a commitment could be generated, ensuring MONITOR_UPDATE_IN_PROGRESS is set before the update is submitted.

Fix: After build_preimage_only_monitor_update, call:

chan.monitor_updating_paused(false, false, false, Vec::new(), Vec::new(), Vec::new(), &&logger);

Reproduction scenario: Node receives an MPP payment with ≥2 parts on the same channel. Before calling claim_funds, the channel enters AWAITING_REMOTE_REVOKE (e.g., a prior commitment exchange just completed with the peer's commitment_signed but the peer hasn't sent revoke_and_ack yet). claim_funds takes the batch path, can_generate_new_commitment() returns false, deferred flush path is taken, and the preimage-only update is submitted. With synchronous persistence → panic.

@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 5992824 to f466457 Compare April 27, 2026 15:44
Comment thread lightning/src/ln/channelmanager.rs Outdated
// Single HTLC on this channel, use existing path.
let htlc = group.into_iter().next().unwrap();
let this_mpp_claim = pending_mpp_claim_ptr_opt.as_ref().map(|pending_mpp_claim| {
    let counterparty_id = htlc.mpp_part.prev_hop.counterparty_node_id.expect("Prior to upgrading to LDK 0.1, all pending HTLCs forwarded by LDK 0.0.123 or before must be resolved. It appears at least one claimable payment was not resolved. Please downgrade to LDK 0.0.125 and resolve the HTLC by claiming the payment prior to upgrading.");
Collaborator

Nit: The counterparty_node_id expect message is duplicated from line 9604 (where it was already unwrapped for grouping). Since the grouping loop above already unwraps counterparty_node_id for every HTLC, by the time we reach the group.len() == 1 branch we know it's Some. The second expect at line 9621 is redundant — you could just reuse the counterparty_id from the group tuple.

Note: the same applies to claim_batch_funds_from_channel at line 10139 — the counterparty_node_id was already validated during grouping.

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +10357 to +10358
let first_mpp_claim = this_mpp_claim.take();
let first_raa_blocker = raa_blocker.take();
Collaborator

The closed-channel fallback uses .take() so only the first HTLC in the batch gets the MonitorUpdateCompletionAction and RAA blocker (2nd+ get (None, None)). This is functionally correct — all HTLCs share the same channel, so one action + one blocker is sufficient — but it means each subsequent HTLC creates and persists a ChannelMonitorUpdate with no associated completion action.

Consider adding a brief comment explaining the .take() intent: "Only the first HTLC in the batch carries the completion action and RAA blocker; subsequent HTLCs only persist the preimage for safety."

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +10246 to +10249
// Inject payment_info into the first PaymentPreimage step that
// matches our preimage (the holding cell may contain unrelated
// claims for other preimages from prior queueing).
for step in monitor_update.updates.iter_mut() {
Collaborator

This payment_info injection only fills the first matching PaymentPreimage step in the combined monitor update (then breaks). When there are N HTLCs in the batch, there are N PaymentPreimage steps for the same preimage, and only the first gets payment_info. This is correct because ChannelMonitor::provide_payment_preimage uses and_modify().or_insert_with() and will store payment_info from the first step, while subsequent steps with None are no-ops.

Worth a brief comment explaining this, since a reader might expect all N steps to carry payment_info:

Suggested change
// Inject payment_info into the first PaymentPreimage step that
// matches our preimage (the holding cell may contain unrelated
// claims for other preimages from prior queueing).
for step in monitor_update.updates.iter_mut() {
// Inject payment_info into the first PaymentPreimage step that
// matches our preimage. Only one step needs it — the ChannelMonitor
// stores payment_info per payment_hash (not per HTLC), so subsequent
// preimage steps with None are idempotent no-ops.
for step in monitor_update.updates.iter_mut() {

Comment thread lightning/src/ln/channel.rs Outdated
Comment on lines +7792 to +7798
&self
    .context
    .pending_inbound_htlcs
    .iter()
    .find(|h| h.htlc_id == htlc_id_arg)
    .map(|h| h.payment_hash)
    .unwrap_or(PaymentHash([0u8; 32])),
Collaborator

Nit: Redundant HTLC lookup — the htlc variable from the find at line 7730 already holds the matching HTLC, but this log statement does a second find to get the payment_hash. You can just use htlc.payment_hash from the earlier binding.

(The earlier htlc borrow does end at line 7763, but you could extract the payment_hash before the holding-cell loop, or adjust the scoping slightly.)
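
A sketch of that restructure with toy types (not channel.rs's actual fields):

struct InboundHtlc {
    htlc_id: u64,
    payment_hash: [u8; 32],
}

// Bind the payment hash once, before the holding-cell handling, so logging
// later doesn't need a second `find` over pending_inbound_htlcs.
fn fulfill_and_log(pending_inbound_htlcs: &[InboundHtlc], htlc_id_arg: u64) {
    let payment_hash = pending_inbound_htlcs
        .iter()
        .find(|h| h.htlc_id == htlc_id_arg)
        .map(|h| h.payment_hash)
        .unwrap_or([0u8; 32]);
    // ... holding-cell loop would run here; `payment_hash` stays in scope ...
    println!("Freeing holding cell claim for payment hash {:02x?}", payment_hash);
}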

Comment thread lightning/src/ln/channel.rs Outdated
/// will be produced when the holding cell is freed.
///
/// Returns `true` if the claim was queued, `false` if it was a duplicate.
pub fn queue_claim_htlc<L: Logger>(
Collaborator

Please don't just add a redundant function. Refactor the existing get_update_fulfill_htlc to support pushing into the holding cell.

Comment thread lightning/src/ln/channelmanager.rs Outdated
}

for (_, _, group) in grouped_sources {
    if group.len() == 1 {
Collaborator

Huh? Why is this a separate branch? We should have one common set of code that works for both 1-len and N-len. It's also not clear to me why we have to group at the start; we should be able to just iterate and push each claim into the holding cell for each channel, then free them all at the end.

When multiple parts of an MPP payment arrive on the same channel, claim
them via the holding cell so they are released as one combined
ChannelMonitorUpdate (every PaymentPreimage step plus a single
commitment_signed) instead of one round-trip per part.

claim_payment_internal now forces the holding-cell path whenever there
is more than one source, records each touched channel, and flushes them
after queueing. The single-source case keeps the existing inline fast
path.

FundedChannel::get_update_fulfill_htlc gains a force_holding_cell flag
that suppresses the per-claim ChannelMonitorUpdate so the flush can
emit one combined update. When the holding cell cannot be freed
immediately (awaiting RAA, monitor update in progress, disconnected,
quiescent, ...), build_preimage_only_monitor_update writes a one-step
preimage update so the preimage stays durable across restarts; the
commitment_signed follows when the holding cell is naturally flushed.

Update test_single_channel_multiple_mpp, test_simple_mpp, and
auto_retry_partial_failure to reflect the new single-round-trip
behavior, and drop the threaded reproducer that targeted the old
per-claim path.
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from f466457 to dddd924 Compare May 13, 2026 18:26
// `ChannelMonitorUpdate` until after it completed. See the commit which introduced this test
// for more info.
// Test that when an MPP payment has many parts ariving on the same channel, all of them are
// claimed in a single commitment update rather than requiring a round-trip per claim.
Collaborator

Nit: typo — "ariving" → "arriving"

Suggested change
// Test that when an MPP payment has many parts ariving on the same channel, all of them are
// Test that when an MPP payment has many parts arriving on the same channel, all of them are

Comment on lines 9901 to +9929
match fulfill_res {
    UpdateFulfillCommitFetch::Queued { htlc_value_msat } => {
        // The claim was queued into the holding cell so that this and any
        // other concurrent claims on this channel will be released as a
        // single combined `ChannelMonitorUpdate` later (see
        // `claim_payment_internal`'s post-iteration flush). Just record the
        // post-update completion actions and the RAA blocker; the actual
        // monitor update will be applied when the holding cell is flushed.
        let (action_opt, raa_blocker_opt) =
            completion_action(Some(htlc_value_msat), false);
        if let Some(action) = action_opt {
            log_trace!(
                logger,
                "Tracking monitor update completion action (queued): {:?}",
                action
            );
            peer_state
                .monitor_update_blocked_actions
                .entry(chan_id)
                .or_insert(Vec::new())
                .push(action);
        }
        if let Some(raa_blocker) = raa_blocker_opt {
            peer_state
                .actions_blocking_raa_monitor_updates
                .entry(chan_id)
                .or_insert_with(Vec::new)
                .push(raa_blocker);
        }
Collaborator

Bug: Each queued HTLC registers its own PaymentClaimed completion action and ClaimedMPPPayment RAA blocker, but they all map to a single combined ChannelMonitorUpdate produced by the subsequent flush.

When that combined update completes and handle_monitor_update_completion_actions processes the N actions:

  1. The first PaymentClaimed action's retain loop finds all N RAA blockers (they share the same PendingMPPClaimPointer via Arc::ptr_eq). After the first blocker empties channels_without_preimage, the remaining N-1 blockers each redundantly push to freed_channels.
  2. The remaining N-1 PaymentClaimed actions find no blockers left (already drained) and no pending_claiming_payments entry (already removed), so they're expensive no-ops.

This isn't a crash but causes N redundant handle_monitor_update_release calls for the same channel, which is wasteful and could prematurely unblock downstream channels in multi-channel MPP edge cases.

Fix: Register exactly ONE action + ONE RAA blocker for the entire batch (matching the single ChannelMonitorUpdate), rather than one per queued HTLC. Move the action/blocker registration to flush_pending_holding_cell_claim (or to a single post-loop step) instead of registering inside each iteration's Queued handler.

Comment on lines +7875 to +7917
pub fn build_preimage_only_monitor_update<L: Logger>(
    &mut self, payment_preimage: PaymentPreimage, payment_info: Option<PaymentClaimDetails>,
    logger: &L,
) -> ChannelMonitorUpdate {
    self.context.latest_monitor_update_id += 1;
    let mut monitor_update = ChannelMonitorUpdate {
        update_id: self.context.latest_monitor_update_id,
        updates: vec![ChannelMonitorUpdateStep::PaymentPreimage {
            payment_preimage,
            payment_info,
        }],
        channel_id: Some(self.context.channel_id()),
    };

    // If there are blocked monitor updates queued ahead of us, the new preimage update
    // has to fly before them, so insert it at the head and bump their IDs (mirrors the
    // logic in `get_update_fulfill_htlc_and_commit`).
    if !self.context.blocked_monitor_updates.is_empty() {
        let new_mon_id = self.context.blocked_monitor_updates[0].update.update_id;
        monitor_update.update_id = new_mon_id;
        for held_update in self.context.blocked_monitor_updates.iter_mut() {
            held_update.update.update_id += 1;
        }
    }

    // Mark the channel as having a monitor update in progress so that when the update is
    // submitted to the chain monitor and persistence completes (synchronously or otherwise),
    // `monitor_updating_restored` doesn't trip its `MONITOR_UPDATE_IN_PROGRESS` assertion.
    // This mirrors the `monitor_updating_paused` call made in `free_holding_cell_htlcs`
    // (the `Some` branch of `maybe_free_holding_cell_htlcs`) and in
    // `get_update_fulfill_htlc_and_commit`.
    self.monitor_updating_paused(
        false,
        false,
        false,
        Vec::new(),
        Vec::new(),
        Vec::new(),
        logger,
    );

    monitor_update
}
Collaborator

Missing test coverage: build_preimage_only_monitor_update is only called from the deferred flush path in flush_pending_holding_cell_claim (when maybe_free_holding_cell_htlcs returns None), but none of the new or updated tests exercise this path. All three tests (test_single_channel_multiple_mpp, auto_retry_partial_failure, test_keysend_dup_hash_partial_mpp) have the channel in a state where can_generate_new_commitment() is true at claim time, so the holding cell is always immediately flushed.

Consider adding a test where an MPP batch claim happens while the channel is awaiting a RAA (e.g., after sending a commitment_signed but before receiving revoke_and_ack). This would exercise:

  1. The build_preimage_only_monitor_update path
  2. The ID-shifting logic for blocked monitor updates
  3. The eventual holding cell flush when RAA arrives (producing a combined update with redundant preimage steps)
