fix(qbft_manager): fast-fail expired late callers when no entry remains#3
Open
peter941221 wants to merge 1 commit intodiegomrsantos:test/qbft-late-roundsfrom
Conversation
Author
|
Quick follow-up on this narrow stacked fix because linked issue The branch is still intentionally narrow and still intentionally stacked on The claim remains just:
All visible checks on this PR are still green. If you would prefer to wait for |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PR Body 914
Title
qbft_manager: fast-fail expired late callers when no entry remainsBase / Stack Note
sigp/anchor#719.#719already keeps decided instances alive until their beacon-chain deadline, so pre-deadline late callers can still receive the cached result.sigp/anchor#914: once cleanup has removed the entry entirely, a post-deadline local caller should not respawn a useless shadow instance.Summary
Two local duty paths can independently call
decide_instance(...)for the same logical committee-slot decision. After#719, a decided instance is intentionally retained until its deadline so late callers can still reuse the result. The remaining hole is the post-deadline case: once the cleaner removes the entry, a later caller can still hitVacant, spawn a duplicate instance, and then wait through a pointless max-round timeout.This patch makes that post-deadline path fail immediately when the entry is already gone, while preserving the current reuse path when an entry is still registered.
Changes
current_slot_exceeds_deadline(...)helper inqbft_managerdecide_instance(...):current_slot > deadlineand no entry exists, returnOk(Completed::TimedOut)immediatelycurrent_slot > deadlineand an entry still exists, preserve the current reuse pathWhy This Shape
I considered a blunter rule of "if expired, always fast-fail", but that would regress the good part of
#719: a late caller should still be able to reuse a decided result while the entry is intentionally retained before cleanup.So the narrower invariant here is:
#719already establishedValidation
cargo fmt --all --checkcargo test -p qbft_managercargo clippy -p qbft_manager --tests -- -D warningsgit diff --checkResidual Scope
This PR only seals the local late-caller path in
decide_instance(...).receive_data(...) -> pass_to_instance(...)still usesget_or_spawn_instance(...)unconditionally, so expired inbound network traffic remains a separate lifecycle-policy concern. I am keeping that out of this patch because it likely belongs with the broader timing-policy consolidation discussed insigp/anchor#916.References
https://github.com/sigp/anchor/issues/914https://github.com/sigp/anchor/pull/719https://github.com/sigp/anchor/issues/916https://eips.ethereum.org/EIPS/eip-7045https://doi.org/10.1145/78969.78972https://doi.org/10.1145/42282.42283