feat(p2p): log publish-side gossip diagnostics before publish #401

Draft
MegaRedHand wants to merge 1 commit into main from feat/publish-side-gossip-diagnostics

@MegaRedHand
Collaborator

What

Emit a diagnostic log line on every gossip publish (block, attestation, aggregation), immediately after snappy encoding and before publish(), so a captured on-wire message can be reconciled against what a peer logs on receipt. Built to debug cross-client snappy/SSZ corruption such as blockblaz/zeam#942.

Fields logged

| Field | Meaning |
|---|---|
| topic | gossipsub topic string (on-wire) |
| slot | message slot |
| proposer / block_root | block path only ("if available") |
| validator | attestation path only |
| ssz_sha256 | SHA256 of the uncompressed SSZ payload |
| compressed_sha256 | SHA256 of the snappy-compressed (on-wire) bytes |
| compressed_len | on-wire byte length |
| snappy_self_decode_ok | self round-trip decompress(compressed) == ssz; local-encoder corruption canary |
| message_id | gossipsub message id, computed via the same function peers use |
| git_sha | client build SHA |
| snappy | snappy lib + resolved version (rust-snap/<ver>) |
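
The snappy_self_decode_ok canary from the table can be illustrated with a minimal sketch. This is not the PR's code: the function name `self_decode_ok` and the injected `decompress` closure are hypothetical, chosen so the example compiles with the standard library alone. The real publish path would pass its own snappy decoder.

```rust
/// Hypothetical helper: returns true only when decoding `compressed`
/// succeeds and yields exactly the original `ssz` bytes.
/// `decompress` stands in for whatever snappy decoder the client uses.
fn self_decode_ok<E>(
    ssz: &[u8],
    compressed: &[u8],
    decompress: impl Fn(&[u8]) -> Result<Vec<u8>, E>,
) -> bool {
    match decompress(compressed) {
        // A successful decode still counts as a failure if the bytes differ.
        Ok(decoded) => decoded.as_slice() == ssz,
        // Any decoder error means the local encoder produced bad output.
        Err(_) => false,
    }
}
```

With the snap crate, the closure could wrap something like `snap::raw::Decoder::new().decompress_vec(bytes)`, assuming the raw (non-framed) format is what the publish path encodes with. Treating a decode error and a byte mismatch the same way keeps the logged field a single boolean.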

How

  • Extracted the message-id logic into a shared gossip_message_id(topic, data); the gossipsub message_id_fn now delegates to it, so the logged id provably matches what peers assign.
  • Added crates/net/p2p/build.rs to surface build-time values: VERGEN_GIT_SHA (mirrors the bin crate) and SNAP_VERSION parsed from the workspace Cargo.lock (the snap crate exposes no version const, and Cargo surfaces no dep-version env var).
  • Moved hex from a dev-dependency to a regular dependency.
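
The Cargo.lock parse mentioned in the build.rs bullet could look roughly like the sketch below. It is an illustration only, not the PR's build script: `locked_version` and `quoted_value` are hypothetical names, and the parser assumes the usual lockfile layout in which each `[[package]]` entry lists `name` before `version`.

```rust
/// Hypothetical helper: scans Cargo.lock text and returns the resolved
/// version of `crate_name`, or None if the crate is not listed.
/// If several versions are locked, the first one found is returned.
fn locked_version(lock: &str, crate_name: &str) -> Option<String> {
    let mut in_target = false;
    for line in lock.lines() {
        let line = line.trim();
        if line == "[[package]]" {
            // A new entry starts; forget the previous package.
            in_target = false;
        } else if let Some(name) = quoted_value(line, "name") {
            in_target = name == crate_name;
        } else if in_target {
            if let Some(version) = quoted_value(line, "version") {
                return Some(version.to_string());
            }
        }
    }
    None
}

/// Parses a line of the form `key = "value"` and returns `value`.
fn quoted_value<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    let rest = line.strip_prefix(key)?.trim_start();
    let rest = rest.strip_prefix('=')?.trim();
    rest.strip_prefix('"')?.strip_suffix('"')
}
```

A build script would typically read the workspace lockfile, call a helper like this with "snap", and print a `cargo:rustc-env=SNAP_VERSION=<version>` line so the crate can read the value with `env!`. A line-based scan avoids pulling a TOML parser into the build dependencies, at the cost of relying on the lockfile's field order.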

Verification

Ran a local 4-node devnet (20 slots, finalized to slot 19). Confirmed:

  • 22 block / 92 attestation / 23 aggregation diagnostic lines, every field populated.
  • git_sha=abfdd7b, snappy=rust-snap/1.1.1 (matching git rev-parse --short HEAD and Cargo.lock respectively).
  • Lines with snappy_self_decode_ok=false: 0; receive-side decompress/decode errors: 0.
  • Block sizes include the large case (compressed_len=370444, past snappy's 64 KB block boundary) that the #942 corruption requires.

Sample (block):

Publishing block to gossipsub (publish diagnostics) topic=/leanconsensus/12345678/block/ssz_snappy slot=4 proposer=0 block_root=86c842dd…0c98 ssz_sha256=46897d82…02fc compressed_sha256=18008e9e…4b84 compressed_len=370444 snappy_self_decode_ok=true message_id=0fac287f446d6401d67b18119f9f910506f91ce2 git_sha="abfdd7b" snappy="rust-snap/1.1.1"

Notes

  • These lines are info-level on every published message, so attestation subnets are chatty. It would be easy to demote attestation/aggregation lines to debug and keep blocks at info, or to gate them behind a flag, if reviewers prefer.
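
The per-kind log level suggested in the note could be sketched as follows. The `GossipKind` and `Level` enums and the `diagnostics_level` function are hypothetical and exist only for this illustration; the client's real message types and logging macros would take their place.

```rust
/// Hypothetical message kinds covered by the publish diagnostics.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum GossipKind {
    Block,
    Attestation,
    Aggregation,
}

/// Hypothetical stand-in for the logging framework's level type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Level {
    Info,
    Debug,
}

/// Keeps block diagnostics at info and demotes the chatty
/// attestation and aggregation diagnostics to debug.
fn diagnostics_level(kind: GossipKind) -> Level {
    match kind {
        GossipKind::Block => Level::Info,
        GossipKind::Attestation | GossipKind::Aggregation => Level::Debug,
    }
}
```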

Emit a diagnostic log line on every gossip publish (block, attestation,
aggregation) immediately after snappy encoding and before publish(), so a
captured on-wire message can be reconciled against what a peer logs on
receipt. This is to debug cross-client snappy/SSZ corruption such as
blockblaz/zeam#942.

Each line carries topic, slot (plus proposer/block_root for blocks),
sha256 of the SSZ and of the compressed payload, compressed_len, a
snappy self-decode round-trip check (local-encoder canary), the gossipsub
message_id, the client git SHA, and the snappy lib/version.

The message_id is computed through a shared gossip_message_id(topic, data)
that the gossipsub message_id_fn now also delegates to, so the logged id
provably matches the one peers assign. git SHA and the resolved `snap`
crate version are surfaced at build time via a new p2p build.rs (vergen
for the SHA, Cargo.lock parse for the snappy version); hex moves from a
dev-dependency to a regular dependency.