feat(validator): salted-hash keeper for density tapering #290

Merged. Thykof merged 3 commits into main from feat/randomize-tapering-keeper on Jul 2, 2026.

@Thykof Thykof commented Jul 1, 2026

Collaborator

What

Density tapering keeps one validator_request per (asset, bucket) and soft-deletes the rest. Until now the keeper was the smallest id — i.e. always the earliest request in the bucket. This changes the keeper to the row with the smallest md5(id || THINNING_SALT), so the kept request is spread across the bucket instead of always being its earliest member.
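
As a rough illustration of the selection rule described above, here is a minimal Python sketch. The function name `keeper_id`, the sample ids, and the salt value are hypothetical; the actual handler does this in SQL, and the sketch assumes the id is rendered as text and concatenated with the salt before hashing.

```python
import hashlib


def keeper_id(ids, salt=""):
    """Return the id whose md5(str(id) + salt) hex digest is smallest.

    Hex digests have a fixed length, so comparing them as strings
    orders them the same way as comparing the underlying numbers.
    """
    def digest(row_id):
        return hashlib.md5(f"{row_id}{salt}".encode()).hexdigest()

    return min(ids, key=digest)


# Hypothetical request ids sharing one (asset, bucket).
bucket = [101, 102, 103, 104, 105]

old_keeper = min(bucket)                              # previous rule: smallest id
new_keeper = keeper_id(bucket, salt="example-salt")   # new rule: smallest salted hash
fallback_keeper = keeper_id(bucket)                   # unset salt: md5(id)
```

With the old rule the keeper is always the earliest id in the bucket; with the hashed rule it can be any member, but it is the same member every time the same salt is used.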

Why it's safe

The hash is stable per row, so tapering stays idempotent. Once the row with the bucket's globally smallest hash becomes eligible, it is always rn=1 and is never thinned; until then, a transient keeper (the smallest-hash row among those already eligible) covers the bucket. Every bucket therefore always keeps ≥1 row and converges to exactly one keeper well before it enters scoring range (~24h later for LOW). The latest_per_asset protection for the low-latency feed is unchanged. The change applies to both LOW and HIGH cycles.
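
The convergence argument can be checked with a small simulation. This is a toy model under stated assumptions, not the handler's logic: rows of one bucket become eligible one at a time in id order, each tapering pass keeps the smallest-hash row among the alive and eligible rows, and rows that are not yet eligible are left untouched. The names `taper`, `row_hash`, and `SALT` are hypothetical.

```python
import hashlib

SALT = "example-salt"  # hypothetical value


def row_hash(row_id, salt):
    return hashlib.md5(f"{row_id}{salt}".encode()).hexdigest()


def taper(alive, eligible, salt):
    """One tapering pass over a single bucket.

    Among rows that are both alive and eligible, keep the min-hash row
    and thin the rest. Rows that are not yet eligible stay alive.
    """
    candidates = alive & eligible
    if not candidates:
        return set(alive)
    keeper = min(candidates, key=lambda r: row_hash(r, salt))
    return (alive - candidates) | {keeper}


ids = list(range(1, 9))   # hypothetical ids in one (asset, bucket)
alive = set(ids)
sizes = []                # number of alive rows after each pass

for k in range(1, len(ids) + 1):
    eligible = set(ids[:k])        # one more row becomes eligible per pass
    alive = taper(alive, eligible, SALT)
    sizes.append(len(alive))

global_keeper = min(ids, key=lambda r: row_hash(r, SALT))
```

In this model the bucket never drops to zero rows, the final survivor is the global min-hash row, and running another pass changes nothing.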

Config

  • THINNING_SALT is optional (added to .env.example). Set → salted selection; unset → the handler logs a one-time warning and falls back to md5(id). No crash either way.
  • Only the scoring cycle runs tapering, so only that pod reads the salt.

Tests

  • Existing thinning tests rewritten to a keeper-agnostic invariant (exactly one survivor per (asset, bucket), rest tombstoned) via a _split_alive_thinned helper.
  • Added: keeper stable across re-runs (idempotency), keeper == argmin(md5(id || salt)), and the unset-salt md5(id) fallback.
  • conftest sets a stable THINNING_SALT for reproducibility.
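
A keeper-agnostic invariant check could look roughly like the sketch below. The PR does not show the real `_split_alive_thinned` helper, so the function here is a hypothetical stand-in with an assumed row shape (dicts with `id`, `asset`, `bucket`, and a `deleted_at` soft-delete marker).

```python
from collections import defaultdict


def split_alive_thinned(rows):
    """Split rows into (alive, thinned) by the assumed soft-delete marker."""
    alive = [r for r in rows if r["deleted_at"] is None]
    thinned = [r for r in rows if r["deleted_at"] is not None]
    return alive, thinned


def assert_one_survivor_per_bucket(rows):
    """Assert exactly one alive row per (asset, bucket), whichever row it is."""
    alive, thinned = split_alive_thinned(rows)
    survivors = defaultdict(list)
    for r in alive:
        survivors[(r["asset"], r["bucket"])].append(r["id"])
    for key in {(r["asset"], r["bucket"]) for r in rows}:
        assert len(survivors[key]) == 1, f"{key}: survivors={survivors[key]}"
    # Every row is either alive or tombstoned.
    assert len(alive) + len(thinned) == len(rows)


# Hypothetical post-tapering state: the survivor is not the smallest id.
rows_ok = [
    {"id": 1, "asset": "BTC", "bucket": 0, "deleted_at": "2026-07-01"},
    {"id": 2, "asset": "BTC", "bucket": 0, "deleted_at": None},
    {"id": 3, "asset": "BTC", "bucket": 0, "deleted_at": "2026-07-01"},
    {"id": 4, "asset": "ETH", "bucket": 0, "deleted_at": None},
]
assert_one_survivor_per_bucket(rows_ok)
```

Because the assertion never names the surviving id, it holds for any salt, which is the property the rewritten tests rely on.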

🤖 Generated with Claude Code

Thykof and others added 2 commits July 1, 2026 21:20
Pick the kept validator_request per (asset, bucket) by
md5(id || THINNING_SALT) instead of the smallest id, so the kept
request is spread across the bucket rather than always its earliest
member. The hash is stable per row, so tapering stays idempotent: the
global-min-hash row, once eligible, is always rn=1 and never thinned,
and every bucket converges to exactly one keeper before scoring range.

THINNING_SALT is optional (documented in .env.example); when unset the
handler warns once and falls back to md5(id).

Tests assert the keeper-agnostic invariant (one survivor per bucket),
determinism across runs, the md5(id || salt) selection, and the unset
fallback.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The keeper is now a salted-hash pick, so the two latest_per_asset tests
can no longer assume the smallest-id row is the keeper:

- drops_protection_once_latest_ages_past_time_length (the CI failure):
  both rows are past time_length, so latest_per_asset is empty and the
  bucket collapses to one randomized keeper. Assert exactly one survivor
  instead of naming it — two survivors would mean the stale latest was
  wrongly protected.
- preserves_latest_request_per_asset_during_gap: if the latest wins the
  hash draw the older (unprotected) row is thinned, so "both alive" no
  longer holds. Assert the freshest row is always alive (as keeper or via
  latest_per_asset), which is what the test is named to prove.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
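
The two adjusted assertions can be illustrated with a toy model. It assumes, based on the commit message above, that the protected latest row still takes part in the hash draw and additionally survives when it loses. The function `survivors` and the ids are hypothetical and are not taken from the test suite.

```python
import hashlib


def survivors(bucket_ids, protected_ids, salt):
    """Ids left alive in one bucket: the min-hash keeper plus protected rows."""
    keeper = min(
        bucket_ids,
        key=lambda i: hashlib.md5(f"{i}{salt}".encode()).hexdigest(),
    )
    return {keeper} | (set(bucket_ids) & set(protected_ids))


older, latest = 1, 2                          # hypothetical ids in the same bucket
salts = [f"salt-{n}" for n in range(20)]      # try several hash draws

# Latest still protected: it is alive whether or not it wins the draw.
during_gap = [survivors([older, latest], {latest}, s) for s in salts]

# Protection dropped (latest_per_asset empty): exactly one row survives.
after_aging = [survivors([older, latest], set(), s) for s in salts]
```

In the first case the survivor set is either `{latest}` or `{older, latest}` depending on the draw, which is why "both alive" is no longer a safe assertion while "latest is alive" is.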

Copilot AI left a comment

Contributor

Pull request overview

Updates validator density tapering to pick the per-(asset, bucket) keeper request via a salted MD5 of the validator_request id, instead of always keeping the smallest id, so the kept request is spread across the bucket while remaining deterministic across re-runs.

Changes:

  • Change tapering keeper selection to ORDER BY md5(vr_id::text || :salt) with optional THINNING_SALT (fallback to md5(id) when unset).
  • Rewrite/extend thinning tests to be keeper-agnostic where appropriate, plus add determinism and salted-hash keeper assertions.
  • Add THINNING_SALT documentation to .env.example and set a stable salt in the test suite.
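
The `ORDER BY md5(vr_id::text || :salt)` selection can be tried out locally with SQLite standing in for the production database. This is a sketch under assumptions: the table layout (`asset` and `bucket` as plain columns) is invented for the example, SQLite has no built-in `md5` so one is registered from Python, `CAST(id AS TEXT)` replaces the `::text` cast, and window functions require SQLite 3.25 or newer.

```python
import hashlib
import sqlite3


def md5_hex(text):
    return hashlib.md5(text.encode()).hexdigest()


conn = sqlite3.connect(":memory:")
conn.create_function("md5", 1, md5_hex)  # SQLite lacks md5(); register our own

conn.execute(
    "CREATE TABLE validator_request (id INTEGER PRIMARY KEY, asset TEXT, bucket INTEGER)"
)
conn.executemany(
    "INSERT INTO validator_request VALUES (?, ?, ?)",
    [(1, "BTC", 0), (2, "BTC", 0), (3, "BTC", 0), (4, "ETH", 0), (5, "ETH", 0)],
)

# rn = 1 marks the keeper of each (asset, bucket); all other rows would be thinned.
KEEPERS_SQL = """
SELECT id, asset, bucket FROM (
    SELECT id, asset, bucket,
           ROW_NUMBER() OVER (
               PARTITION BY asset, bucket
               ORDER BY md5(CAST(id AS TEXT) || :salt)
           ) AS rn
    FROM validator_request
)
WHERE rn = 1
ORDER BY asset, bucket
"""


def keepers(salt):
    """Return one (id, asset, bucket) row per bucket for the given salt."""
    return conn.execute(KEEPERS_SQL, {"salt": salt}).fetchall()


salted = keepers("example-salt")
unsalted = keepers("")  # empty salt reduces to md5(id), matching the documented fallback
```

Passing an empty string as the salt gives the same ordering as hashing the id alone, so a single query can cover both the salted and the fallback path.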

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| synth/validator/miner_data_handler.py | Implements salted-hash keeper selection and logs when salt is missing. |
| tests/test_miner_data_handler.py | Updates thinning assertions to be keeper-agnostic and adds new keeper/determinism tests. |
| tests/conftest.py | Sets a stable THINNING_SALT for reproducible tests. |
| .env.example | Documents optional THINNING_SALT configuration. |

Comment on lines +89 to +99
# Optional salt for density-tapering keeper selection. The kept
# validator_request per bucket is picked by md5(id || salt), so the
# keeper is spread across the bucket rather than always its earliest
# member. Unset falls back to md5(id); warn once so that's not
# silent.
self.thinning_salt = os.getenv("THINNING_SALT", "")
if not self.thinning_salt:
    bt.logging.warning(
        "THINNING_SALT is not set; density-tapering keeper "
        "selection falls back to md5(id)."
    )
Comment on lines +807 to +812
The keeper is the row with the smallest `md5(id || thinning_salt)`,
not the smallest id, so the kept request is spread across the bucket
rather than always being its earliest member. The hash is stable per
row (idempotent across runs): the global-min-hash row, once eligible,
is always rn=1 and never thinned, so every bucket keeps at least one
row and converges to exactly one keeper well before it enters scoring
range.
Comment thread tests/conftest.py
Comment on lines +36 to +41
@pytest.fixture(scope="session", autouse=True)
def thinning_salt_env():
    # Provide a stable THINNING_SALT so the salted-hash keeper selection is
    # reproducible across the suite.
    os.environ["THINNING_SALT"] = "test-thinning-salt"
    yield
Comment thread tests/test_miner_data_handler.py
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@Thykof Thykof merged commit a7621fa into main Jul 2, 2026
5 checks passed
@Thykof Thykof deleted the feat/randomize-tapering-keeper branch July 2, 2026 11:20