fix(weka): guarantee trailing-user turns; drop raw-payload re-validation + fix batch-flush shutdown loss by ajcasagrande · Pull Request #12 · SemiAnalysisAI/aiperf

ajcasagrande · 2026-06-30T17:32:32Z

Two independent fixes on top of cquil11/agentx-v0.4-sync.

1. Weka: guarantee trailing-user turns (`feat(weka)`)

Reconstructed weka turns could end with a trailing assistant segment when a block-aligned context pull-back truncated onto an assistant block — 376 / 98,827 turns (0.38%) on the 062126 corpus. A chat request must end with a user message.

New compute_asst_block_caps: a role-independent forward pass that finds every degenerate pull-back's truncation target and caps the assistant block count of the turn that created that block, so the boundary lands on a user block. The boundary is fixed at creation and never relabeled → cross-turn KV-cache reuse preserved (relabel-after-emission would diverge the cached prefix).
Wired into all five reconstruction loops (serial parent/child/flat-chain + parallel-worker parent/child).
Extracted compute_turn_block_geometry as the single source of truth advance_turn and the planner share; the planner mutates its block tile in place (O(new)/turn).
The only theoretically-unfixable shape (a turn that appends zero new tokens, e.g. a system-only turn 0) stays flagged on _trailing_non_user_turns; none occur in the corpus.

Validated on the full 393-trace corpus: trailing-assistant 376 → 0, 0 crashes, byte-exact sum(seg.tokens) == in_tokens preserved.

2. Raw payload: drop per-record orjson re-validation + fix batch-flush shutdown loss (`fix(raw-payload)`)

Removed the per-send orjson.loads round-trip in InferenceClient and the per-record orjson.loads validation in RawRecordWriterProcessor. payload_bytes are validated at dataset-load time (or produced by orjson.dumps), so re-parsing every request/record only reintroduced the decode cost the verbatim / orjson.Fragment fast paths exist to avoid. Invalid bytes now forward/splice verbatim.
Fixed a pre-existing shutdown data-loss race: RawRecordWriterProcessor's batch-trigger flush task was scheduled via execute_async without being registered in _flush_tasks, so _stop_all_tasks cancelled it before _close_file awaited it — losing the whole batch on stop. Now registered like the parent BufferedJSONLWriterMixin.

Testing

uv run pytest tests/unit/dataset/loader/ — green (new helper / cap / tool-shaping tests added).
uv run pytest tests/unit/post_processors/test_raw_record_writer_adversarial.py — green (adversarial tests updated to verbatim-splice behavior; the batch-flush shutdown test now passes).
uv run pytest tests/component_integration/dataset/test_raw_payload_replay_adversarial.py -m component_integration — green.
Full 393-trace corpus reconstruction: 0 trailing-assistant, no crashes.

Pre-existing unrelated failure, not touched here: tests/unit/dataset/loader/test_weka_trace.py::test_flattened_fanout_logs_detection_summary (a log-string drift in code outside this change).

🤖 Generated with Claude Code

Reconstructed weka turns could end with a trailing assistant segment when a block-aligned context pull-back truncated onto an assistant block (376/98,827 turns on the 062126 corpus). A chat request must end with a user message. Add compute_asst_block_caps: a role-independent forward pass that finds every degenerate pull-back's truncation target and caps the assistant block count of the turn that created that block, so the boundary lands on a user block. The boundary is fixed at creation and never relabeled, preserving cross-turn KV-cache reuse. Wire the cap into all five reconstruction loops (serial parent/child/flat-chain + parallel parent/child). Extract compute_turn_block_geometry as the single source of truth advance_turn and the planner share; the planner mutates its block tile in place to stay O(new) per turn. Drives trailing-assistant turns to 0 across the full 393-trace corpus with no crashes; byte-exact sum(seg.tokens)==in_tokens preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>

…sh shutdown loss Remove the per-send orjson.loads round-trip in InferenceClient and the per-record orjson.loads validation in RawRecordWriterProcessor. payload_bytes are validated at dataset-load time (or produced by orjson.dumps), so re-parsing every request/record only reintroduces the decode cost the verbatim / orjson.Fragment paths exist to avoid. Invalid bytes now forward/splice verbatim. Also fix a pre-existing shutdown data-loss race: RawRecordWriterProcessor's batch-trigger flush task was scheduled via execute_async without being registered in _flush_tasks, so _stop_all_tasks cancelled it before _close_file awaited it, losing the whole batch on stop. Register the task like the parent BufferedJSONLWriterMixin does. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>

github-actions · 2026-06-30T17:32:41Z

Try out this PR

Quick install:

pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@e165d787faa8ba8de34feb6e5e3525eaa33a82f0

Recommended with virtual environment (using uv):

uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@e165d787faa8ba8de34feb6e5e3525eaa33a82f0

Last updated for commit: e165d78 • Browse code

ajcasagrande and others added 2 commits June 30, 2026 10:29

cquil11 merged commit cec702b into SemiAnalysisAI:cquil11/agentx-v0.4-sync Jun 30, 2026
1 of 2 checks passed

weireweire mentioned this pull request Jul 1, 2026

Fix replay requests ending with assistant messages ajcasagrande/aiperf#4

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(weka): guarantee trailing-user turns; drop raw-payload re-validation + fix batch-flush shutdown loss#12

fix(weka): guarantee trailing-user turns; drop raw-payload re-validation + fix batch-flush shutdown loss#12
cquil11 merged 2 commits into
SemiAnalysisAI:cquil11/agentx-v0.4-syncfrom
ajcasagrande:ajc/fix-trailing-assistant-roles

ajcasagrande commented Jun 30, 2026

Uh oh!

github-actions Bot commented Jun 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

ajcasagrande commented Jun 30, 2026

1. Weka: guarantee trailing-user turns (feat(weka))

2. Raw payload: drop per-record orjson re-validation + fix batch-flush shutdown loss (fix(raw-payload))

Testing

Uh oh!

github-actions Bot commented Jun 30, 2026

Try out this PR

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. Weka: guarantee trailing-user turns (`feat(weka)`)

2. Raw payload: drop per-record orjson re-validation + fix batch-flush shutdown loss (`fix(raw-payload)`)