fix(review): gate idempotency on a recorded verdict, not run status#344
Open
neversettle17-101 wants to merge 2 commits into
Open
fix(review): gate idempotency on a recorded verdict, not run status#344neversettle17-101 wants to merge 2 commits into
neversettle17-101 wants to merge 2 commits into
Conversation
Engine.Trigger's per-commit idempotency check short-circuited on any non-failed run for (session, targetSHA), including a row still ReviewRunRunning whose reviewer process had actually died (e.g. the user closed the reviewer's terminal mid-run, bypassing Submit). Nothing transitions such a row out of Running, so retrying review for the same commit returned "already up to date" permanently, until a new commit was pushed (#342). Now when the existing pass is Running, Trigger checks launcher liveness via the review's reviewer handle before reusing it: if alive, behavior is unchanged (reuse the in-progress run, no double-spawn); if not alive, the stale row is marked Failed and Trigger falls through to spawn a fresh run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per maintainer direction on #342: pane liveness can't distinguish "still actively reviewing" from "pane left open after the user stopped the agent mid-run," so an Alive()-gated check still misses the reported case. Trigger's idempotency now treats a pass as "already up to date" only when it actually carries a verdict (set exclusively by Submit), not merely a non-Failed status. An un-verdicted Running pass is never trusted: it's superseded (marked Failed) and a fresh pass is started — the live pane is notified if still alive, or a new one spawned if not. A Running row that loses the supersede race to a concurrent Submit is re-read and its verdict trusted instead of starting a redundant pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #342
Root cause
Engine.Trigger(backend/internal/review/review.go) gated per-commit idempotency on run status: it returned any existing non-Failedrun for(session, targetSHA)as "already up to date." That includes a row stillReviewRunRunningwhose reviewer was stopped mid-run without ever callingSubmit(which is the only thing that records a verdict). Such a row staysRunningforever, so retrying review for the same commit kept short-circuiting with "Review is already up to date for this commit" until a new commit was pushed.Per the maintainer's corrected direction: an
Alive()-gated check still doesn't cover the reported case, because pane liveness can't distinguish "still actively reviewing" from "pane left open after the user stopped/interrupted the agent" — both report alive.Fix
Idempotency is now gated on whether a verdict was actually recorded (set exclusively by
Submit), not on run status:existing.Verdict != VerdictNone→ genuinely done; return as-is withCreated: false.existing.Status == ReviewRunRunningwith no verdict → never trusted. The stale row is superseded (markedFailed, body"superseded by a new review trigger") andTriggerfalls through to start a fresh pass — reusing the pane viaNotifyif it's still alive, orSpawning a new one if not.UpdateReviewRunResultreports the row was no longerRunning(lost the race to a concurrentSubmit), the row is re-read and its verdict trusted instead of starting a redundant pass.A review the user stopped mid-run (Ctrl-C, killed terminal, or killed pane) can now always be retried.
Tests
In
internal/review:TestTriggerIsIdempotentForSameCommit(updated): now requires an actual recorded verdict to reuse a run.TestTriggerRespawnsWhenRunningRowHasNoVerdict(new): dead pane → stale row markedFailed, freshSpawn,Created: true.TestTriggerNotifiesLiveReviewerWhenRunningRowHasNoVerdict(new): alive pane → stale row superseded, live pane re-Notifyd (no redundant second spawn),Created: true.TestTriggerConcurrentSameWorkerNeverDoubleSpawns(renamed/adjusted): asserts the invariant that still holds under verdict-gating — concurrent triggers never spawn two independent reviewer panes for the same commit (fix(review): concurrent Trigger double-spawns reviewers (no serialization, no unique constraint) #242), even if they're no longer a pure single-row no-op.go build ./...,go vet ./internal/review/, andgo test -race ./internal/review/all pass. (Pre-existing, unrelated failures ininternal/adapters/runtime/zellijreproduce onmainand depend on the local zellij binary.)🤖 Generated with Claude Code