Skip to content

fix(review): gate idempotency on a recorded verdict, not run status#344

Open
neversettle17-101 wants to merge 2 commits into
mainfrom
ao/agent-orchestrator-6/fix-stale-running-review-342
Open

fix(review): gate idempotency on a recorded verdict, not run status#344
neversettle17-101 wants to merge 2 commits into
mainfrom
ao/agent-orchestrator-6/fix-stale-running-review-342

Conversation

@neversettle17-101

@neversettle17-101 neversettle17-101 commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator

Fixes #342

Root cause

Engine.Trigger (backend/internal/review/review.go) gated per-commit idempotency on run status: it returned any existing non-Failed run for (session, targetSHA) as "already up to date." That includes a row still ReviewRunRunning whose reviewer was stopped mid-run without ever calling Submit (which is the only thing that records a verdict). Such a row stays Running forever, so retrying review for the same commit kept short-circuiting with "Review is already up to date for this commit" until a new commit was pushed.

Per the maintainer's corrected direction: an Alive()-gated check still doesn't cover the reported case, because pane liveness can't distinguish "still actively reviewing" from "pane left open after the user stopped/interrupted the agent" — both report alive.

Fix

Idempotency is now gated on whether a verdict was actually recorded (set exclusively by Submit), not on run status:

  • existing.Verdict != VerdictNone → genuinely done; return as-is with Created: false.
  • existing.Status == ReviewRunRunning with no verdict → never trusted. The stale row is superseded (marked Failed, body "superseded by a new review trigger") and Trigger falls through to start a fresh pass — reusing the pane via Notify if it's still alive, or Spawning a new one if not.
  • If the supersede UpdateReviewRunResult reports the row was no longer Running (lost the race to a concurrent Submit), the row is re-read and its verdict trusted instead of starting a redundant pass.

A review the user stopped mid-run (Ctrl-C, killed terminal, or killed pane) can now always be retried.

Tests

In internal/review:

  • TestTriggerIsIdempotentForSameCommit (updated): now requires an actual recorded verdict to reuse a run.
  • TestTriggerRespawnsWhenRunningRowHasNoVerdict (new): dead pane → stale row marked Failed, fresh Spawn, Created: true.
  • TestTriggerNotifiesLiveReviewerWhenRunningRowHasNoVerdict (new): alive pane → stale row superseded, live pane re-Notifyd (no redundant second spawn), Created: true.
  • TestTriggerConcurrentSameWorkerNeverDoubleSpawns (renamed/adjusted): asserts the invariant that still holds under verdict-gating — concurrent triggers never spawn two independent reviewer panes for the same commit (fix(review): concurrent Trigger double-spawns reviewers (no serialization, no unique constraint) #242), even if they're no longer a pure single-row no-op.

go build ./..., go vet ./internal/review/, and go test -race ./internal/review/ all pass. (Pre-existing, unrelated failures in internal/adapters/runtime/zellij reproduce on main and depend on the local zellij binary.)

🤖 Generated with Claude Code

neversettle17-101 and others added 2 commits June 20, 2026 16:41
Engine.Trigger's per-commit idempotency check short-circuited on any
non-failed run for (session, targetSHA), including a row still
ReviewRunRunning whose reviewer process had actually died (e.g. the user
closed the reviewer's terminal mid-run, bypassing Submit). Nothing
transitions such a row out of Running, so retrying review for the same
commit returned "already up to date" permanently, until a new commit was
pushed (#342).

Now when the existing pass is Running, Trigger checks launcher liveness
via the review's reviewer handle before reusing it: if alive, behavior is
unchanged (reuse the in-progress run, no double-spawn); if not alive, the
stale row is marked Failed and Trigger falls through to spawn a fresh run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per maintainer direction on #342: pane liveness can't distinguish "still
actively reviewing" from "pane left open after the user stopped the agent
mid-run," so an Alive()-gated check still misses the reported case.

Trigger's idempotency now treats a pass as "already up to date" only when
it actually carries a verdict (set exclusively by Submit), not merely a
non-Failed status. An un-verdicted Running pass is never trusted: it's
superseded (marked Failed) and a fresh pass is started — the live pane is
notified if still alive, or a new one spawned if not. A Running row that
loses the supersede race to a concurrent Submit is re-read and its verdict
trusted instead of starting a redundant pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@neversettle17-101 neversettle17-101 changed the title fix(review): retry review when a stale Running pass's reviewer died fix(review): gate idempotency on a recorded verdict, not run status Jun 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(review): Trigger's idempotency check treats a dead Running row as up-to-date, blocking re-review after the reviewer terminal is killed

1 participant