Skip to content

Fix pinned workflow trampolining via revision-based signal suppression#9895

Open
Shivs11 wants to merge 3 commits intomainfrom
trampolining-rev-number
Open

Fix pinned workflow trampolining via revision-based signal suppression#9895
Shivs11 wants to merge 3 commits intomainfrom
trampolining-rev-number

Conversation

@Shivs11
Copy link
Copy Markdown
Member

@Shivs11 Shivs11 commented Apr 9, 2026

Summary

  • Matching now sends the real targetDeploymentRevisionNumber for pinned workflows instead of hardcoded 0
  • Added revision_number field to LastNotifiedTargetVersion (server-internal) and DeclinedTargetVersionUpgrade (public API)
  • Case 4 in the target version change switch now uses targetRevisionNumber <= declined.RevisionNumber instead of version string comparison, preventing trampolining when stale matching partitions send outdated target versions

Problem

When a pinned workflow CaNs and declines a target version upgrade, a stale matching partition can send an older target version. The old case 4 compared version strings (declined.buildId == target.buildId), which didn't match the stale version — causing the workflow to re-signal, CaN again, and trampoline indefinitely between stale and up-to-date partitions.

Test plan

  • TestStalePartition_RevisionSuppressesTrampolining — integration test that simulates a stale partition via rollbackTaskQueueToVersion and verifies:
    • Stale partition (revision 0) is suppressed after declining at a higher revision
    • Genuinely new version (v4 at higher revision) correctly fires the signal
  • Verified test fails when matching + history changes are reverted (matching sends 0, old string comparison)

API repo dependency

  • api: temporalio/api@trampolining-rev-number
  • api-go: temporalio/api-go@trampolining-rev-number

🤖 Generated with Claude Code


Note

Medium Risk
Touches persistence proto/schema and the AddWorkflowTaskStartedEvent API by adding revision numbers, which can impact workflow task dispatch/continue-as-new behavior if mismatched across components. Most call sites are updated, but compatibility and migration/regeneration of generated protos and mocks is the main risk.

Overview
Adds a revision_number to persistence message LastNotifiedTargetVersion (with regenerated executions.pb.go) so the server can carry the task-queue routing config revision alongside the last notified deployment version.

Extends MutableState.AddWorkflowTaskStartedEvent to accept a task-dispatch revision number and threads it through workflow-task start paths (and tests), defaulting to 0 where not available.

Updates module deps and mocks: bumps go.temporal.io/api to a newer commit, makes github.com/golang/mock a direct dependency, and regenerates chasm_tree_mock.go to use github.com/golang/mock/gomock.

Reviewed by Cursor Bugbot for commit de79e50. Bugbot is set up for automated code reviews on this repo. Configure here.

Shivs11 and others added 2 commits April 9, 2026 20:39
Previously, matching hardcoded revision number to 0 for pinned workflow
tasks. This meant history could not use revision-based comparison to
detect stale routing configs from lagging partitions. Now we send the
actual targetDeploymentRevisionNumber (already computed from the
routing config) so history can suppress stale target version signals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nals

Add revision_number to LastNotifiedTargetVersion (server-internal) and
DeclinedTargetVersionUpgrade (public API) protos. Thread
TaskDispatchRevisionNumber from matching through to the target version
change switch in AddWorkflowTaskStartedEvent.

Case 4 now compares revision numbers instead of version strings:
if the incoming target revision <= the declined revision, the signal
is suppressed. This prevents trampolining when stale matching partitions
send outdated target versions that differ from what was already declined.

computeDeclinedTargetVersionUpgrade now carries the revision number
from LastNotifiedTargetVersion into DeclinedTargetVersionUpgrade at
continue-as-new time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Shivs11 Shivs11 force-pushed the trampolining-rev-number branch from 7847232 to 1ec2f8b Compare April 10, 2026 00:44
TestStalePartition_RevisionSuppressesTrampolining verifies that when a
stale matching partition sends an outdated target version with a low
revision number, the revision-based comparison in case 4 of the target
version change switch suppresses the signal. Also verifies that a
genuinely new version with a higher revision correctly fires the signal.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Shivs11 Shivs11 force-pushed the trampolining-rev-number branch from 1ec2f8b to de79e50 Compare April 10, 2026 00:48
@Shivs11 Shivs11 marked this pull request as ready for review April 10, 2026 00:51
@Shivs11 Shivs11 requested review from a team as code owners April 10, 2026 00:51
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit de79e50. Configure here.

reflect "reflect"
time "time"

gomock "github.com/golang/mock/gomock"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mock regenerated with wrong gomock library version

Medium Severity

chasm_tree_mock.go was regenerated using the deprecated github.com/golang/mock/gomock instead of the project-standard go.uber.org/mock/gomock. All other mock files in the interfaces package (engine_mock.go, shard_context_mock.go, workflow_context_mock.go) use go.uber.org/mock/gomock, and all test files that call NewMockChasmTree create controllers via go.uber.org/mock/gomock. Since these are distinct Go types, this introduces a type incompatibility. The isgomock struct{} field was also dropped, confirming the wrong mockgen binary was used. The github.com/golang/mock dependency was also promoted from indirect to direct in go.mod to support this.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit de79e50. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant