Fix pinned workflow trampolining via revision-based signal suppression#9895
Fix pinned workflow trampolining via revision-based signal suppression#9895
Conversation
Previously, matching hardcoded revision number to 0 for pinned workflow tasks. This meant history could not use revision-based comparison to detect stale routing configs from lagging partitions. Now we send the actual targetDeploymentRevisionNumber (already computed from the routing config) so history can suppress stale target version signals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nals Add revision_number to LastNotifiedTargetVersion (server-internal) and DeclinedTargetVersionUpgrade (public API) protos. Thread TaskDispatchRevisionNumber from matching through to the target version change switch in AddWorkflowTaskStartedEvent. Case 4 now compares revision numbers instead of version strings: if the incoming target revision <= the declined revision, the signal is suppressed. This prevents trampolining when stale matching partitions send outdated target versions that differ from what was already declined. computeDeclinedTargetVersionUpgrade now carries the revision number from LastNotifiedTargetVersion into DeclinedTargetVersionUpgrade at continue-as-new time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7847232 to
1ec2f8b
Compare
TestStalePartition_RevisionSuppressesTrampolining verifies that when a stale matching partition sends an outdated target version with a low revision number, the revision-based comparison in case 4 of the target version change switch suppresses the signal. Also verifies that a genuinely new version with a higher revision correctly fires the signal. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1ec2f8b to
de79e50
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit de79e50. Configure here.
| reflect "reflect" | ||
| time "time" | ||
|
|
||
| gomock "github.com/golang/mock/gomock" |
There was a problem hiding this comment.
Mock regenerated with wrong gomock library version
Medium Severity
chasm_tree_mock.go was regenerated using the deprecated github.com/golang/mock/gomock instead of the project-standard go.uber.org/mock/gomock. All other mock files in the interfaces package (engine_mock.go, shard_context_mock.go, workflow_context_mock.go) use go.uber.org/mock/gomock, and all test files that call NewMockChasmTree create controllers via go.uber.org/mock/gomock. Since these are distinct Go types, this introduces a type incompatibility. The isgomock struct{} field was also dropped, confirming the wrong mockgen binary was used. The github.com/golang/mock dependency was also promoted from indirect to direct in go.mod to support this.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit de79e50. Configure here.


Summary
targetDeploymentRevisionNumberfor pinned workflows instead of hardcoded 0revision_numberfield toLastNotifiedTargetVersion(server-internal) andDeclinedTargetVersionUpgrade(public API)targetRevisionNumber <= declined.RevisionNumberinstead of version string comparison, preventing trampolining when stale matching partitions send outdated target versionsProblem
When a pinned workflow CaNs and declines a target version upgrade, a stale matching partition can send an older target version. The old case 4 compared version strings (
declined.buildId == target.buildId), which didn't match the stale version — causing the workflow to re-signal, CaN again, and trampoline indefinitely between stale and up-to-date partitions.Test plan
TestStalePartition_RevisionSuppressesTrampolining— integration test that simulates a stale partition viarollbackTaskQueueToVersionand verifies:API repo dependency
temporalio/api@trampolining-rev-numbertemporalio/api-go@trampolining-rev-number🤖 Generated with Claude Code
Note
Medium Risk
Touches persistence proto/schema and the
AddWorkflowTaskStartedEventAPI by adding revision numbers, which can impact workflow task dispatch/continue-as-new behavior if mismatched across components. Most call sites are updated, but compatibility and migration/regeneration of generated protos and mocks is the main risk.Overview
Adds a
revision_numberto persistence messageLastNotifiedTargetVersion(with regeneratedexecutions.pb.go) so the server can carry the task-queue routing config revision alongside the last notified deployment version.Extends
MutableState.AddWorkflowTaskStartedEventto accept a task-dispatch revision number and threads it through workflow-task start paths (and tests), defaulting to0where not available.Updates module deps and mocks: bumps
go.temporal.io/apito a newer commit, makesgithub.com/golang/mocka direct dependency, and regenerateschasm_tree_mock.goto usegithub.com/golang/mock/gomock.Reviewed by Cursor Bugbot for commit de79e50. Bugbot is set up for automated code reviews on this repo. Configure here.