retry and error parsing by gurudatta-patil · Pull Request #9878 · temporalio/temporal

gurudatta-patil · 2026-04-08T22:11:10Z

What changed?

When the SQL persistence schema version check runs at startup (verifyPersistenceCompatibleVersion), it now retries automatically on transient Unavailable errors instead of failing immediately. The underlying error wrapping was also fixed to use %w so the error type is preserved through the chain and the retry predicate can actually detect it.

Why?

when a node is replaced, the pod restarts before the network is fully ready. The DB might be reachable within seconds, but the one-shot startup check would fail hard and put the server into CrashLoopBackOff requiring a manual redeploy to recover.

How did you test it?

built
covered by existing tests

Potential risks

The retry has no expiration (NoInterval), so if the DB is permanently unreachable, the server will block at startup indefinitely rather than crashing

Every failed attempt logs a Warn

#8202

gurudatta-patil added 2 commits April 8, 2026 17:02

retry and error parsing

83c92b7

log db not connecting

83cd8d5

gurudatta-patil requested review from a team as code owners April 8, 2026 22:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

retry and error parsing#9878

retry and error parsing#9878
gurudatta-patil wants to merge 2 commits intotemporalio:mainfrom
gurudatta-patil:fix/retry-and-error-parsing

gurudatta-patil commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

gurudatta-patil commented Apr 8, 2026

What changed?

Why?

How did you test it?

Potential risks

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant