Bug description
After PR #2570 migrated vertex status updates to Server-Side Apply (SSA), vertices can get permanently stuck in a reconcile error loop and never reach Running phase, producing no pods.
Error
The controller logs the following error on every reconcile loop:
Vertex.numaflow.numaproj.io "<vertex-name>" is invalid: status.lastScaledAt: Invalid value: "null": lastScaledAt in body must be of type string: "null"
Root cause
With SSA, the entire VertexStatus struct is serialized and sent as a JSON patch. A zero-value metav1.Time (i.e. LastScaledAt that has never been set) serializes to JSON null. The full CRD schema defines lastScaledAt as type: string, format: date-time without nullable: true, so the Kubernetes API server rejects the patch.
This creates a deadlock:
- Vertex created →
LastScaledAt is zero → SSA writes null → API rejects
- Vertex never reaches
Running
- Autoscaler skips it: "Vertex not in Running phase, skip scaling"
LastScaledAt never gets set → back to step 1
The trigger condition is any vertex where currentReplicas == desiredReplicas on first reconcile (e.g. a source vertex with min: 1, max: 1, or any vertex during pipeline pause with min: 0, max: 0), because the scale branch that sets LastScaledAt is never taken.
Affected versions
All versions since #2570 was merged (v1.7.x+). The minimal CRD install is not affected since status uses x-kubernetes-preserve-unknown-fields: true.
Fix
Add // +nullable to LastScaledAt in VertexStatus and MonoVertexStatus so the generated CRD schema includes nullable: true, allowing the API server to accept null for this field.
Bug description
After PR #2570 migrated vertex status updates to Server-Side Apply (SSA), vertices can get permanently stuck in a reconcile error loop and never reach
Runningphase, producing no pods.Error
The controller logs the following error on every reconcile loop:
Root cause
With SSA, the entire
VertexStatusstruct is serialized and sent as a JSON patch. A zero-valuemetav1.Time(i.e.LastScaledAtthat has never been set) serializes to JSONnull. The full CRD schema defineslastScaledAtastype: string, format: date-timewithoutnullable: true, so the Kubernetes API server rejects the patch.This creates a deadlock:
LastScaledAtis zero → SSA writesnull→ API rejectsRunningLastScaledAtnever gets set → back to step 1The trigger condition is any vertex where
currentReplicas == desiredReplicason first reconcile (e.g. a source vertex withmin: 1, max: 1, or any vertex during pipeline pause withmin: 0, max: 0), because the scale branch that setsLastScaledAtis never taken.Affected versions
All versions since #2570 was merged (v1.7.x+). The minimal CRD install is not affected since
statususesx-kubernetes-preserve-unknown-fields: true.Fix
Add
// +nullabletoLastScaledAtinVertexStatusandMonoVertexStatusso the generated CRD schema includesnullable: true, allowing the API server to acceptnullfor this field.