The Error
You run kubectl rollout status deployment/my-app and the terminal just sits there:
Waiting for deployment "my-app" rollout to finish: 1 old replicas are pending termination...
The new pod is up. It's passing health checks. But the old one won't die. Minutes pass. The deployment never reaches successfully rolled out.
Root Causes
- A PodDisruptionBudget (PDB) is blocking eviction: the budget allows zero disruptions, so Kubernetes refuses to evict the old pod. PDBs apply to evictions such as node drains, not to the Deployment controller's own rolling-update deletions, so suspect this when a drain overlaps the rollout.
- The old pod is frozen in Terminating: a finalizer, a process ignoring SIGTERM, or a slow graceful shutdown is holding it hostage.
- Not enough node capacity: the new pod can't be scheduled anywhere, so the old one stays alive as a placeholder.
- Deadlocked rolling update settings: with maxUnavailable: 0 and maxSurge: 0 together, Kubernetes can't make a single move.
- The new pod never turns Ready: Kubernetes won't remove the old replica until its replacement is healthy. A broken readiness probe silently stalls the whole rollout.
Diagnose First
Don't guess. Run these commands to pinpoint exactly what's stuck:
# See the full rollout state
kubectl rollout status deployment/my-app
# List all pods from this deployment with node placement
kubectl get pods -l app=my-app -o wide
# Inspect events on the stuck pod
kubectl describe pod <terminating-pod-name>
# Check if a PodDisruptionBudget is blocking
kubectl get pdb
kubectl describe pdb <pdb-name>
If the pod has been in Terminating for more than 5 minutes, well past the default 30-second grace period, a finalizer is a likely suspect. Check:
kubectl get pod <terminating-pod-name> -o json | jq '.metadata.finalizers'
A non-empty array here is your culprit.
Fix 1: Force-Delete a Pod Stuck in Terminating
Pod been Terminating for 10+ minutes? Normal deletion is unlikely to finish on its own. Force it:
kubectl delete pod <terminating-pod-name> --grace-period=0 --force
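This skips the graceful shutdown window entirely and removes the pod object from the API without waiting for the node to confirm the container has stopped. Only use it after confirming the pod is genuinely frozen, not just a slow Java service that needs 90 seconds to drain connections.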
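To tell a slow shutdown from a frozen one, it can help to compare the current time against the pod's metadata.deletionTimestamp, which the API server sets to the moment the grace period is due to expire. The following is a minimal sketch, not a definitive check: the helper name seconds_past_deadline is hypothetical, and the commented usage assumes GNU date for the timestamp conversion.

```shell
# seconds_past_deadline DEADLINE_EPOCH NOW_EPOCH
# Prints how many seconds the pod has outlived its termination deadline,
# or 0 if the deadline has not passed yet. Both arguments are Unix epochs.
seconds_past_deadline() {
  deadline=$1
  now=$2
  overdue=$((now - deadline))
  if [ "$overdue" -lt 0 ]; then
    overdue=0
  fi
  echo "$overdue"
}

# Assumed usage against a live cluster (requires kubectl and GNU date):
#   ts=$(kubectl get pod <terminating-pod-name> -o jsonpath='{.metadata.deletionTimestamp}')
#   seconds_past_deadline "$(date -d "$ts" +%s)" "$(date +%s)"
```

A result of 0 means the pod is still inside its grace period and should be left alone. A result in the hundreds suggests something other than a slow drain is holding it.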
If a finalizer is blocking deletion, strip it out manually:
kubectl patch pod <terminating-pod-name> -p '{"metadata":{"finalizers":[]}}' --type=merge
Fix 2: Adjust or Temporarily Remove the PodDisruptionBudget
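kubectl describe pdb showing Allowed disruptions: 0? Then every eviction of a matching pod is refused. That usually means the PDB's minAvailable equals the number of currently healthy pods (or maxUnavailable is 0), so no pod can be touched. PDBs gate evictions such as node drains rather than the Deployment controller's own deletions, so this is the blocker mainly when the old pod is waiting on a drain. Three ways out: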
# Option A: scale up to create headroom (safest)
kubectl scale deployment/my-app --replicas=<current+1>
# Option B: relax the PDB minimum temporarily
kubectl patch pdb <pdb-name> -p '{"spec":{"minAvailable":1}}'
# Option C: delete the PDB entirely until rollout finishes
kubectl delete pdb <pdb-name>
Option A is safest for production: you're adding capacity rather than reducing protection. Restore the original PDB config once the rollout completes.
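The arithmetic behind Allowed disruptions can be sketched as follows, assuming a PDB that uses an absolute minAvailable (percentages and maxUnavailable are not handled here). The helper name allowed_disruptions is hypothetical; it only mirrors the rule that the allowance is the healthy pod count minus the required minimum, floored at zero.

```shell
# allowed_disruptions HEALTHY_PODS MIN_AVAILABLE
# Prints how many pods could be evicted right now without violating the budget.
allowed_disruptions() {
  healthy=$1
  min_available=$2
  allowed=$((healthy - min_available))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

allowed_disruptions 3 3   # 3 replicas, minAvailable 3: prints 0 (blocked)
allowed_disruptions 4 3   # Option A, one extra healthy replica: prints 1
allowed_disruptions 3 1   # Option B, minAvailable relaxed to 1: prints 2
```

This is why Option A works without touching the PDB: one extra healthy replica is enough to move the allowance from 0 to 1.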
Fix 3: Fix Readiness Probe Failures on the New Pod
A rollout that stalls silently with no obvious errors is almost always a readiness probe issue. The new pod starts, never turns Ready, and Kubernetes keeps the old one alive indefinitely.
kubectl describe pod <new-pod-name> | grep -A 10 Readiness
kubectl logs <new-pod-name>
Two common scenarios: the app needs 45 seconds to initialize but initialDelaySeconds is set to 10, or the /health endpoint got renamed to /healthz in the new image. Fix the probe config:
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  failureThreshold: 3
Fix 4: Tune Rolling Update Strategy
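Setting both maxUnavailable: 0 and maxSurge: 0 describes a deadlock. Kubernetes can't bring up a new pod (surge is 0) and can't remove the old one (unavailable is 0). Nothing moves. The API server's validation is expected to reject this exact pair, so if your manifest contains it, also confirm that the last apply actually succeeded. Check what the live Deployment is running with: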
kubectl get deployment my-app -o yaml | grep -A 5 strategy
Allow at least 1 surge pod and the rollout unblocks immediately:
kubectl patch deployment my-app -p '{
  "spec": {
    "strategy": {
      "rollingUpdate": {
        "maxUnavailable": 0,
        "maxSurge": 1
      }
    }
  }
}'
maxUnavailable: 0, maxSurge: 1 is a common zero-downtime choice (the Kubernetes defaults are 25% for both fields). It runs one extra pod during the transition, then removes the old one once the replacement is healthy.
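The effect of the two settings can be sketched as simple bounds on the pod count during a rollout. This is an illustration under stated assumptions, not the controller's actual code: it handles absolute values only (no percentages), and the helper name rollout_bounds is hypothetical.

```shell
# rollout_bounds REPLICAS MAX_SURGE MAX_UNAVAILABLE  (absolute values only)
# Prints the most pods that may exist and the fewest that must stay available
# during the rollout, plus whether the controller has any room to act.
rollout_bounds() {
  replicas=$1
  max_surge=$2
  max_unavailable=$3
  max_total=$((replicas + max_surge))
  min_available=$((replicas - max_unavailable))
  if [ "$max_surge" -eq 0 ] && [ "$max_unavailable" -eq 0 ]; then
    verdict="stuck"
  else
    verdict="can-progress"
  fi
  echo "max_total=$max_total min_available=$min_available $verdict"
}

rollout_bounds 3 0 0   # prints: max_total=3 min_available=3 stuck
rollout_bounds 3 1 0   # prints: max_total=4 min_available=3 can-progress
```

With 3 replicas and maxSurge: 1, the cluster needs room for a fourth pod for the duration of the rollout, which is the resource cost of this strategy.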
Fix 5: Rollback If Needed
New image is broken and you need production unblocked now? Roll back first, debug later:
# Roll back to the previous revision
kubectl rollout undo deployment/my-app
# Or target a specific revision
kubectl rollout history deployment/my-app
kubectl rollout undo deployment/my-app --to-revision=2
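In a deploy script, the "roll back first" rule can be automated by giving kubectl rollout status a deadline with its --timeout flag and undoing the rollout if the deadline passes. The sketch below is one possible shape, not a recommended standard: the function name rollout_or_undo, the KUBECTL override variable, and the 120s default are all assumptions made for illustration.

```shell
# rollout_or_undo DEPLOYMENT [TIMEOUT]
# Waits for the rollout to finish; if it does not finish within TIMEOUT,
# rolls back to the previous revision and returns non-zero.
# KUBECTL is a hypothetical override so the command can be swapped out.
rollout_or_undo() {
  deployment=$1
  timeout=${2:-120s}
  if "${KUBECTL:-kubectl}" rollout status "deployment/$deployment" --timeout="$timeout"; then
    return 0
  fi
  echo "rollout of $deployment did not finish within $timeout; rolling back" >&2
  "${KUBECTL:-kubectl}" rollout undo "deployment/$deployment"
  return 1
}

# Assumed usage in a pipeline step:
#   rollout_or_undo my-app 180s || exit 1
```

Returning non-zero after the undo keeps the pipeline red, so the broken image still gets investigated once production is unblocked.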
Verify the Fix
# Confirm rollout completed
kubectl rollout status deployment/my-app
# Expected: deployment "my-app" successfully rolled out
# No pods stuck in Terminating
kubectl get pods -l app=my-app
# Replica counts match desired state
kubectl get deployment my-app
Prevention
- Tune terminationGracePeriodSeconds to your app's actual shutdown time. A Spring Boot app draining a connection pool might need 60 to 90 seconds. The default 30s isn't universal.
- Test readiness probes in staging before every rollout. A probe that never passes is the most common cause of silent rollout stalls, and the easiest to catch early.
- Default to maxUnavailable: 0, maxSurge: 1 unless you have strict resource constraints. It's the safest zero-downtime configuration for most workloads.
- Keep PDB minAvailable below your replica count. If you run 3 replicas and set minAvailable: 3, you've blocked every voluntary eviction, including the node drains your own cluster upgrades rely on.
- Handle SIGTERM properly in your application. Your process should catch the signal, stop accepting new requests, drain in-flight ones, then exit cleanly. Ignoring SIGTERM is what turns a 30-second shutdown into a 10-minute frozen pod.
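The SIGTERM sequence in the last point can be sketched with a shell trap. This is a toy stand-in for a real service, written only to show the order of events: the names run_worker and on_term are hypothetical, and the one-second sleep stands in for handling a unit of work.

```shell
shutting_down=0

# Signal handler: only records that shutdown was requested.
on_term() {
  shutting_down=1
}

# run_worker: loops until SIGTERM arrives, then drains and exits cleanly.
run_worker() {
  trap on_term TERM
  while [ "$shutting_down" -eq 0 ]; do
    sleep 1   # stand-in for accepting and handling one request
  done
  echo "stopped accepting new work"
  echo "draining in-flight work"
  echo "exiting cleanly"
  return 0
}

# Assumed usage as a container entrypoint:
#   run_worker
```

The handler itself does almost nothing. The main loop notices the flag at the next safe point, which is what lets in-flight work finish before the process exits within the grace period.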

