The Problem: When Pods Get Stuck in Limbo
You’ve just triggered a deployment or scaled up a service, but the new pods aren't starting. Instead, they sit in a Pending state indefinitely. When you inspect the pod's events with kubectl describe, you see a frustrating message from the scheduler:
$ kubectl describe pod my-app-7f45b6d-abcde
...
Events:
Type Warning
Reason FailedScheduling
Message 0/3 nodes are available: 1 Insufficient cpu, 2 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
This is the Kubernetes scheduler telling you it inspected every node in your cluster and found zero room. It even looked for lower-priority pods to kick out (preemption), but that wouldn't have cleared enough space either. Essentially, your cluster is at its "paper" capacity.
The Logic: Requests vs. Reality
Here is the catch: the Kubernetes scheduler makes decisions based on Requests, not real-time usage. You might look at a Grafana dashboard and see your nodes idling at 15% CPU, yet the scheduler still reports "Insufficient cpu."
When you define a pod, you specify resources.requests. The scheduler adds up the requests of every pod already running on a node. If the remaining "Allocatable" capacity is smaller than your new pod's request, that node is disqualified. For example, if a node has 8000m CPU and existing pods have requested 7500m, a new pod requesting 1000m will fail to schedule, even if the actual CPU usage is near zero.
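The check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the scheduler's actual code: the function name fits and the per-pod request values are hypothetical, and the real scheduler applies the same test to memory and other resources as well.

```python
from typing import List


def fits(allocatable_m: int, requested_m: List[int], new_request_m: int) -> bool:
    """Return True if the node has enough unreserved CPU (in millicores).

    Only requests are considered; actual CPU usage never enters the check.
    """
    remaining = allocatable_m - sum(requested_m)
    return new_request_m <= remaining


# Node from the example: 8000m allocatable, 7500m already requested.
# The split across four pods is a made-up illustration.
existing = [2000, 2000, 2000, 1500]

print(fits(8000, existing, 1000))  # False: only 500m is unreserved
print(fits(8000, existing, 500))   # True: the request exactly fills the node
```

Notice that nothing in the function looks at live metrics, which is why an idle node can still be "full" from the scheduler's point of view.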
Breaking down the error message:
- 1 Insufficient cpu: One node had its CPU capacity almost entirely reserved by other pods.
- 2 Insufficient memory: Two nodes didn't have enough unreserved RAM to meet your pod's minimum requirements.
- No preemption victims found: No node has lower-priority pods whose removal would free enough room, typically because your pod’s PriorityClass isn't high enough to displace existing workloads.
Step 1: Diagnose Cluster Capacity
Before changing your configuration, check how much of each node is already reserved. Run this command to compare allocated resources against total capacity:
kubectl describe nodes | grep -A 7 "Allocated resources"
Focus on the Requests column. If you see percentages near 95% or 100%, you've hit a bottleneck. While kubectl top nodes shows real-time metrics, the scheduler ignores those numbers in favor of the reserved request values.
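If you want to reproduce those percentages yourself, the quantities have to be converted to common units first. The sketch below is a simplified assumption of how that could look: the helper names are hypothetical, and it handles only millicore CPU values and binary memory suffixes (Ki, Mi, Gi, Ti), not the decimal suffixes or exponent notation that Kubernetes also accepts.

```python
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' or '2Gi' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treated as plain bytes


def percent_requested(requested: int, allocatable: int) -> int:
    """Share of allocatable capacity already reserved, as a whole percent."""
    return requested * 100 // allocatable


# 7500m requested on a node with 8 allocatable cores.
print(percent_requested(cpu_to_millicores("7500m"), cpu_to_millicores("8")))
```

Any node where this number approaches 100 for CPU or memory is effectively closed to new pods, regardless of what kubectl top nodes reports.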
Quick Fix: Right-Sizing Resource Requests
Many teams set high requests "just to be safe," which leaves large amounts of capacity reserved but idle. If your pod doesn't actually need a full core to start, lowering the request can solve the scheduling issue instantly.
Consider this common deployment snippet:
resources:
  requests:
    cpu: "1000m" # 1 Full Core - Is this really needed at startup?
    memory: "2Gi"
  limits:
    cpu: "2000m"
    memory: "4Gi"
Try reducing the requests while keeping the limits high. This gives the scheduler more flexibility while still allowing the pod to burst when needed:
resources:
  requests:
    cpu: "250m"
    memory: "512Mi"
  limits:
    cpu: "2000m" # limits unchanged, so the pod can still burst
    memory: "4Gi"
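To see how much headroom this change buys, here is a rough back-of-the-envelope calculation in Python. The node size is an assumption (8000m CPU as in the earlier example, plus a hypothetical 32Gi of allocatable memory), and the sketch ignores system pods and the per-node pod count limit, so treat the result as an upper bound.

```python
def max_pods(alloc_cpu_m: int, alloc_mem_mi: int, req_cpu_m: int, req_mem_mi: int) -> int:
    """Upper bound on identical pods that fit on an empty node, by requests only."""
    return min(alloc_cpu_m // req_cpu_m, alloc_mem_mi // req_mem_mi)


NODE_CPU_M = 8000          # allocatable CPU in millicores
NODE_MEM_MI = 32 * 1024    # hypothetical 32Gi of allocatable memory, in Mi

before = max_pods(NODE_CPU_M, NODE_MEM_MI, req_cpu_m=1000, req_mem_mi=2048)
after = max_pods(NODE_CPU_M, NODE_MEM_MI, req_cpu_m=250, req_mem_mi=512)

print(before, after)  # 8 32
```

Under these assumptions the same node accepts four times as many replicas, which is often enough to get a Pending pod scheduled without adding hardware.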
Permanent Fix A: Enable the Cluster Autoscaler
If your requests are already accurate and your nodes are genuinely full, you need more hardware. If you use a managed service like EKS, GKE, or AKS, you should enable the Cluster Autoscaler.
When the autoscaler detects a pod that is Unschedulable due to resource constraints, it triggers the provisioning of a new node. Once the new instance joins the cluster, the scheduler automatically places the pending pod there. This usually takes 2–5 minutes depending on your cloud provider.
Permanent Fix B: Use Priority and Preemption
Is your production API being blocked by a low-priority background job? You can use PriorityClasses to tell Kubernetes which pods matter most. This allows the scheduler to evict less important pods to make room for critical ones.
- Define a PriorityClass:
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-api
value: 1000000
globalDefault: false
description: "Use this for core production services."
- Apply it to your Pod spec:
spec:
  priorityClassName: critical-api
  containers:
  ...
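The idea behind preemption, and behind the "No preemption victims found" message from the start of this article, can be sketched as follows. This is a deliberately simplified model under stated assumptions: the Pod type, the find_victims function, and the example workloads are all hypothetical, only CPU is considered, and the real scheduler also weighs things like PodDisruptionBudgets and which node needs the fewest evictions.

```python
from typing import List, NamedTuple, Optional


class Pod(NamedTuple):
    name: str
    priority: int
    cpu_m: int  # CPU request in millicores


def find_victims(allocatable_m: int, running: List[Pod], incoming: Pod) -> Optional[List[Pod]]:
    """Return the pods to evict so `incoming` fits, or None if no set of victims works."""
    free = allocatable_m - sum(p.cpu_m for p in running)
    victims: List[Pod] = []
    # Only strictly lower-priority pods are candidates; evict the lowest first.
    candidates = sorted(
        (p for p in running if p.priority < incoming.priority),
        key=lambda p: p.priority,
    )
    for pod in candidates:
        if free >= incoming.cpu_m:
            break
        victims.append(pod)
        free += pod.cpu_m
    return victims if free >= incoming.cpu_m else None


running = [Pod("batch-job", 0, 3000), Pod("web", 1000000, 4500)]

# With the critical-api priority, the batch job can be displaced.
print(find_victims(8000, running, Pod("api", 1000000, 1000)))
# With priority 0 there is nothing lower to evict: no victims found.
print(find_victims(8000, running, Pod("other", 0, 1000)))
```

A pod with the same priority as everything else on the node can never preempt, which is why assigning a higher PriorityClass to critical services changes the outcome.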
Check for Node Selectors and Taints
Sometimes the issue isn't total capacity, but narrow constraints. If you use nodeSelector or affinity rules, or if some nodes carry taints that your pod does not tolerate, the scheduler can only consider a small subset of your nodes. For instance, if you pin a pod to instance-type: m5.large (in practice, the node.kubernetes.io/instance-type label) and those specific nodes are full, the pod will stay pending even if you have ten empty t3.medium nodes available.
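A small Python sketch makes the effect of a selector concrete. The node list, the free_cpu_m field, and the schedulable function are hypothetical stand-ins for cluster state; the label key node.kubernetes.io/instance-type is the well-known Kubernetes label for instance types. Taints and affinity rules are left out to keep the example short.

```python
from typing import Dict, List

INSTANCE_TYPE = "node.kubernetes.io/instance-type"

# Hypothetical cluster snapshot: one nearly full m5.large, one empty t3.medium.
nodes = [
    {"name": "m5-a", "labels": {INSTANCE_TYPE: "m5.large"}, "free_cpu_m": 100},
    {"name": "t3-a", "labels": {INSTANCE_TYPE: "t3.medium"}, "free_cpu_m": 1800},
]


def schedulable(nodes: List[dict], node_selector: Dict[str, str], request_cpu_m: int) -> List[str]:
    """Names of nodes that match every selector label and have room for the request."""
    return [
        n["name"]
        for n in nodes
        if all(n["labels"].get(k) == v for k, v in node_selector.items())
        and n["free_cpu_m"] >= request_cpu_m
    ]


print(schedulable(nodes, {INSTANCE_TYPE: "m5.large"}, 250))  # []
print(schedulable(nodes, {}, 250))                           # ['t3-a']
```

The selector is applied before capacity is even considered, so free space on non-matching nodes never helps the pending pod.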
Verification: Confirming the Fix
After you scale the cluster or reduce requests, monitor the pod events. You are looking for a successful Scheduled event:
$ kubectl get events --watch
...
Normal Scheduled 10s default-scheduler Successfully assigned default/my-app-7f45b6d-abcde to ip-10-0-1-52.ec2.internal
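One final warning: if your pod starts but is soon killed with an OOMKilled status, it is using more memory than you budgeted. Either the container hit its memory limit, or the node ran short of memory and the kernel killed a pod running above its request, which becomes more likely after you trim requests aggressively. Always monitor the pod's behavior for a few minutes after it finally transitions to the Running state.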

