Fix Kubernetes ImagePullBackOff: Causes and Solutions

The Error

You deploy a pod and it never reaches Running. Instead:

$ kubectl get pods
NAME                        READY   STATUS             RESTARTS   AGE
my-app-7d4b9c6f8-xk2p9     0/1     ImagePullBackOff   0          2m

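You might also see it flip between ErrImagePull and ImagePullBackOff; treat them the same. ErrImagePull is the failed pull attempt itself, and ImagePullBackOff means Kubernetes is waiting before the next retry. The delay doubles after each failure, starting at roughly 10 seconds and capping at 5 minutes (10s → 20s → 40s → ... → 5m).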

Diagnose First

Don't guess. Get the actual error message before touching anything:

kubectl describe pod <pod-name> -n <namespace>

Scroll to the Events section at the bottom. You'll find something specific — authentication required, manifest unknown, pull access denied, or 429 Too Many Requests.

That one line tells you exactly which fix applies.
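
If you triage this error often, the mapping from message to root cause can be sketched as a small shell function. This is a minimal sketch, not a complete classifier: classify_pull_error and its category names are hypothetical, and the substrings are only the ones quoted in this guide. Real messages vary by registry and container runtime.

```shell
# Hypothetical helper: map an Events message to a root cause covered in this guide.
# Anything that does not match falls through to "unknown", so read the message yourself.
classify_pull_error() {
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"429 too many requests"*|*toomanyrequests*)
      echo "rate-limit" ;;
    *"manifest unknown"*|*"not found"*)
      echo "wrong-image-or-tag" ;;
    *"pull access denied"*|*"authentication required"*)
      echo "missing-or-expired-credentials" ;;
    *"i/o timeout"*|*"context deadline exceeded"*)
      echo "network" ;;
    *)
      echo "unknown" ;;
  esac
}

# Possible usage (assumes the failure line contains "Failed to pull"):
# classify_pull_error "$(kubectl describe pod <pod-name> -n <namespace> | grep -i 'failed to pull')"
```

Note that some registries answer with pull access denied for repositories that simply don't exist, so a credentials result is worth double-checking against the image name.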

Root Cause 1: Wrong Image Name or Tag

Most common cause. Typo in the image name, nonexistent tag, or an image you forgot to push.

# Check what image your pod is actually trying to pull
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].image}'

# Try pulling it manually from a cluster node
# (on containerd-based nodes without Docker, use: crictl pull <image>)
docker pull nginx:lates   # deliberate typo: 'lates' instead of 'latest'

The Events section will show manifest unknown or not found. Fix the image name or tag in your deployment manifest:

kubectl set image deployment/my-app my-app=nginx:latest
# or edit the manifest directly
kubectl edit deployment/my-app

Verify the image exists in the registry before deploying:

# For Docker Hub
curl -s https://hub.docker.com/v2/repositories/<org>/<repo>/tags/ | jq '.results[].name'

# For ECR
aws ecr describe-images --repository-name <repo-name> --region <region>
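
To check one specific tag rather than list them all, you can build the Docker Hub tag URL from the image reference. The sketch below assumes a plain Docker Hub reference (no registry host, no digest). hub_tag_url is a hypothetical helper name. Official images such as nginx live under the library/ namespace.

```shell
# Hypothetical helper: turn a Docker Hub image reference into its tag lookup URL.
# Assumes references like "nginx", "nginx:1.25" or "myorg/app:v1.2".
hub_tag_url() {
  image=$1
  case "$image" in
    *:*) repo=${image%:*}; tag=${image##*:} ;;   # split "repo:tag"
    *)   repo=$image; tag=latest ;;              # no tag means "latest"
  esac
  case "$repo" in
    */*) ;;                                      # already "org/repo"
    *)   repo="library/$repo" ;;                 # official images
  esac
  printf 'https://hub.docker.com/v2/repositories/%s/tags/%s\n' "$repo" "$tag"
}

# Possible usage: a 200 status is expected for an existing tag.
# curl -s -o /dev/null -w '%{http_code}\n' "$(hub_tag_url nginx:1.25)"
```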

Root Cause 2: Private Registry — Missing imagePullSecret

Private registries (ECR, GCR, Docker Hub private repos, Harbor) require credentials. Without them, Kubernetes has no way to authenticate.

Look for pull access denied or authentication required in Events.

Create the pull secret

# For Docker Hub / generic registry
kubectl create secret docker-registry regcred \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=<your-username> \
  --docker-password=<your-password-or-token> \
  --docker-email=<your-email> \
  -n <namespace>

# For AWS ECR: pass a fresh login token as the password
# (kubectl has no --docker-password-stdin flag, so use command substitution)
kubectl create secret docker-registry ecr-secret \
  --docker-server=<account-id>.dkr.ecr.<region>.amazonaws.com \
  --docker-username=AWS \
  --docker-password="$(aws ecr get-login-password --region <region>)" \
  -n <namespace>

Reference the secret in your pod spec

spec:
  imagePullSecrets:
    - name: regcred
  containers:
    - name: my-app
      image: myprivateregistry.com/my-app:v1.2

Already have a running deployment? Patch it instead of editing the manifest. Changing the pod template triggers a rolling update, so new pods pick up the secret:

kubectl patch deployment my-app \
  -p '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"regcred"}]}}}}'

Attach the secret to the default service account

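This makes every pod that runs under the namespace's default service account use the secret automatically, with no need to add imagePullSecrets to each deployment. It only affects pods created after the patch, and pods that specify a different service account are not covered: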

kubectl patch serviceaccount default \
  -p '{"imagePullSecrets":[{"name":"regcred"}]}' \
  -n <namespace>

Root Cause 3: Expired or Invalid Credentials

ECR tokens expire every 12 hours. A pull secret that worked yesterday can silently stop working today — this catches people off guard.

Delete and recreate the secret:

kubectl delete secret ecr-secret -n <namespace>

# kubectl has no --docker-password-stdin flag, so pass the token via command substitution
kubectl create secret docker-registry ecr-secret \
  --docker-server=<account-id>.dkr.ecr.<region>.amazonaws.com \
  --docker-username=AWS \
  --docker-password="$(aws ecr get-login-password --region <region>)" \
  -n <namespace>

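For production workloads, don't manage this manually. Use a Kubernetes CronJob to refresh the token every 8–10 hours. Better yet, on EKS, skip static secrets entirely: give the node IAM role ECR read access (for example, the AmazonEC2ContainerRegistryReadOnly managed policy) and the kubelet fetches fresh ECR credentials on its own. Note that IRSA (IAM Roles for Service Accounts) covers AWS API calls made from inside pods; it does not authenticate image pulls, which the kubelet performs with the node's credentials.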
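
Whatever runs the refresh needs a rule for when a secret is due. Here is a minimal sketch of that check, assuming you compare the secret's creation time against the current time. needs_refresh and the 8-hour threshold are illustrative choices, taken from the low end of the 8–10 hour window mentioned above.

```shell
# Hypothetical helper: is an ECR pull secret old enough to recreate?
# Arguments: secret creation time and current time, both in Unix epoch seconds.
# ECR tokens last 12 hours; refreshing after 8 leaves a safety margin (assumption).
REFRESH_AFTER_SECONDS=$((8 * 60 * 60))

needs_refresh() {
  age=$(( $2 - $1 ))
  [ "$age" -ge "$REFRESH_AFTER_SECONDS" ]
}

# Possible usage (date -d is GNU date; adjust on macOS/BSD):
# created=$(kubectl get secret ecr-secret -n <namespace> -o jsonpath='{.metadata.creationTimestamp}')
# needs_refresh "$(date -d "$created" +%s)" "$(date +%s)" && echo "recreate the secret"
```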

Root Cause 4: Docker Hub Rate Limiting

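Docker Hub caps anonymous pulls at 100 per 6 hours, counted per source IP, and authenticated pulls on a free account at 200 per 6 hours, counted per account. Cloud cluster nodes often share a NAT gateway, so anonymous pulls from the whole cluster count against a single IP and a handful of deployments can exhaust the quota in minutes. Docker has changed these limits over time, so confirm the current numbers for your plan.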

Events will show: 429 Too Many Requests or toomanyrequests: You have reached your pull rate limit.

Quick fix — authenticate your pulls:

kubectl create secret docker-registry dockerhub-creds \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=<username> \
  --docker-password=<access-token> \
  -n <namespace>

Longer-term, mirror the images you need to your own registry (ECR, GCR, or Harbor) and pull from there. This eliminates the rate limit problem and speeds up pod startup.
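
To see how close a node is to the limit, Docker Hub reports the remaining quota in response headers. The sketch below only parses those headers. remaining_pulls is a hypothetical helper, and it assumes a header of the form ratelimit-remaining: 76;w=21600, where w is the window in seconds.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# number of pulls left in the current rate-limit window.
remaining_pulls() {
  tr -d '\r' |
    awk -F': *' 'tolower($1) == "ratelimit-remaining" { split($2, a, ";"); print a[1] }'
}

# Possible usage from a cluster node (requires curl and jq). The
# ratelimitpreview/test image is the one Docker documents for this check.
# token=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
# curl -s --head -H "Authorization: Bearer $token" \
#   https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest | remaining_pulls
```

Run it from the node itself, since anonymous quota is tied to the IP the registry sees.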

Root Cause 5: Network or Firewall Blocking the Registry

When the node can't reach the registry at all, Events show connection timeouts like dial tcp: i/o timeout or context deadline exceeded.

SSH into a node and test directly:

# A 401 Unauthorized response is fine here: it proves the registry is reachable
curl -v https://registry-1.docker.io/v2/
curl -v https://<account-id>.dkr.ecr.<region>.amazonaws.com/v2/

Three things to check: nodes in a private subnet with neither a NAT gateway nor VPC endpoints for ECR (pulls need the ecr.api and ecr.dkr interface endpoints plus an S3 gateway endpoint, because image layers are served from S3), a security group blocking outbound port 443, or an HTTP proxy configured for Docker but not for containerd. The fix depends on which one applies to your setup.
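
curl's exit code narrows down which of these you're facing. A rough sketch follows. explain_curl_exit and its labels are hypothetical, and the codes are the standard curl exit codes for DNS, connect, timeout, and TLS failures.

```shell
# Hypothetical helper: translate a curl exit code into a likely network cause.
explain_curl_exit() {
  case "$1" in
    0)     echo "reachable" ;;                 # any HTTP response, including 401
    6)     echo "dns-failure" ;;               # could not resolve host
    7)     echo "connection-failed" ;;         # refused or blocked (firewall, security group)
    28)    echo "timeout" ;;                   # no route, missing NAT or VPC endpoint
    35|60) echo "tls-problem" ;;               # handshake or certificate issue, often a proxy
    *)     echo "other" ;;
  esac
}

# Possible usage on a node:
# curl -s -o /dev/null --max-time 10 https://registry-1.docker.io/v2/; explain_curl_exit $?
```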

Verify the Fix

# Watch the pod status in real time
kubectl get pods -w -n <namespace>

# Healthy sequence: Pending → ContainerCreating → Running
# If it loops back to ImagePullBackOff, describe again
kubectl describe pod <pod-name> -n <namespace>

# Check recent events for confirmation
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20

Once you see Running with READY 1/1, the image pulled successfully.

Prevention

Verify image tags exist in the registry before deploying — pin to a specific digest or semver tag, never :latest in production
Use a registry co-located with your cluster (ECR for EKS, GCR for GKE) to avoid cross-region latency, rate limits, and firewall complexity
Automate short-lived token refresh with a Kubernetes CronJob or the External Secrets Operator; don't rely on manually recreating secrets
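On EKS, node IAM role permissions plus an ECR pull-through cache is the cleanest setup: no secrets to manage, and the kubelet obtains fresh ECR tokens automatically (IRSA covers pod-level AWS API access, not image pulls)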
When in doubt, run docker pull <image> from a cluster node before deploying — it's faster than waiting for Kubernetes to tell you the image doesn't exist
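
The first rule in this list is easy to check mechanically. Below is a minimal sketch of a "pinned image" check you could run in CI or against a live namespace. is_pinned is a hypothetical helper that treats a digest or any tag other than latest as pinned, and a missing tag as latest.

```shell
# Hypothetical helper: succeed if an image reference is pinned to a digest
# or to an explicit tag other than "latest".
is_pinned() {
  case "$1" in
    *@sha256:*) return 0 ;;     # digest reference
  esac
  name=${1##*/}                 # last path component; skips any registry host:port
  case "$name" in
    *:latest) return 1 ;;
    *:*)      return 0 ;;       # explicit tag
    *)        return 1 ;;       # no tag means latest implicitly
  esac
}

# Possible usage: flag unpinned images in a namespace.
# for img in $(kubectl get pods -n <namespace> -o jsonpath='{.items[*].spec.containers[*].image}'); do
#   is_pinned "$img" || echo "unpinned: $img"
# done
```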