Fix 'InsufficientInstanceCapacity' Error When Launching or Starting EC2 Instances

Intermediate | AWS | 2026-05-17 | AWS EC2, AWS CLI, Terraform, CloudFormation (any region/AZ combination)

Error Message

An error occurred (InsufficientInstanceCapacity) when calling the RunInstances operation: We currently do not have sufficient capacity in the Availability Zone you requested.
#aws #ec2 #capacity #devops

TL;DR

AWS ran out of physical hardware in the Availability Zone you targeted. Three things work immediately: try a different AZ, switch to a similar instance type, or move to a different region. For long-term guarantees (production workloads, compliance requirements), set up an On-Demand Capacity Reservation before you actually need it.

What triggers this error

Every RunInstances call, and every restart of a stopped instance, asks AWS to allocate real physical servers in a specific AZ. When that AZ's hardware is fully booked for your instance type, you get InsufficientInstanceCapacity. It's a supply problem, not an account problem, so raising your service quota does nothing here.
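
Because a capacity shortage and a quota limit call for different fixes, it can help to tell them apart in scripts. The sketch below is one possible way to do that by matching the error code in the CLI's error text. The function name classify_launch_error is hypothetical, and the two quota codes it checks (InstanceLimitExceeded, VcpuLimitExceeded) are not mentioned elsewhere in this note; they are the EC2 codes typically returned for account limits.

```shell
#!/bin/bash
# Classify the error text of a failed launch.
# Prints "capacity", "quota", or "other".
# classify_launch_error is a hypothetical helper name, not an AWS tool.
classify_launch_error() {
  local message="$1"
  case "$message" in
    *InsufficientInstanceCapacity*)
      echo "capacity" ;;   # AWS-side supply shortage in that AZ
    *InstanceLimitExceeded*|*VcpuLimitExceeded*)
      echo "quota" ;;      # account-level limit; a quota increase applies here
    *)
      echo "other" ;;
  esac
}

# Example (commented out so the file can be sourced safely):
# ERR=$(aws ec2 run-instances ... 2>&1 >/dev/null) || classify_launch_error "$ERR"
```

Only the "capacity" result calls for the fixes in this note; a "quota" result points to a Service Quotas request instead.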

This hits most often in these situations:

  • Launching accelerator-heavy instances like p4d.24xlarge, trn1, or inf2, which have thin global supply
  • Restarting a stopped instance: AWS released its hardware when you stopped it, and now that spot is taken
  • Auto Scaling groups configured to hammer a single AZ with a single instance type
  • Deploying into a small or lightly-provisioned AZ (e.g., us-east-1e in older accounts)

Fix 1: Try a different Availability Zone

Fastest fix, highest success rate. Capacity pools are per-AZ, so us-east-1b might have plenty while us-east-1a is slammed.

# Drop the AZ constraint and let AWS pick, or point to a subnet in a different AZ
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type m5.xlarge \
  --subnet-id subnet-0bb1234567890abcd \
  --count 1

Need to try several AZs automatically? Here's a quick shell loop:

#!/bin/bash
SUBNETS=("subnet-aaa" "subnet-bbb" "subnet-ccc")  # one subnet per AZ
for SUBNET in "${SUBNETS[@]}"; do
  echo "Trying subnet $SUBNET..."
  RESULT=$(aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --instance-type m5.xlarge \
    --subnet-id "$SUBNET" \
    --count 1 2>&1)
  if echo "$RESULT" | grep -q "InstanceId"; then
    echo "Success in $SUBNET"
    echo "$RESULT"
    break
  fi
  echo "Failed: $RESULT"
done
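
The loop above needs one subnet per AZ. If you don't have that list handy, a sketch like the following could build it from a VPC. The function name list_subnets_by_az and the VPC ID in the usage comment are placeholders, not values from this note.

```shell
#!/bin/bash
# Print "AZ<TAB>subnet-id" for every subnet in a VPC, sorted by AZ.
# list_subnets_by_az is a hypothetical helper name.
list_subnets_by_az() {
  local vpc_id="$1"
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'Subnets[].[AvailabilityZone,SubnetId]' \
    --output text | sort
}

# Example (vpc-0123456789abcdef0 is a placeholder; mapfile needs bash 4+):
# mapfile -t SUBNETS < <(list_subnets_by_az vpc-0123456789abcdef0 | cut -f2)
```

A VPC can hold several subnets in the same AZ, so you may want to trim the result to one subnet per AZ before feeding it to the loop.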

Fix 2: Switch to an equivalent instance type

Different instance families draw from different capacity pools, even when the hardware underneath is nearly identical. An m5.xlarge (Intel) and an m5a.xlarge (AMD) give you 4 vCPU and 16 GB RAM either way, but they compete for separate physical inventory.

# m5.xlarge failing? Drop in m6i.xlarge: same specs, separate pool
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type m6i.xlarge \
  --subnet-id subnet-0bb1234567890abcd \
  --count 1

Reliable fallback pairs to try:

  • m5 → m5a / m6i / m6a
  • c5 → c5a / c6i
  • r5 → r5a / r6i
  • p3 → p3dn / p4d if available, otherwise EC2 Capacity Blocks
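
The fallback pairs above can be encoded as a small lookup so a launch script can walk through them in order. This is a minimal sketch: fallback_types is a hypothetical helper, it assumes the same size name (for example xlarge) exists in the fallback family, and it leaves out the p3 row because p3dn and p4d come in different sizes.

```shell
#!/bin/bash
# Map an instance type to the fallback candidates from the list above.
# Prints one candidate per line; prints nothing for families not covered.
# fallback_types is a hypothetical helper name.
fallback_types() {
  local type="$1"
  local family="${type%%.*}"   # e.g. "m5" from "m5.xlarge"
  local size="${type#*.}"      # e.g. "xlarge" from "m5.xlarge"
  local candidates=""
  case "$family" in
    m5) candidates="m5a m6i m6a" ;;
    c5) candidates="c5a c6i" ;;
    r5) candidates="r5a r6i" ;;
  esac
  local f
  for f in $candidates; do
    echo "${f}.${size}"
  done
}

# Example: try each fallback in turn until one launches.
# for TYPE in $(fallback_types m5.xlarge); do
#   aws ec2 run-instances --instance-type "$TYPE" ... && break
# done
```

Check that your AMI and workload support the fallback architecture and generation before relying on the list in automation.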

Fix 3: Use a Spot Instance as a fallback

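Spot Instances run on spare EC2 capacity, so a Spot request for the exact type and AZ that just failed On-Demand may fail too. Spot pays off as a fallback when you can be flexible about instance type and AZ. It works well for batch jobs, CI runners, or any workload that can handle interruptions.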

aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type m5.xlarge \
  --subnet-id subnet-0bb1234567890abcd \
  --instance-market-options '{"MarketType":"spot","SpotOptions":{"SpotInstanceType":"one-time"}}' \
  --count 1

Fix 4: Pre-reserve capacity with On-Demand Capacity Reservations

Running production traffic or compliance-sensitive workloads? Don't gamble on capacity being available when you need it. Create a reservation in advance, before an incident forces your hand.

aws ec2 create-capacity-reservation \
  --instance-type m5.xlarge \
  --instance-platform Linux/UNIX \
  --availability-zone us-east-1a \
  --instance-count 5 \
  --instance-match-criteria open

The open match criteria means any On-Demand launch in that AZ with a matching type automatically uses your reservation, with no extra flags needed at launch time. One catch: you pay the On-Demand rate 24/7 whether or not instances are running in it. Reserve only what you'll actually use.
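
To see what the 24/7 billing means in practice, the arithmetic is instance count × hourly rate × hours in a month. The sketch below uses 730 hours (8760 / 12) as the monthly average. The function name reservation_monthly_cost is hypothetical, and the 0.192 USD/hour rate is an assumed figure for illustration only; look up the current price for your type and region.

```shell
#!/bin/bash
# Estimate the monthly cost of a reservation that is billed around the clock.
# Arguments: instance count, hourly On-Demand rate in USD.
# 730 is the average number of hours in a month (8760 / 12).
# reservation_monthly_cost is a hypothetical helper name.
reservation_monthly_cost() {
  local count="$1" hourly_rate="$2"
  awk -v n="$count" -v r="$hourly_rate" 'BEGIN { printf "%.2f\n", n * r * 730 }'
}

# Illustrative only: 0.192 USD/hour is an assumed rate, not a quoted price.
# reservation_monthly_cost 5 0.192   # prints 700.80
```

With those assumed numbers, a five-instance reservation costs about 700 USD a month even if nothing is running in it.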

To target a reservation explicitly:

aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type m5.xlarge \
  --subnet-id subnet-in-us-east-1a \
  --capacity-reservation-specification \
    'CapacityReservationTarget={CapacityReservationId=cr-0123456789abcdef0}' \
  --count 1

Fix 5: Enable multi-AZ and mixed instance types in Auto Scaling groups

Single-AZ, single-instance-type ASGs are a capacity disaster waiting to happen. A mixed instances policy spreads the risk across multiple pools at once:

aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --availability-zones us-east-1a us-east-1b us-east-1c \
  --mixed-instances-policy '{
    "LaunchTemplate": {
      "LaunchTemplateSpecification": {
        "LaunchTemplateId": "lt-0123456789abcdef0",
        "Version": "$Latest"
      },
      "Overrides": [
        {"InstanceType": "m5.xlarge"},
        {"InstanceType": "m5a.xlarge"},
        {"InstanceType": "m6i.xlarge"},
        {"InstanceType": "m6a.xlarge"}
      ]
    },
    "InstancesDistribution": {
      "OnDemandBaseCapacity": 1,
      "OnDemandPercentageAboveBaseCapacity": 100
    }
  }'
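
To check whether an ASG has been hitting capacity errors, you can look through its failed scaling activities. The following is a rough sketch: capacity_failures is a hypothetical helper, and filtering the status messages for the word "capacity" is an assumption about how the failure text is phrased, so treat an empty result as inconclusive.

```shell
#!/bin/bash
# Print failed scaling activities whose message mentions capacity.
# Output: "StartTime<TAB>StatusMessage", one activity per line.
# capacity_failures is a hypothetical helper name.
capacity_failures() {
  local asg_name="$1"
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "$asg_name" \
    --query "Activities[?StatusCode=='Failed'].[StartTime,StatusMessage]" \
    --output text | grep -i "capacity" || true   # no match is not an error
}

# Example:
# capacity_failures my-asg
```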

Fix 6: For stopped instances that won't start

Stopping an EC2 instance releases its underlying hardware back into the pool. When you try to start it again, AWS has to find new hardware, and sometimes there isn't any in that AZ.

Three options, in order of least to most disruptive:

  • Wait and retry: Capacity shifts constantly. Short-lived shortages in popular AZs like us-east-1a often clear within 15 to 60 minutes.
  • Change the instance type temporarily: Stop → modify instance type → start. A different type might have open capacity.
  • Create an AMI and relaunch in another AZ: More work, but lets you migrate to wherever capacity exists.

# Create an AMI from the stopped instance
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "my-instance-backup-$(date +%Y%m%d)" \
  --no-reboot

# Launch from that AMI in a different AZ
aws ec2 run-instances \
  --image-id ami- \
  --instance-type m5.xlarge \
  --subnet-id subnet-in-different-az \
  --count 1
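
The first two options (wait and retry, change the instance type) can be scripted as well. This is a minimal sketch under assumptions: the function names, the default of 6 attempts, and the 600-second delay are all illustrative choices rather than AWS recommendations.

```shell
#!/bin/bash
# Retry a start until it succeeds or attempts run out.
# Arguments: instance ID, max attempts (default 6), delay in seconds (default 600).
# start_with_retry is a hypothetical helper name.
start_with_retry() {
  local instance_id="$1" attempts="${2:-6}" delay="${3:-600}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    if aws ec2 start-instances --instance-ids "$instance_id" >/dev/null 2>&1; then
      echo "Started $instance_id on attempt $i"
      return 0
    fi
    echo "Attempt $i failed" >&2
    if (( i < attempts )); then sleep "$delay"; fi
  done
  return 1
}

# Switch a stopped instance to another type, then start it.
# change_type_and_start is a hypothetical helper name.
change_type_and_start() {
  local instance_id="$1" new_type="$2"
  aws ec2 modify-instance-attribute \
    --instance-id "$instance_id" \
    --instance-type "{\"Value\": \"${new_type}\"}" &&
  aws ec2 start-instances --instance-ids "$instance_id"
}

# Example: retry for about an hour, then fall back to another type.
# start_with_retry i-0123456789abcdef0 6 600 ||
#   change_type_and_start i-0123456789abcdef0 m5a.xlarge
```

The retry swallows the error text to keep output short, so it retries on any failure, not only on capacity errors. Tighten that if you need to stop early on permission or state problems.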

Verifying the fix

A successful launch returns an InstanceId. Check that the instance actually reached running state:

aws ec2 describe-instances \
  --instance-ids i-0123456789abcdef0 \
  --query 'Reservations[].Instances[].{ID:InstanceId,State:State.Name,AZ:Placement.AvailabilityZone}' \
  --output table

Expected output:

--------------------------------------------------
|             DescribeInstances                  |
+----------------------+----------+--------------+
|          AZ          |    ID    |    State     |
+----------------------+----------+--------------+
|  us-east-1b          | i-0abc.. |  running     |
+----------------------+----------+--------------+
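
An instance usually spends a short time in pending before it reaches running. If you'd rather not poll by hand, the CLI's built-in waiter can block until then. A small sketch, with wait_and_report as a hypothetical helper name:

```shell
#!/bin/bash
# Block until the instance is running, then print ID, state, and AZ.
# wait_and_report is a hypothetical helper name.
wait_and_report() {
  local instance_id="$1"
  if aws ec2 wait instance-running --instance-ids "$instance_id"; then
    aws ec2 describe-instances \
      --instance-ids "$instance_id" \
      --query 'Reservations[].Instances[].[InstanceId,State.Name,Placement.AvailabilityZone]' \
      --output text
  else
    echo "Instance $instance_id did not reach running state" >&2
    return 1
  fi
}

# Example:
# wait_and_report i-0123456789abcdef0
```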

Before your next launch, check which AZs offer the instance type you need. This lists where the type is sold, not whether capacity is free right now:

aws ec2 describe-instance-type-offerings \
  --location-type availability-zone \
  --filters Name=instance-type,Values=m5.xlarge \
  --query 'InstanceTypeOfferings[].Location' \
  --output table

Further reading

  • AWS docs: On-Demand Capacity Reservations (guaranteed capacity in specific AZs)
  • AWS docs: EC2 Capacity Blocks for ML (time-boxed GPU instance reservations)
  • AWS docs: Auto Scaling mixed instances policy (building resilient fleets)
