Fix 'InsufficientInstanceCapacity' Error When Launching or Starting EC2 Instances

TL;DR

AWS ran out of physical hardware for the instance type you requested in the Availability Zone you targeted. Three things usually work right away: try a different AZ, switch to a similar instance type, or move to a different region. For long-term guarantees (production workloads, compliance requirements), set up an On-Demand Capacity Reservation before you actually need it.

What triggers this error

Every RunInstances call — and every stopped-instance restart — asks AWS to allocate real physical servers in a specific AZ. When that AZ's hypervisor fleet is fully booked for your instance family, you get InsufficientInstanceCapacity. It's a supply problem, not an account problem. Raising your service quota does nothing here.
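
Because a capacity error and a quota error call for different responses, a script can branch on the error code in the CLI output. The following is a minimal sketch, assuming the error text was captured from the CLI's stderr. The helper name classify_launch_error is hypothetical, and the two quota codes it checks (InstanceLimitExceeded, VcpuLimitExceeded) are the ones EC2 typically returns for account limits; adjust the list if you see others.

```shell
#!/bin/bash
# classify_launch_error is a hypothetical helper, not part of the AWS CLI.
# It reads captured error text and prints one of: capacity, quota, other.
classify_launch_error() {
  local err="$1"
  case "$err" in
    *InsufficientInstanceCapacity*)
      echo "capacity" ;;   # the AZ has no free hardware for this type
    *InstanceLimitExceeded*|*VcpuLimitExceeded*)
      echo "quota" ;;      # an account limit; a quota increase can help
    *)
      echo "other" ;;
  esac
}

# Example (stderr is captured, stdout is discarded):
# ERR=$(aws ec2 run-instances ... 2>&1 >/dev/null) || classify_launch_error "$ERR"
```

Only the "capacity" result calls for the fixes described in this article.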

This hits most often in these situations:

Launching GPU and ML-accelerator instances like p4d.24xlarge, trn1, or inf2, which have thin supply everywhere
Restarting a stopped instance: AWS released its hardware when you stopped it, and that capacity has since gone to someone else
Auto Scaling groups configured to hammer a single AZ with a single instance type
Deploying into a small or lightly-provisioned AZ (e.g., us-east-1e in older accounts)

Fix 1: Try a different Availability Zone

Fastest fix, highest success rate. Capacity pools are per-AZ, so us-east-1b might have plenty while us-east-1a is slammed.

# Either omit the subnet/AZ so EC2 picks an AZ (default VPC only),
# or point at a subnet in a different AZ, as shown here
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type m5.xlarge \
  --subnet-id subnet-0bb1234567890abcd \
  --count 1

Need to try several AZs automatically? Here's a quick shell loop:

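#!/bin/bash
# Try one subnet per AZ until a launch succeeds
SUBNETS=("subnet-aaa" "subnet-bbb" "subnet-ccc")  # one subnet per AZ
for SUBNET in "${SUBNETS[@]}"; do
  echo "Trying subnet $SUBNET..."
  # Test the CLI's exit status; stderr is captured so the error code is visible
  if RESULT=$(aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --instance-type m5.xlarge \
    --subnet-id "$SUBNET" \
    --count 1 2>&1); then
    echo "Success in $SUBNET"
    echo "$RESULT"
    break
  fi
  echo "Failed: $RESULT"
  # Only a capacity error is worth retrying in another AZ
  if ! echo "$RESULT" | grep -q "InsufficientInstanceCapacity"; then
    echo "Not a capacity error; stopping." >&2
    exit 1
  fi
done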
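
The list of subnets for a loop like this does not have to be typed by hand. A minimal sketch, assuming your subnets live in a single VPC: the hypothetical helper pick_one_subnet_per_az takes "AZ subnet-id" pairs on stdin and keeps the first subnet it sees for each AZ. The VPC ID in the usage comment is a placeholder.

```shell
#!/bin/bash
# pick_one_subnet_per_az is a hypothetical helper. It reads lines of
# "<availability-zone> <subnet-id>" on stdin and prints the first
# subnet ID seen for each AZ, one per line.
pick_one_subnet_per_az() {
  awk 'NF >= 2 && !seen[$1]++ { print $2 }'
}

# Usage (vpc-0123456789abcdef0 is a placeholder VPC ID):
# aws ec2 describe-subnets \
#   --filters Name=vpc-id,Values=vpc-0123456789abcdef0 \
#   --query 'Subnets[].[AvailabilityZone,SubnetId]' \
#   --output text | pick_one_subnet_per_az
```

In bash, the result can be loaded into an array with mapfile -t SUBNETS < <(...) and then iterated as usual.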

Fix 2: Switch to an equivalent instance type

Different instance families draw from different capacity pools — even when the hardware underneath is nearly identical. An m5.xlarge (Intel) and an m5a.xlarge (AMD) give you 4 vCPU and 16 GB RAM either way, but they compete for separate physical inventory.

# m5.xlarge failing? Drop in m6i.xlarge — same specs, separate pool
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type m6i.xlarge \
  --subnet-id subnet-0bb1234567890abcd \
  --count 1

Reliable fallback pairs to try:

m5 → m5a / m6i / m6a
c5 → c5a / c6i
r5 → r5a / r6i
p3 → p3dn / p4d if available, otherwise EC2 Capacity Blocks
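
The general-purpose pairs above can be turned into a small lookup so a script can generate same-size candidates. This is a sketch only: fallback_types is a hypothetical helper, it covers just the m5, c5, and r5 rows, and it leaves out the p3 row because p3dn and p4d come in different sizes. Not every size exists in every family, so confirm a candidate is offered before relying on it.

```shell
#!/bin/bash
# fallback_types is a hypothetical helper that encodes the pairs listed above.
# Given a type such as m5.xlarge, it prints same-size candidates in other
# families, one per line. Unknown families print nothing.
fallback_types() {
  local type="$1"
  local family="${type%%.*}"   # text before the first dot, e.g. m5
  local size="${type#*.}"      # text after the first dot, e.g. xlarge
  local families=""
  local f
  case "$family" in
    m5) families="m5a m6i m6a" ;;
    c5) families="c5a c6i" ;;
    r5) families="r5a r6i" ;;
  esac
  for f in $families; do
    echo "$f.$size"
  done
}

# Example:
# fallback_types m5.xlarge
```

Each printed type can then be passed to --instance-type in turn until a launch succeeds.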

Fix 3: Use a Spot Instance as a fallback

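Spot Instances run on spare EC2 capacity, so treat this as a fallback rather than a guarantee: if an AZ has no On-Demand capacity for a type, spare capacity for that same type in that AZ is likely thin as well. Spot is most useful when you can also be flexible about instance type or AZ, and it suits batch jobs, CI runners, or any workload that can handle interruptions.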

aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type m5.xlarge \
  --subnet-id subnet-0bb1234567890abcd \
  --instance-market-options '{"MarketType":"spot","SpotOptions":{"SpotInstanceType":"one-time"}}' \
  --count 1
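
To use Spot strictly as a fallback, a wrapper can try On-Demand first and switch to Spot only when the failure is a capacity error. A minimal sketch, assuming the wrapper name launch_with_spot_fallback (hypothetical) and that all arguments are valid run-instances options:

```shell
#!/bin/bash
# launch_with_spot_fallback is a hypothetical wrapper. Its arguments are
# passed straight to `aws ec2 run-instances`. It tries On-Demand first and
# retries as Spot only if the failure was InsufficientInstanceCapacity.
launch_with_spot_fallback() {
  local out
  if out=$(aws ec2 run-instances "$@" 2>&1); then
    echo "$out"
    return 0
  fi
  case "$out" in
    *InsufficientInstanceCapacity*) ;;   # capacity error: fall through to Spot
    *)
      echo "$out" >&2                    # any other error: surface it and stop
      return 1 ;;
  esac
  echo "On-Demand capacity unavailable, retrying as Spot..." >&2
  aws ec2 run-instances "$@" \
    --instance-market-options '{"MarketType":"spot","SpotOptions":{"SpotInstanceType":"one-time"}}'
}

# Example:
# launch_with_spot_fallback --image-id ami-0abcdef1234567890 \
#   --instance-type m5.xlarge --subnet-id subnet-0bb1234567890abcd --count 1
```

Stopping on non-capacity errors keeps a typo or permissions problem from being retried as a Spot request.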

Fix 4: Pre-reserve capacity with On-Demand Capacity Reservations

Running production traffic or compliance-sensitive workloads? Don't gamble on capacity being available when you need it. Create a reservation in advance — before an incident forces your hand.

aws ec2 create-capacity-reservation \
  --instance-type m5.xlarge \
  --instance-platform Linux/UNIX \
  --availability-zone us-east-1a \
  --instance-count 5 \
  --instance-match-criteria open

The open match criteria means any On-Demand launch in that AZ with a matching type automatically uses your reservation — no extra flags needed at launch time. One catch: you pay the On-Demand rate 24/7 whether or not instances are running in it. Reserve only what you'll actually use.
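
To make the always-on billing concrete, the cost of holding a reservation can be estimated as instance count times hourly rate times hours in the month. The sketch below uses a hypothetical helper, reservation_monthly_cost, and a 730-hour month; the 0.192 USD/hour rate in the example is an assumed figure for illustration, so check current pricing for your type and region.

```shell
#!/bin/bash
# reservation_monthly_cost is a hypothetical helper.
# Usage: reservation_monthly_cost <instance-count> <hourly-rate-usd>
# Prints the cost of holding the reservation for a 730-hour month,
# which is owed whether or not any instance runs in it.
reservation_monthly_cost() {
  local count="$1" rate="$2"
  awk -v c="$count" -v r="$rate" 'BEGIN { printf "%.2f\n", c * r * 730 }'
}

# Example with an assumed rate of 0.192 USD/hour:
# reservation_monthly_cost 5 0.192    # prints 700.80
```

Running instances inside the reservation are not billed twice; the figure is the floor you pay even when the reservation sits empty.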

To target a reservation explicitly:

aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type m5.xlarge \
  --subnet-id subnet-in-us-east-1a \
  --capacity-reservation-specification \
    'CapacityReservationTarget={CapacityReservationId=cr-0123456789abcdef0}' \
  --count 1

Fix 5: Auto Scaling groups — enable multi-AZ and mixed instance types

Single-AZ, single-instance-type ASGs are a capacity disaster waiting to happen. A mixed instances policy spreads the risk across multiple pools at once:

aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --availability-zones us-east-1a us-east-1b us-east-1c \
  --mixed-instances-policy '{
    "LaunchTemplate": {
      "LaunchTemplateSpecification": {
        "LaunchTemplateId": "lt-0123456789abcdef0",
        "Version": "$Latest"
      },
      "Overrides": [
        {"InstanceType": "m5.xlarge"},
        {"InstanceType": "m5a.xlarge"},
        {"InstanceType": "m6i.xlarge"},
        {"InstanceType": "m6a.xlarge"}
      ]
    },
    "InstancesDistribution": {
      "OnDemandBaseCapacity": 1,
      "OnDemandPercentageAboveBaseCapacity": 100
    }
  }'
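
If the list of acceptable types changes often, the Overrides array can be generated instead of hand-edited. A minimal sketch, assuming a hypothetical helper named build_overrides that emits one object per instance type argument:

```shell
#!/bin/bash
# build_overrides is a hypothetical helper. It prints a JSON array suitable
# for the "Overrides" field, with one object per instance type argument.
build_overrides() {
  local json="" type
  for type in "$@"; do
    # Add a comma separator before every element except the first
    json="$json${json:+,}{\"InstanceType\":\"$type\"}"
  done
  echo "[$json]"
}

# Example:
# build_overrides m5.xlarge m5a.xlarge m6i.xlarge m6a.xlarge
```

The output can be spliced into the --mixed-instances-policy JSON in place of the literal array shown above.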

Fix 6: For stopped instances that won't start

Stopping an EC2 instance releases its underlying hardware back into the pool. When you try to start it again, AWS has to find new hardware — and sometimes there isn't any in that AZ.

Three options, in order of least to most disruptive:

Wait and retry: Capacity shifts constantly. Short-term shortages in popular AZs like us-east-1a often clear within 15–60 minutes.
Change the instance type temporarily: Stop → modify instance type → start. A different type might have open capacity.
Create an AMI and relaunch in another AZ: More work, but lets you migrate to wherever capacity exists.

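# Create an AMI from the stopped instance and capture its ID
AMI_ID=$(aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "my-instance-backup-$(date +%Y%m%d)" \
  --no-reboot \
  --query ImageId \
  --output text)

# Wait until the AMI is ready to launch from
aws ec2 wait image-available --image-ids "$AMI_ID"

# Launch from that AMI in a different AZ
aws ec2 run-instances \
  --image-id "$AMI_ID" \
  --instance-type m5.xlarge \
  --subnet-id subnet-in-different-az \
  --count 1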
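
The wait-and-retry option can be scripted so nobody has to watch the console. The following is a sketch under stated assumptions: retry_with_backoff is a hypothetical helper, not an AWS CLI feature, and the attempt count and delays in the example are arbitrary starting points.

```shell
#!/bin/bash
# retry_with_backoff is a hypothetical helper, not an AWS CLI feature.
# Usage: retry_with_backoff <max-attempts> <initial-delay-seconds> <command...>
# Runs the command until it succeeds, doubling the delay after each failure.
retry_with_backoff() {
  local max="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}

# Example: up to 5 attempts, waiting 5, 10, 20, then 40 minutes between them
# retry_with_backoff 5 300 aws ec2 start-instances --instance-ids i-0123456789abcdef0
```

The helper retries on any non-zero exit, so it is worth confirming by hand that the first failure really is InsufficientInstanceCapacity before leaving it running.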

Verifying the fix

A successful launch returns an InstanceId. Check that the instance actually reached running state:

aws ec2 describe-instances \
  --instance-ids i-0123456789abcdef0 \
  --query 'Reservations[].Instances[].{ID:InstanceId,State:State.Name,AZ:Placement.AvailabilityZone}' \
  --output table

Expected output:

--------------------------------------------------
|             DescribeInstances                  |
+----------------------+----------+--------------+
|          AZ          |    ID    |    State     |
+----------------------+----------+--------------+
|  us-east-1b          | i-0abc.. |  running     |
+----------------------+----------+--------------+
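
In a script, the same check can be folded into the launch itself. A minimal sketch, assuming a hypothetical wrapper named launch_and_wait: it captures the new instance ID with --query, then blocks on the CLI's instance-running waiter before printing the ID.

```shell
#!/bin/bash
# launch_and_wait is a hypothetical wrapper. Its arguments are passed to
# `aws ec2 run-instances`. It prints the new instance ID only after the
# instance has reached the running state.
launch_and_wait() {
  local id
  id=$(aws ec2 run-instances "$@" \
    --query 'Instances[0].InstanceId' \
    --output text) || return 1
  aws ec2 wait instance-running --instance-ids "$id" || return 1
  echo "$id"
}

# Example:
# launch_and_wait --image-id ami-0abcdef1234567890 \
#   --instance-type m5.xlarge --subnet-id subnet-0bb1234567890abcd --count 1
```

The wrapper assumes a single-instance launch; with a higher --count it would track only the first instance.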

Before your next launch, check which AZs actually carry the instance type you need. This lists where the type is offered, not whether capacity is free right now:

aws ec2 describe-instance-type-offerings \
  --location-type availability-zone \
  --filters Name=instance-type,Values=m5.xlarge \
  --query 'InstanceTypeOfferings[].Location' \
  --output table
