Fixing Boto3 ThrottlingException: Handle 'Rate exceeded' Errors in AWS Python Scripts

intermediate | ☁️ AWS | 2026-04-29 | Python 3.x, Boto3 SDK, AWS Lambda, or any environment running AWS API operations via Python.

Error Message

ThrottlingException: Rate exceeded
#throttling #boto3 #api #aws #python #devops

## The Error Message

You’re hammering the AWS API, and suddenly your script grinds to a halt. You see a wall of red text ending in this:

botocore.exceptions.ClientError: An error occurred (ThrottlingException) when calling the DescribeInstances operation: Rate exceeded

This error wears many masks. In DynamoDB, it is ProvisionedThroughputExceededException. In EC2, it is often RequestLimitExceeded. Whatever the name, the root issue is simple: you are moving faster than AWS allows for your account and region.
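
Because the error code differs between services, it can help to classify throttling in one place. The following is a minimal sketch: the helper name `is_throttling_error` and the set of codes are illustrative assumptions, and the list is not exhaustive. It relies only on the `response["Error"]["Code"]` field that botocore's `ClientError` carries.

```python
# Illustrative, non-exhaustive set of throttling error codes.
# Extend it for the services you actually call.
THROTTLING_CODES = {
    "Throttling",
    "ThrottlingException",
    "RequestLimitExceeded",
    "ProvisionedThroughputExceededException",
    "TooManyRequestsException",
    "SlowDown",
}


def is_throttling_error(exc):
    """Return True if exc looks like a botocore ClientError caused by throttling."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    code = response.get("Error", {}).get("Code", "")
    return code in THROTTLING_CODES
```

In a script, you would call this inside an `except ClientError as exc:` block to decide whether to back off and retry or to re-raise.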

## Why This Happens

AWS uses a "Token Bucket" algorithm to protect its infrastructure. Think of it like a coffee shop that can serve 5 customers per minute. If 20 people walk in at once, the first 5 get coffee immediately. The rest are turned away and have to come back once the barista catches up: AWS does not queue the extra requests, it rejects them with a throttling error.
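
To make the idea concrete, here is a toy token bucket. It is a sketch of the general algorithm only: the `TokenBucket` class, the capacity of 5, and the refill rate of 1 token per second are made-up values for illustration, not AWS's actual implementation or limits.

```python
class TokenBucket:
    """Toy token bucket: holds up to `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, capacity, rate):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.last = 0.0                # time of the last refill, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` gets a token."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a real service would answer "Rate exceeded" here


bucket = TokenBucket(capacity=5, rate=1.0)
burst = [bucket.allow(now=0.0) for _ in range(20)]
print(burst.count(True), "accepted,", burst.count(False), "throttled")
# prints: 5 accepted, 15 throttled
print(bucket.allow(now=1.0))  # prints: True (one token refilled after a second)
```

The burst of 20 requests drains the bucket after 5; everything else is rejected until tokens refill over time.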

You will likely hit these limits when:

  • Rapid Looping: Your script iterates through 500 S3 objects or SSM parameters without a single time.sleep().
  • Lambda Spikes: 100 Lambda functions trigger at once, all trying to grab the same secret from Secrets Manager simultaneously.
  • Thick CI/CD Pipelines: Tools like Terraform or Pulumi make hundreds of AWS API calls per second during a massive deployment, drawing on the same account-level limits as your Boto3 scripts.

## Step-by-Step Fixes

### 1. Use the Boto3 Adaptive Retry Strategy

Boto3’s default retry policy is often too timid for high-volume scripts. The easiest fix is to swap the default settings for a Config object using adaptive mode.

import boto3
from botocore.config import Config

# Enable 'adaptive' mode for smarter client-side rate limiting
my_config = Config(
    retries = {
        'max_attempts': 10,
        'mode': 'adaptive'
    }
)

# Apply the config to your client
ec2 = boto3.client('ec2', config=my_config)

response = ec2.describe_instances()

Why use Adaptive mode?

  • Standard: Makes up to 3 attempts in total by default (the initial call plus 2 retries) on transient and throttling errors, using exponential backoff.
  • Adaptive: Built for bulk operations. It includes everything in standard mode and adds client-side rate limiting: it observes throttling responses and slows down your outgoing requests to match the service's capacity. This reduces how often you hit the limit, though it cannot guarantee you never will, and the Boto3 documentation describes the mode as experimental.

### 2. Implement Custom Exponential Backoff

Sometimes you need surgical control over specific, high-risk functions. The tenacity library is a widely used choice for wrapping Python calls in retry logic.

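import boto3
from botocore.exceptions import ClientError
from tenacity import retry, retry_if_exception, stop_after_attempt, wait_exponential

s3 = boto3.client('s3')

def is_throttling(exc):
    # Retry only throttling errors, not failures such as NoSuchKey or AccessDenied
    codes = {'Throttling', 'ThrottlingException', 'RequestLimitExceeded', 'SlowDown'}
    return isinstance(exc, ClientError) and exc.response['Error']['Code'] in codes

@retry(retry=retry_if_exception(is_throttling),
       wait=wait_exponential(multiplier=1, min=2, max=10),
       stop=stop_after_attempt(5))
def get_s3_object_with_retry(bucket, key):
    return s3.get_object(Bucket=bucket, Key=key)

# When throttled, waits grow exponentially between attempts, clamped to 2-10s
result = get_s3_object_with_retry("my-bucket", "large-data.json")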

### 3. Optimize Your API Patterns

Before adding more retries, check if you can make fewer calls. Optimization is always cleaner than error handling.
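
As a rough sketch of making fewer calls, the block below shows two approaches: fetching SSM parameters in batches and keeping values in a short-lived in-memory cache. The names `chunked`, `get_parameters_batched`, and `TTLCache` are hypothetical helpers written for this example. The `ssm` argument is assumed to be a Boto3 SSM client, whose `get_parameters` call accepts at most 10 names per request.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def get_parameters_batched(ssm, names, batch_size=10):
    """Fetch many SSM parameters using one get_parameters call per batch.

    Names that SSM reports as invalid are silently skipped in this sketch.
    """
    values = {}
    for batch in chunked(list(names), batch_size):
        response = ssm.get_parameters(Names=batch, WithDecryption=True)
        for param in response["Parameters"]:
            values[param["Name"]] = param["Value"]
    return values


class TTLCache:
    """Tiny in-memory cache that re-fetches a value after `ttl` seconds."""

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self._fetch = fetch      # function called on a cache miss
        self._ttl = ttl
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no API call is made
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

With 25 parameter names, `get_parameters_batched` makes 3 API calls instead of 25. Wrapping a secret lookup in `TTLCache` means repeated reads within the TTL never reach AWS at all.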

  • Batching: Use plural APIs. Fetch 10 parameters at once with ssm.get_parameters instead of calling ssm.get_parameter inside a loop.
  • Server-Side Filtering: Do not list 1,000 EC2 instances just to find the two that are "running." Use the Filters parameter in your API call to let AWS do the heavy lifting.
  • Caching: If you are fetching a secret or config value, store it in memory for 60 seconds. Fetching the same secret 1,000 times a minute is a guaranteed way to get throttled.

### 4. Request a Service Quota Increase

If your code is efficient but you are still hitting walls, you might have outgrown the default limits. For example, SSM Parameter Store defaults to 40 transactions per second (TPS). If you need 100, you have to ask.

  • Open the Service Quotas console in the AWS Management Console.
  • Search for the service (e.g., "EC2") and the specific quota (e.g., "DescribeInstances rate").
  • Select the quota and click "Request quota increase."

## Verification

Do not guess if your fix is working. Every Boto3 response includes ResponseMetadata, which records how many retries the call needed before it succeeded.

import boto3
from botocore.config import Config

config = Config(retries={'max_attempts': 5})
ssm = boto3.client('ssm', config=config)

response = ssm.get_parameter(Name="MyConfig")

# Check the retry history
retries = response['ResponseMetadata'].get('RetryAttempts', 0)
print(f"Success! It took {retries} retries.")

If RetryAttempts is consistently high, such as 4 out of 5, your throughput is too high for your current backoff strategy.
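
One way to keep an eye on this over time is to log the retry count for every call. The sketch below assumes a hypothetical helper named `record_retries`; the logger name and the `warn_at` threshold of 3 are arbitrary choices for illustration. It reads only the `ResponseMetadata` and `RetryAttempts` keys used in the example above.

```python
import logging

logger = logging.getLogger("aws.retries")


def record_retries(response, operation, warn_at=3):
    """Log how many retries a Boto3 call needed and return that count.

    `warn_at` is an arbitrary threshold chosen for this sketch.
    """
    retries = response.get("ResponseMetadata", {}).get("RetryAttempts", 0)
    level = logging.WARNING if retries >= warn_at else logging.DEBUG
    logger.log(level, "%s succeeded after %d retries", operation, retries)
    return retries
```

For example, `record_retries(ssm.get_parameter(Name="MyConfig"), "ssm.GetParameter")` would emit a warning whenever the call needed 3 or more retries.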

## Final Tips

  • Environment Variables: You can enable adaptive retries globally without changing code. Set AWS_RETRY_MODE=adaptive and AWS_MAX_ATTEMPTS=10 in your Dockerfile or Lambda configuration.
  • CloudWatch Metrics: Monitor throttling metrics in CloudWatch. The metric name varies by service (for example, Throttles for Lambda and ThrottledRequests for DynamoDB). A spike here often precedes a full application outage.
  • Logging: Always log the number of retries. It is the "check engine light" for your AWS integrations.
