## The Error Message

You’re hammering the AWS API, and suddenly your script grinds to a halt. You see a wall of red text ending in this:
```
botocore.exceptions.ClientError: An error occurred (ThrottlingException) when calling the DescribeInstances operation: Rate exceeded
```
This error wears many masks. In DynamoDB, it is `ProvisionedThroughputExceededException`. In EC2, it is often `RequestLimitExceeded`. Whatever the name, the root issue is simple: you are sending requests faster than AWS allows for your account and region.
## Why This Happens

Many AWS services, EC2 among them, throttle API requests with a token bucket algorithm. Think of it like a coffee shop that can serve 5 customers per minute. If 20 people walk in at once, the first 5 get coffee immediately. The rest are turned away and have to come back once the barista catches up.
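To make the analogy concrete, here is a toy token bucket in plain Python. It is only a sketch of the general algorithm: the class name `TokenBucket`, its parameters, and the numbers are illustrative assumptions taken from the coffee shop example, not the actual limits or implementation of any AWS service.

```python
class TokenBucket:
    """Toy token bucket: holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.last_refill = 0.0         # simulated clock time of the last refill

    def allow(self, now):
        """Return True if a request arriving at time `now` (in seconds) is served."""
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # bucket is empty: this is the "Rate exceeded" case


# The coffee shop: 5 customers per minute, 20 arrive at the same instant.
shop = TokenBucket(capacity=5, refill_rate=5 / 60)
served = sum(shop.allow(now=0.0) for _ in range(20))
print(served)  # 5

# Fifteen seconds later a little over one token has been refilled.
print(shop.allow(now=15.0))  # True
print(shop.allow(now=15.0))  # False
```

The burst of 20 drains the bucket after 5 requests, and throughput afterwards is bounded by the refill rate, which is why spreading calls out works better than sending them all at once.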
You will likely hit these limits when:
- Rapid Looping: Your script iterates through 500 S3 objects or SSM parameters without a single `time.sleep()`.
- Lambda Spikes: 100 Lambda functions trigger at once, all trying to grab the same secret from Secrets Manager simultaneously.
- Thick CI/CD Pipelines: Tools like Terraform or Pulumi make hundreds of AWS API calls per second during a massive deployment.

## Step-by-Step Fixes

### 1. Use the Boto3 Adaptive Retry Strategy

Boto3’s default retry policy is often too timid for high-volume scripts. The easiest fix is to swap the default settings for a robust `Config` object using `adaptive` mode.
```python
import boto3
from botocore.config import Config

# Enable 'adaptive' mode for smarter client-side rate limiting
my_config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'adaptive'
    }
)

# Apply the config to your client
ec2 = boto3.client('ec2', config=my_config)
response = ec2.describe_instances()
```
Why use Adaptive mode?
- Standard: Makes up to 3 attempts in total by default, using exponential backoff on transient errors.
- Adaptive: Built for bulk operations. It does everything Standard does, and it also watches for throttling responses and slows your outgoing requests to match the service's capacity. That cuts down on throttling errors, though it cannot rule them out, and the Boto3 documentation has described this mode as experimental.

### 2. Implement Custom Exponential Backoff

Sometimes you need surgical control over specific, high-risk functions. The `tenacity` library is a popular choice for wrapping Python calls in retry logic.
```python
import boto3
from tenacity import retry, wait_exponential, stop_after_attempt

s3 = boto3.client('s3')

@retry(wait=wait_exponential(multiplier=1, min=2, max=10), stop=stop_after_attempt(5))
def get_s3_object_with_retry(bucket, key):
    return s3.get_object(Bucket=bucket, Key=key)

# Retries on any exception, up to 5 attempts in total, with waits that grow
# exponentially between 2s and 10s
result = get_s3_object_with_retry("my-bucket", "large-data.json")
```
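The decorator above retries on every exception, including ones that will never succeed on a second try, such as a missing key. As a rough, dependency-free sketch of the same idea, the helper below retries only when a caller-supplied check says the error is a throttle, and adds random jitter so parallel workers do not retry in lockstep. The names `call_with_backoff` and `looks_throttled` are hypothetical, and the set of error codes is a non-exhaustive assumption you would extend for the services you call.

```python
import random
import time

# Non-exhaustive set of throttling error codes (an assumption; extend as needed).
THROTTLING_CODES = {
    "Throttling",
    "ThrottlingException",
    "TooManyRequestsException",
    "ProvisionedThroughputExceededException",  # DynamoDB
    "RequestLimitExceeded",                    # EC2
    "SlowDown",                                # S3
}


def looks_throttled(exc):
    """Return True if exc carries a throttling code the way botocore's ClientError does."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") in THROTTLING_CODES


def call_with_backoff(func, is_retryable, max_attempts=5, base=1.0, cap=10.0, sleep=time.sleep):
    """Call func(); on a retryable error, wait with exponential backoff plus jitter and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Re-raise non-retryable errors, and the last error once attempts run out.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # "Full jitter": a random delay up to the capped exponential value.
            delay = random.uniform(0, min(cap, base * 2 ** (attempt - 1)))
            sleep(delay)
```

With the `s3` client from the earlier example, a call might look like `call_with_backoff(lambda: s3.get_object(Bucket="my-bucket", Key="large-data.json"), looks_throttled)`. The `sleep` parameter exists only so the delay can be swapped out in tests.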
### 3. Optimize Your API Patterns

Before adding more retries, check if you can make fewer calls. Optimization is always cleaner than error handling.
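Two of the patterns in this section, batching and caching, can be sketched in a few lines. This is a minimal illustration, not a drop-in utility: `get_parameters_batched` and `TTLCache` are hypothetical names, `ssm` is assumed to be a Boto3 SSM client, and the batch size of 10 reflects the maximum number of names `get_parameters` accepts per call.

```python
import time


def get_parameters_batched(ssm, names, batch_size=10):
    """Fetch many SSM parameters with one get_parameters call per batch of names."""
    values = {}
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        response = ssm.get_parameters(Names=batch, WithDecryption=True)
        for param in response["Parameters"]:
            values[param["Name"]] = param["Value"]
    return values


class TTLCache:
    """Keep fetched values in memory for `ttl` seconds to avoid repeat API calls."""

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self._fetch = fetch    # callable taking a key and returning its value
        self._ttl = ttl
        self._clock = clock    # injectable so the cache can be tested without waiting
        self._entries = {}     # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]    # still fresh: no API call
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

Fetching 25 parameters this way costs 3 API calls instead of 25, and wrapping a secret lookup in `TTLCache` caps it at roughly one call per key per minute, however often the value is read.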
- Batching: Use plural APIs. Fetch 10 parameters at once with `ssm.get_parameters` instead of calling `ssm.get_parameter` inside a loop.
- Server-Side Filtering: Do not list 1,000 EC2 instances just to find the two that are "running." Use the `Filters` parameter in your API call to let AWS do the heavy lifting.
- Caching: If you are fetching a secret or config value, store it in memory for 60 seconds. Fetching the same secret 1,000 times a minute is a guaranteed way to get throttled.

### 4. Request a Service Quota Increase

If your code is efficient but you are still hitting walls, you might have outgrown the default limits. For example, SSM Parameter Store defaults to 40 transactions per second (TPS). If you need 100, you have to ask.

- Open the Service Quotas console in the AWS Management Console.
- Search for the service (e.g., "EC2") and the specific quota (e.g., "DescribeInstances rate").
- Select the quota and click "Request quota increase."

## Verification

Do not guess if your fix is working. Every Boto3 response includes `ResponseMetadata` that tells you exactly what happened behind the scenes.
```python
import boto3
from botocore.config import Config

config = Config(retries={'max_attempts': 5})
ssm = boto3.client('ssm', config=config)
response = ssm.get_parameter(Name="MyConfig")

# Check the retry history
retries = response['ResponseMetadata'].get('RetryAttempts', 0)
print(f"Success! It took {retries} retries.")
```
If `RetryAttempts` is consistently high, such as 4 out of 5, you are sending requests faster than your current backoff strategy can absorb. Lower the request rate or cut the number of calls.
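A single response says little about whether retries are "consistently" high, so it can help to look at a batch of calls together. The helper below is one possible sketch. `retry_summary` is a hypothetical name, and it assumes you have collected the response dictionaries from your own calls into a list.

```python
def retry_summary(responses):
    """Summarize the RetryAttempts values from a list of Boto3 response dicts."""
    counts = [r.get("ResponseMetadata", {}).get("RetryAttempts", 0) for r in responses]
    if not counts:
        return {"calls": 0, "retried": 0, "average": 0.0, "max": 0}
    return {
        "calls": len(counts),
        "retried": sum(1 for c in counts if c > 0),  # calls that needed at least one retry
        "average": sum(counts) / len(counts),
        "max": max(counts),
    }


sample = [
    {"ResponseMetadata": {"RetryAttempts": 0}},
    {"ResponseMetadata": {"RetryAttempts": 4}},
]
print(retry_summary(sample))
# {'calls': 2, 'retried': 1, 'average': 2.0, 'max': 4}
```

A rising `average` or a `max` that sits at your configured limit is a signal to revisit the batching, caching, or quota steps above rather than to add more retries.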

