The Error
Mid-API call, your app dies with this:
openai.BadRequestError: Error code: 400 - {'error': {'message': 'The response was filtered due to the prompt triggering Azure OpenAI\'s content management policy. Please modify your prompt and retry.', 'type': 'content_filter', 'param': 'prompt', 'code': 'content_filter'}}
On the standard OpenAI API it's terser:
openai.BadRequestError: The response was filtered
Either way, the content moderation layer intercepted your request. It blocked either the input prompt or the generated output before it ever reached your code. This is a 400, not a 500. The server didn't fail; it deliberately refused.
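To make the payload concrete, here is a minimal sketch that reads the parsed error body and reports which part of the request was filtered. The function name filtered_param is hypothetical, and it assumes the body has the shape shown in the Azure message above (a code or type of content_filter, plus a param field).

```python
from typing import Any, Mapping, Optional


def filtered_param(body: Mapping[str, Any]) -> Optional[str]:
    """Return which request part was filtered, or None for a non-filter error.

    Accepts the full payload ({'error': {...}}) or just the inner error dict.
    """
    error = body.get("error", body)
    if not isinstance(error, Mapping):
        return None
    if "content_filter" not in (error.get("code"), error.get("type")):
        return None  # some other 400, such as a malformed request
    # 'param' names the part of the request that tripped the filter.
    return error.get("param") or "unknown"


payload = {
    "error": {
        "message": "The response was filtered ...",
        "type": "content_filter",
        "param": "prompt",
        "code": "content_filter",
    }
}
print(filtered_param(payload))  # prints: prompt
```

A return value of "prompt" points at your input; anything else is a hint to look at the output side.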
Why This Happens
Every prompt and completion passes through OpenAI's moderation pipeline. The system scores content across categories like hate, self-harm, sexual content, and violence. Scores run from 0.0 to 1.0; cross the threshold in any single category and you get this error instead of a completion.
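The "any single category" rule can be sketched like this. The threshold values are made up for illustration (the real ones aren't published), and tripped_categories is a hypothetical helper, not part of any SDK.

```python
from typing import Dict, List, Mapping

# Hypothetical thresholds for illustration only; real values are not public.
ASSUMED_THRESHOLDS: Dict[str, float] = {
    "hate": 0.5,
    "self-harm": 0.5,
    "sexual": 0.5,
    "violence": 0.5,
}


def tripped_categories(
    scores: Mapping[str, float],
    thresholds: Mapping[str, float] = ASSUMED_THRESHOLDS,
) -> List[str]:
    """Return every category whose score meets or exceeds its threshold."""
    return sorted(
        name
        for name, score in scores.items()
        if name in thresholds and score >= thresholds[name]
    )


scores = {"hate": 0.02, "sexual": 0.10, "violence": 0.81}
print(tripped_categories(scores))  # prints: ['violence']
```

One category over the line is enough; the low scores elsewhere don't offset it.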
Common triggers:
- Sensitive keywords in technical or academic context (security research, medical discussions)
- Raw user input dropped into the prompt without any sanitization
- Creative writing that involves violence, abuse, or adult themes
- Medical or legal scenarios describing harm-related situations
- Security prompts that ask about CVEs, exploit patterns, or vulnerability details
- The model's own output being filtered: your prompt was clean, but the completion wasn't
That last one catches people off guard. You can't always predict what the model will generate, so even a safe-looking prompt can produce filtered output.
Step 1: Identify What Got Filtered
Before rewriting anything, confirm whether the problem is your input or the model's output. Run your prompt through the Moderation API directly:
import openai
client = openai.OpenAI()
response = client.moderations.create(
    input="Your prompt text here"
)
result = response.results[0]
print("Flagged:", result.flagged)
print("Categories:", result.categories)
print("Scores:", result.category_scores)
If result.flagged is True, your input is the problem. The result.categories object tells you exactly which categories triggered it, for example violence: True or sexual/minors: True. The scores show how close the other categories are to the threshold, which is useful when a prompt fails intermittently.
If flagged is False but you still get the error, the output is being filtered. Move on to checking finish_reason (Step 4).
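For intermittent failures it helps to rank the scores instead of eyeballing the full printout. A small sketch, assuming you've already turned the scores into a plain dict (with the v1 SDK, result.category_scores.model_dump() should do that, though the key names may use underscores rather than the API's hyphens). The helper name top_scores is hypothetical.

```python
from typing import List, Mapping, Tuple


def top_scores(scores: Mapping[str, float], limit: int = 3) -> List[Tuple[str, float]]:
    """Return the highest-scoring categories, largest first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


example = {"hate": 0.01, "violence": 0.42, "sexual": 0.20}
for name, score in top_scores(example, limit=2):
    print(f"{name}: {score:.2f}")
# prints:
# violence: 0.42
# sexual: 0.20
```

A category that keeps showing up near the top across retries is the one to rephrase around.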
Step 2: Rewrite the Prompt
Rephrasing is usually enough. A few patterns that work:
- Swap emotionally charged words for clinical equivalents. "How to hurt X" → "What are the risks associated with X".
- Add framing that makes intent explicit: "for a security audit", "in a fictional context", "for a medical training dataset".
- Split long prompts. A single flagged sentence buried in a 500-word prompt will still kill the whole request.
- Sanitize user input before it hits the API. Don't trust what users send you.
# Unsafe: user input goes straight into the prompt
prompt = f"User asked: {user_input}\nAnswer:"
# Safe: check first, then build the prompt
moderation_check = client.moderations.create(input=user_input)
if moderation_check.results[0].flagged:
    raise ValueError("User input contains flagged content.")
prompt = f"User asked: {user_input}\nAnswer:"
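When a long prompt is flagged, splitting it sentence by sentence is a quick way to find the part that needs rewording. The sketch below is one possible approach: flagged_sentences is a hypothetical helper, the sentence splitter is deliberately naive, and is_flagged is whatever check you plug in (in practice, a thin wrapper around client.moderations.create).

```python
import re
from typing import Callable, List


def flagged_sentences(prompt: str, is_flagged: Callable[[str], bool]) -> List[str]:
    """Return the sentences in `prompt` that the supplied check flags."""
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", prompt)
    sentences = [part.strip() for part in parts if part.strip()]
    return [sentence for sentence in sentences if is_flagged(sentence)]


# Stand-in check for the demo; swap in a real moderation call in your app.
def demo_check(text: str) -> bool:
    return "attack" in text.lower()


long_prompt = "Summarize this report. Describe the attack in detail. Keep it short."
print(flagged_sentences(long_prompt, demo_check))
# prints: ['Describe the attack in detail.']
```

Keep in mind that a sentence can pass on its own and still be flagged in context, so treat the result as a starting point.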
Step 3: Handle the Error Gracefully in Code
Don't let a content filter crash your whole app. Catch BadRequestError and give users a meaningful message:
import openai
client = openai.OpenAI()
try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": user_message}
        ]
    )
    answer = response.choices[0].message.content
except openai.BadRequestError as e:
    if "content_filter" in str(e) or "response was filtered" in str(e).lower():
        answer = "Sorry, I can't respond to that request due to content policy."
    else:
        raise  # Different BadRequestError: don't swallow it
The else: raise matters. Not every BadRequestError is a content filter: malformed messages, unsupported parameters, and token limit overflows raise the same exception type.
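String matching on the message works, but it's brittle if the wording changes. A slightly sturdier sketch checks structured fields first and only falls back to the message text. It assumes the exception exposes code and body attributes, which the v1 openai SDK's API errors appear to do; is_content_filter_error is a hypothetical helper name.

```python
from typing import Any


def is_content_filter_error(exc: Exception) -> bool:
    """Best-effort check for a content-filter rejection."""
    # Preferred: the structured error code, when the exception carries one.
    if getattr(exc, "code", None) == "content_filter":
        return True
    # Next: the parsed body, which may or may not be wrapped in {'error': ...}.
    body: Any = getattr(exc, "body", None)
    if isinstance(body, dict):
        inner = body.get("error", body)
        if isinstance(inner, dict) and inner.get("code") == "content_filter":
            return True
    # Last resort: the same message matching used in the try/except above.
    text = str(exc).lower()
    return "content_filter" in text or "response was filtered" in text


class FakeApiError(Exception):
    """Stand-in for an SDK error so the demo runs without a network call."""

    def __init__(self, message: str, code: Any = None, body: Any = None) -> None:
        super().__init__(message)
        self.code = code
        self.body = body


print(is_content_filter_error(FakeApiError("blocked", code="content_filter")))  # prints: True
print(is_content_filter_error(FakeApiError("too long", code="context_length_exceeded")))  # prints: False
```

Inside the except block you'd call is_content_filter_error(e) and re-raise when it returns False.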
Step 4: Check finish_reason on Successful Responses
There's a subtler variant: the API returns HTTP 200, but the output was filtered mid-generation. In this case finish_reason is content_filter instead of stop, and message.content may be None or truncated.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)
choice = response.choices[0]
if choice.finish_reason == "content_filter":
    print("Output was filtered: content may be incomplete or None")
    print("Content:", choice.message.content)  # May be None or truncated
else:
    print(choice.message.content)
Always check finish_reason in production. If message.content is None and you pass it along unchecked, the first string operation on it raises an AttributeError or TypeError deeper in your code, which is harder to trace than the original filter error.
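One way to make that check hard to forget is to route every response through a small helper. This is only a sketch: safe_content is a hypothetical name, and it relies on nothing beyond the finish_reason and message.content attributes used above.

```python
from types import SimpleNamespace
from typing import Any, Tuple


def safe_content(choice: Any, fallback: str = "") -> Tuple[str, bool]:
    """Return (text, was_filtered) without ever handing back None."""
    was_filtered = getattr(choice, "finish_reason", None) == "content_filter"
    message = getattr(choice, "message", None)
    content = getattr(message, "content", None)
    text = content if content is not None else fallback
    return text, was_filtered


# Stand-in objects shaped like response.choices[0], so the demo runs offline.
filtered = SimpleNamespace(finish_reason="content_filter", message=SimpleNamespace(content=None))
normal = SimpleNamespace(finish_reason="stop", message=SimpleNamespace(content="Hello"))

print(safe_content(filtered))  # prints: ('', True)
print(safe_content(normal))    # prints: ('Hello', False)
```

Callers always get a string back, and the boolean tells them whether to show a policy message instead.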
Step 5: Azure OpenAI Specific Fix
Azure OpenAI gives you more control. Content filter strictness is configurable per deployment; you can request adjusted thresholds for specific categories through the Azure portal.
Navigate to Azure OpenAI → Your Resource → Content Filters and create a custom filter profile. For example, a game moderation service might need looser violence filtering. Microsoft approves these on a case-by-case basis.
One more thing specific to Azure: a wrong deployment name can surface errors that look like content filter issues. Double-check it:
client = openai.AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_key="your-api-key",
    api_version="2024-02-01"
)
response = client.chat.completions.create(
    model="your-deployment-name",  # Must match your Azure deployment exactly
    messages=[{"role": "user", "content": prompt}]
)
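Azure's error body usually carries a per-category breakdown, which helps when deciding which thresholds to ask for. The sketch below assumes the body includes an innererror.content_filter_result object whose entries have filtered and severity fields. That's the shape I'd expect from Azure's content filtering documentation, but check it against a real response from your own deployment. The helper name filtered_categories is hypothetical.

```python
from typing import Any, Dict, Mapping


def filtered_categories(body: Mapping[str, Any]) -> Dict[str, str]:
    """Map each filtered category to its reported severity."""
    error = body.get("error", body)
    if not isinstance(error, Mapping):
        return {}
    inner = error.get("innererror")
    results = inner.get("content_filter_result") if isinstance(inner, Mapping) else None
    if not isinstance(results, Mapping):
        return {}
    return {
        name: str(detail.get("severity", "unknown"))
        for name, detail in results.items()
        if isinstance(detail, Mapping) and detail.get("filtered")
    }


# Assumed example body, trimmed to the fields this helper reads.
sample_body = {
    "error": {
        "code": "content_filter",
        "param": "prompt",
        "innererror": {
            "code": "ResponsibleAIPolicyViolation",
            "content_filter_result": {
                "hate": {"filtered": False, "severity": "safe"},
                "violence": {"filtered": True, "severity": "medium"},
            },
        },
    }
}
print(filtered_categories(sample_body))  # prints: {'violence': 'medium'}
```

If the body has no breakdown, the helper returns an empty dict instead of raising.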
Verify the Fix
Two checks before calling it done:
import openai
client = openai.OpenAI()
# 1. Confirm the prompt is clean
mod = client.moderations.create(input=your_new_prompt)
print("Still flagged:", mod.results[0].flagged)  # Should be False
# 2. Run the actual completion
try:
    res = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": your_new_prompt}]
    )
    print("finish_reason:", res.choices[0].finish_reason)  # Should be 'stop'
    print("Response:", res.choices[0].message.content)
except openai.BadRequestError as e:
    print("Still getting error:", e)
flagged: False and finish_reason: stop. That's the clean state you want.
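If you rewrote several prompts, the same two checks can run over the whole batch. A rough sketch, with the API calls injected as plain functions so the logic is testable offline. The names remaining_failures, is_flagged and finish_reason_for are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def remaining_failures(
    prompts: Iterable[str],
    is_flagged: Callable[[str], bool],
    finish_reason_for: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (prompt, problem) pairs for prompts that are not yet clean."""
    failures: List[Tuple[str, str]] = []
    for prompt in prompts:
        if is_flagged(prompt):
            failures.append((prompt, "input flagged"))
            continue  # skip the completion for a prompt that is already flagged
        reason = finish_reason_for(prompt)
        if reason != "stop":  # also reports 'length' and other non-stop reasons
            failures.append((prompt, f"finish_reason={reason}"))
    return failures


# Stand-ins for the moderation and completion calls, for demonstration only.
def demo_flagged(prompt: str) -> bool:
    return "bad" in prompt


def demo_finish_reason(prompt: str) -> str:
    return "content_filter" if "edge" in prompt else "stop"


print(remaining_failures(["ok", "bad one", "edge case"], demo_flagged, demo_finish_reason))
# prints: [('bad one', 'input flagged'), ('edge case', 'finish_reason=content_filter')]
```

An empty list means every prompt passed both checks.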
Tips
- Never pass raw user input directly into prompts. Always pre-screen with the Moderation API first, particularly in apps that handle user-generated content.
- The Moderation API is free to call. Use it as a gate before every completion in high-risk workflows; the latency cost is negligible compared to a failed request.
- System prompts can trigger filtering too. A system message that instructs the model to role-play as a harmful character will get caught just like user messages.
- gpt-3.5-turbo and gpt-4o don't have identical sensitivity. A prompt that fails on one model sometimes works on the other, so it's worth testing if you have flexibility.
- Legitimately sensitive use cases (medical education, security research, legal analysis) can apply for a usage policy exception through OpenAI's support portal. Document your use case clearly when submitting.

