Fix OpenAI BadRequestError: The response was filtered by Content Moderation

intermediate | AI Tools | 2026-05-17 | Python 3.8+, openai SDK >=1.0.0, any OS (Linux/macOS/Windows)

Error Message

BadRequestError: The response was filtered
#openai #moderation #content-filter

The Error

Mid-API call, your app dies with something like this (the Azure OpenAI variant):

openai.BadRequestError: Error code: 400 - {'error': {'message': 'The response was filtered due to the prompt triggering Azure OpenAI\'s content management policy. Please modify your prompt and retry.', 'type': 'content_filter', 'param': 'prompt', 'code': 'content_filter'}}

On the standard OpenAI API it's terser:

openai.BadRequestError: The response was filtered

Either way, the content moderation layer intercepted your request. It blocked either the input prompt or the generated output before any completion reached your code. This is a 400, not a 500: the server didn't fail, it deliberately refused.

Why This Happens

Every prompt and completion passes through OpenAI's moderation pipeline. The system scores content across categories like hate, self-harm, sexual content, and violence. Scores run from 0.0 to 1.0. Cross the threshold in any single category, and you get this error instead of a completion.
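The per-category comparison described above can be pictured with a small sketch. It is illustrative only: the `exceeded_categories` helper and the threshold values are hypothetical, since the actual cutoffs are not published.

```python
from typing import Dict, List

# Hypothetical thresholds for illustration; the real cutoffs are not public.
EXAMPLE_THRESHOLDS: Dict[str, float] = {
    "hate": 0.5,
    "self-harm": 0.5,
    "sexual": 0.5,
    "violence": 0.5,
}


def exceeded_categories(scores: Dict[str, float],
                        thresholds: Dict[str, float] = EXAMPLE_THRESHOLDS) -> List[str]:
    """Return the categories whose score is at or above its threshold."""
    return sorted(
        name for name, score in scores.items()
        if name in thresholds and score >= thresholds[name]
    )


sample_scores = {"hate": 0.02, "self-harm": 0.01, "sexual": 0.03, "violence": 0.81}
print(exceeded_categories(sample_scores))  # ['violence']
```

A single category over its threshold is enough to block the request, even when every other score is close to zero.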

Common triggers:

  • Sensitive keywords in technical or academic context (security research, medical discussions)
  • Raw user input dropped into the prompt without any sanitization
  • Creative writing that involves violence, abuse, or adult themes
  • Medical or legal scenarios describing harm-related situations
  • Security prompts: asking about CVEs, exploit patterns, or vulnerability details
  • The model's own output being filtered: your prompt was clean, but the completion wasn't

That last one catches people off guard. You can't always predict what the model will generate, so even a safe-looking prompt can produce filtered output.

Step 1: Identify What Got Filtered

Before rewriting anything, confirm whether the problem is your input or the model's output. Run your prompt through the Moderation API directly:

import openai

client = openai.OpenAI()

response = client.moderations.create(
    input="Your prompt text here"
)

result = response.results[0]
print("Flagged:", result.flagged)
print("Categories:", result.categories)
print("Scores:", result.category_scores)

If result.flagged is True, your input is the problem. The result.categories object tells you which categories triggered it, something like violence: True or sexual/minors: True. The scores show how close the other categories are to the threshold, which is useful when a prompt fails intermittently.
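To make intermittent failures easier to spot, the flagged categories and the "near misses" can be separated. The sketch below works on plain dicts; `summarize_moderation` and the `near` cutoff of 0.3 are hypothetical, and converting the SDK objects to dicts (for example with `result.categories.model_dump()`) is an assumption about the SDK's response models rather than something shown in this article.

```python
from typing import Dict, List, Tuple


def summarize_moderation(categories: Dict[str, bool],
                         scores: Dict[str, float],
                         near: float = 0.3) -> Tuple[List[str], List[Tuple[str, float]]]:
    """Split a moderation result into flagged categories and near misses.

    `near` is an arbitrary illustrative cutoff, not an official value.
    """
    flagged = sorted(name for name, hit in categories.items() if hit)
    near_misses = sorted(
        ((name, score) for name, score in scores.items()
         if score >= near and not categories.get(name, False)),
        key=lambda item: item[1],
        reverse=True,
    )
    return flagged, near_misses


# Sample data shaped like a moderation result converted to plain dicts.
categories = {"violence": True, "harassment": False, "sexual": False}
scores = {"violence": 0.91, "harassment": 0.42, "sexual": 0.01}

flagged, near_misses = summarize_moderation(categories, scores)
print("Flagged:", flagged)          # Flagged: ['violence']
print("Near misses:", near_misses)  # Near misses: [('harassment', 0.42)]
```

A category that keeps showing up in the near-miss list is a good candidate for rephrasing even if it has not blocked a request yet.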

If flagged is False but you still get the error, the output is being filtered. Move on to checking finish_reason (Step 4).

Step 2: Rewrite the Prompt

Rephrasing is usually enough. A few patterns that work:

  • Swap emotionally charged words for clinical equivalents. "How to hurt X" → "What are the risks associated with X".
  • Add framing that makes intent explicit: "for a security audit", "in a fictional context", "for a medical training dataset".
  • Split long prompts. A single flagged sentence buried in a 500-word prompt will still kill the whole request.
  • Sanitize user input before it hits the API. Don't trust what users send you.

# Unsafe: user input goes straight into the prompt
prompt = f"User asked: {user_input}\nAnswer:"

# Safe: check first, then build the prompt
moderation_check = client.moderations.create(input=user_input)
if moderation_check.results[0].flagged:
    raise ValueError("User input contains flagged content.")

prompt = f"User asked: {user_input}\nAnswer:"
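When a long prompt is flagged, checking it piece by piece can locate the offending sentence. The following is a minimal sketch under stated assumptions: `find_flagged_sentences` is a hypothetical helper, the sentence splitter is deliberately naive, and the checker is injected so the example runs without network access. In practice the checker would wrap the `client.moderations.create` call shown above.

```python
import re
from typing import Callable, List


def find_flagged_sentences(text: str, is_flagged: Callable[[str], bool]) -> List[str]:
    """Return the sentences for which the supplied checker reports a flag."""
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [s for s in sentences if is_flagged(s)]


# Stand-in checker for demonstration only. A real one might return
# client.moderations.create(input=sentence).results[0].flagged
def fake_checker(sentence: str) -> bool:
    return "BLOCKED" in sentence


long_prompt = "Summarize the report. This part is BLOCKED text. Keep it short!"
print(find_flagged_sentences(long_prompt, fake_checker))
# ['This part is BLOCKED text.']
```

Sentence-level checks can miss content that is only problematic in context, so an empty result here does not guarantee the full prompt will pass.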

Step 3: Handle the Error Gracefully in Code

Don't let a content filter crash your whole app. Catch BadRequestError and give users a meaningful message:

import openai

client = openai.OpenAI()

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": user_message}
        ]
    )
    answer = response.choices[0].message.content
except openai.BadRequestError as e:
    if "content_filter" in str(e) or "response was filtered" in str(e).lower():
        answer = "Sorry, I can't respond to that request due to content policy."
    else:
        raise  # Different BadRequestError: don't swallow it

The else: raise branch matters. Not every BadRequestError is a content filter rejection; invalid model names, malformed messages, and token limit overflows raise the same exception type.
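String matching on the message is brittle, so it can help to centralize the check in one helper that prefers a structured field when one is present. This is a sketch, not the SDK's own API: `is_content_filter_error` and `FakeAPIError` are hypothetical names, and the assumption is that the exception exposes a `code` attribute matching the `'code': 'content_filter'` value seen in the error payload above.

```python
def is_content_filter_error(exc: Exception) -> bool:
    """Best-effort check for a content-filter rejection.

    Prefers a structured `code` attribute (assumed to exist on the
    exception) and falls back to matching the message text.
    """
    code = getattr(exc, "code", None)
    if code == "content_filter":
        return True
    message = str(exc).lower()
    return "content_filter" in message or "response was filtered" in message


class FakeAPIError(Exception):
    """Hypothetical stand-in for openai.BadRequestError, for demonstration."""

    def __init__(self, message: str, code=None):
        super().__init__(message)
        self.code = code


print(is_content_filter_error(FakeAPIError("filtered", code="content_filter")))   # True
print(is_content_filter_error(FakeAPIError("The response was filtered")))         # True
print(is_content_filter_error(FakeAPIError("Invalid model", code="model_not_found")))  # False
```

Inside the except block, the condition then becomes `if is_content_filter_error(e):`, with the `else: raise` branch left unchanged.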

Step 4: Check finish_reason on Successful Responses

There's a subtler variant: the API returns HTTP 200, but the output was filtered mid-generation. In this case finish_reason is content_filter instead of stop, and message.content may be None or truncated.

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)

choice = response.choices[0]
if choice.finish_reason == "content_filter":
    print("Output was filtered; content may be incomplete or None")
    print("Content:", choice.message.content)  # May be None or truncated
else:
    print(choice.message.content)

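Always check finish_reason in production. Reading message.content does not fail when it is None, but using it later does: calling a string method on it raises an AttributeError, and concatenating it raises a TypeError. Those failures surface deeper in your code and are harder to trace than the original filter event.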
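One way to keep None from leaking into the rest of the application is to funnel every response through a small accessor. The sketch below is an assumption-laden example: `safe_content` is a hypothetical helper, and the `SimpleNamespace` objects only imitate the shape of `response.choices[0]` so the code runs without an API call.

```python
from types import SimpleNamespace
from typing import Any, Tuple


def safe_content(choice: Any, fallback: str = "") -> Tuple[str, bool]:
    """Return (text, was_filtered) without ever returning None as text."""
    was_filtered = getattr(choice, "finish_reason", None) == "content_filter"
    message = getattr(choice, "message", None)
    text = getattr(message, "content", None)
    return (text if text is not None else fallback), was_filtered


# Stand-in objects shaped like response.choices[0]; not real SDK objects.
filtered = SimpleNamespace(finish_reason="content_filter",
                           message=SimpleNamespace(content=None))
normal = SimpleNamespace(finish_reason="stop",
                         message=SimpleNamespace(content="Hello"))

print(safe_content(filtered, fallback="[filtered]"))  # ('[filtered]', True)
print(safe_content(normal))                           # ('Hello', False)
```

Returning the filtered flag alongside the text lets the caller decide whether truncated output is still worth showing.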

Step 5: Azure OpenAI-Specific Fix

Azure OpenAI gives you more control. Content filter strictness is configurable per deployment, and you can request adjusted thresholds for specific categories through the Azure portal.

Navigate to Azure OpenAI → Your Resource → Content Filters and create a custom filter profile. For example, a game moderation service might need looser violence filtering. Loosening filters beyond the configurable range, such as turning them off entirely, requires Microsoft's approval, which is granted case by case.
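Before adjusting a filter profile, it helps to know which category actually fired. The sketch below assumes the Azure error body carries per-category details under `error -> innererror -> content_filter_result`; that layout, the sample payload, and the `filtered_categories` helper are assumptions for illustration, so verify them against a real error from your own deployment.

```python
from typing import Any, Dict, List


def filtered_categories(error_body: Dict[str, Any]) -> List[str]:
    """List categories marked as filtered in an Azure-style error body.

    Assumes error -> innererror -> content_filter_result; returns an
    empty list when those keys are absent.
    """
    error = error_body.get("error", error_body)
    inner = error.get("innererror") or {}
    results = inner.get("content_filter_result") or {}
    return sorted(
        name for name, detail in results.items()
        if isinstance(detail, dict) and detail.get("filtered")
    )


# Hypothetical payload for demonstration; field names are assumptions.
sample_body = {
    "error": {
        "code": "content_filter",
        "innererror": {
            "content_filter_result": {
                "hate": {"filtered": False, "severity": "safe"},
                "violence": {"filtered": True, "severity": "medium"},
            }
        },
    }
}

print(filtered_categories(sample_body))  # ['violence']
```

Knowing the category narrows the custom profile to the one setting that needs loosening instead of relaxing everything.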

One more thing specific to Azure: a wrong deployment name can surface errors that look like content filter issues. Double-check it:

client = openai.AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_key="your-api-key",
    api_version="2024-02-01"
)

response = client.chat.completions.create(
    model="your-deployment-name",  # Must match your Azure deployment exactly
    messages=[{"role": "user", "content": prompt}]
)

Verify the Fix

Two checks before calling it done:

import openai

client = openai.OpenAI()

# 1. Confirm the prompt is clean
mod = client.moderations.create(input=your_new_prompt)
print("Still flagged:", mod.results[0].flagged)  # Should be False

# 2. Run the actual completion
try:
    res = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": your_new_prompt}]
    )
    print("finish_reason:", res.choices[0].finish_reason)  # Should be 'stop'
    print("Response:", res.choices[0].message.content)
except openai.BadRequestError as e:
    print("Still getting error:", e)

flagged: False and finish_reason: stop is the clean state you want.

Tips

  • Never pass raw user input directly into prompts. Always pre-screen with the Moderation API first, particularly in apps that handle user-generated content.
  • The Moderation API is free to call. Use it as a gate before every completion in high-risk workflows. The added latency is small compared to the cost of a failed request.
  • System prompts can trigger filtering too. A system message that instructs the model to role-play as a harmful character will get caught just like user messages.
  • gpt-3.5-turbo and gpt-4o don't have identical sensitivity. A prompt that fails on one model sometimes works on the other, which is worth testing if you have flexibility.
  • Legitimately sensitive use cases (medical education, security research, legal analysis) can apply for a usage policy exception through OpenAI's support portal. Document your use case clearly when submitting.
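
The "moderation as a gate" tip can be sketched as a thin wrapper around the completion call. This is a minimal example under stated assumptions: `gated_completion` is a hypothetical helper, and both callables are injected stand-ins so the block runs offline. In a real app they would wrap `client.moderations.create` and `client.chat.completions.create`.

```python
from typing import Callable, List, Optional


def gated_completion(prompt: str,
                     is_flagged: Callable[[str], bool],
                     complete: Callable[[str], str]) -> Optional[str]:
    """Run `complete` only if `is_flagged` reports the prompt as clean."""
    if is_flagged(prompt):
        return None  # The caller decides how to inform the user
    return complete(prompt)


completion_calls: List[str] = []


def fake_is_flagged(text: str) -> bool:
    # Stand-in for a Moderation API check.
    return "BLOCKED" in text


def fake_complete(text: str) -> str:
    # Stand-in for a chat completion call; records that it was invoked.
    completion_calls.append(text)
    return f"echo: {text}"


print(gated_completion("hello", fake_is_flagged, fake_complete))          # echo: hello
print(gated_completion("BLOCKED stuff", fake_is_flagged, fake_complete))  # None
print(completion_calls)                                                   # ['hello']
```

The flagged prompt never reaches the completion function, so no request is spent on input that would be rejected anyway.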
