## Understanding the Error

Think of this error as a white flag from the Instructor library. It means the LLM tried several times to generate a response that fits your Pydantic model, but failed every single time. By default, Instructor gives up after three attempts. When you see this exception, it's rarely a bug in the code. Instead, it's a signal that your schema and the LLM's output are fundamentally out of sync.

```
instructor.exceptions.InstructorRetryException: Failed to extract data after 3 retries
```
## Why Extraction Fails

Don't assume the model is being lazy. Most failures happen because the instructions are too rigid or the data is too messy. Common culprits include:

- Deeply Nested Schemas: Smaller models like GPT-3.5-Turbo or Claude Haiku often lose track of the structure if you nest objects more than three levels deep.
- Regex Bottlenecks: Using strict patterns like `pattern=r"^\d{3}-\d{2}-\d{4}$"` for Social Security numbers can cause failures if the LLM adds a single extra space or a prefix.
- Ambiguity: If a field is named `type` without a description, the model might guess 'User' while your Enum expects 'ADMIN_USER'.
- Token Limit Cutoffs: For very long responses, the model might hit its output token limit (4,096 tokens on some models) and stop mid-JSON, leaving the string unparseable.

## Fix 1: Give the Model a Map with Field Descriptions

LLMs aren't mind readers. They rely heavily on the metadata inside your Pydantic models. By adding a `description` to your fields, you provide the context the model needs to map raw text to your structure.
### The Problem: Vague Schema

```python
from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int
    status: str  # The model doesn't know what 'status' means here
```

### The Solution: Context-Rich Schema

```python
from pydantic import BaseModel, Field


class UserInfo(BaseModel):
    name: str = Field(description="Full name. Capitalize the first letter of each word.")
    age: int = Field(description="Age in years. Must be a positive integer.")
    status: str = Field(description="Current employment status: choose from 'Employed', 'Unemployed', or 'Student'.")
```
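To check which descriptions the model can actually draw on, it can help to inspect the JSON schema that Pydantic generates from the class. The sketch below assumes Pydantic v2 (`model_json_schema`); `field_descriptions` is a hypothetical helper written for this illustration, and how Instructor packages the schema for a given provider and mode may differ.

```python
from typing import Optional

from pydantic import BaseModel, Field


class UserInfo(BaseModel):
    name: str = Field(description="Full name. Capitalize the first letter of each word.")
    age: int = Field(description="Age in years. Must be a positive integer.")
    status: str = Field(
        description="Current employment status: choose from 'Employed', 'Unemployed', or 'Student'."
    )


def field_descriptions(model: type[BaseModel]) -> dict[str, Optional[str]]:
    """Map each field name to the description found in the generated JSON schema."""
    properties = model.model_json_schema()["properties"]
    return {name: spec.get("description") for name, spec in properties.items()}


if __name__ == "__main__":
    for name, description in field_descriptions(UserInfo).items():
        # A value of None flags a field the model would have to guess at.
        print(f"{name}: {description}")
```

Running the same helper against the vague version of the schema would return `None` for every field, which is a quick way to spot fields that still need context.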
## Fix 2: Relax Your Validation Constraints

Strict validation is great for databases but tough for LLMs. If you require a specific phone format and the LLM outputs `(555) 123-4567` instead of `5551234567`, the validation fails. Try extracting the data as a raw string first, then clean it up with a Pydantic `@field_validator` or a separate Python function.

```python
from pydantic import BaseModel, Field


class Contact(BaseModel):  # illustrative model name
    # Avoid this for raw extraction
    # phone: str = Field(pattern=r"^\+\d{10,15}$")

    # Use this instead
    phone: str = Field(description="The phone number exactly as it appears in the text.")
```
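The "separate Python function" route might look like the sketch below. `normalize_phone` is a hypothetical helper, and the rule it applies (keep digits only, accept 10 to 15 of them) is an assumption chosen for illustration, so adjust it to the format you actually need.

```python
import re
from typing import Optional


def normalize_phone(raw: str) -> Optional[str]:
    """Strip formatting from a phone string extracted as-is by the LLM.

    Returns the digits only, or None when the result is not a plausible
    phone number (assumed here to be 10 to 15 digits).
    """
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, parentheses, '+'
    if 10 <= len(digits) <= 15:
        return digits
    return None


if __name__ == "__main__":
    for sample in ["(555) 123-4567", "555-123-4567", "call me maybe"]:
        print(repr(sample), "->", normalize_phone(sample))
```

Because the cleanup runs after extraction, a formatting quirk in the model's output no longer costs a retry; only genuinely unusable values are rejected.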
## Fix 3: Upgrade the Model and Increase Retries

If you are using a smaller model for complex tasks, it might simply lack the reasoning power to follow your schema. Upgrading from GPT-3.5 to GPT-4o often solves retry issues immediately. You can also give the model more chances to fix its own mistakes by bumping up `max_retries`.

```python
import instructor
from openai import OpenAI

# Modern Instructor syntax using from_openai
client = instructor.from_openai(OpenAI())

try:
    # UserInfo is the Pydantic model defined in Fix 1
    user = client.chat.completions.create(
        model="gpt-4o",  # GPT-4o is significantly better at complex JSON than GPT-3.5
        response_model=UserInfo,
        max_retries=5,  # Give it 5 chances to get the JSON right
        messages=[{"role": "user", "content": "Extract data from: John is 30 and currently studying."}],
    )
except instructor.exceptions.InstructorRetryException as e:
    # Inspect the raw output to see why it failed
    print(f"Last attempt output: {e.last_completion}")
```
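As a mental model for what `max_retries` buys you, the loop below sketches a validate-and-retry cycle in plain Python. It is not Instructor's actual implementation: `extract_with_retries`, `call_model`, and `validate` are hypothetical names, and the wording of the feedback message is an assumption. The point it illustrates is that each failed attempt feeds the validation error back to the model, so extra retries only help when the model can act on that feedback.

```python
import json
from typing import Callable


def validate(payload: object) -> dict:
    """Toy validator standing in for a Pydantic model: requires an integer 'age'."""
    if not isinstance(payload, dict) or not isinstance(payload.get("age"), int):
        raise ValueError("expected an object with an integer 'age' field")
    return payload


def extract_with_retries(
    call_model: Callable[[list], str], prompt: str, max_retries: int = 3
) -> dict:
    """Ask the model, validate the reply, and re-ask with the error on failure."""
    messages = [{"role": "user", "content": prompt}]
    last_error = None
    for _ in range(max_retries):
        reply = call_model(messages)
        try:
            return validate(json.loads(reply))
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            last_error = err
            # Feed the failure back so the next attempt can correct it.
            messages.append({"role": "assistant", "content": reply})
            messages.append({"role": "user", "content": f"Validation failed: {err}. Try again."})
    raise RuntimeError(f"Failed to extract data after {max_retries} attempts: {last_error}")


if __name__ == "__main__":
    # Fake model: answers badly once, then correctly.
    replies = iter(['{"age": "thirty"}', '{"age": 30}'])
    print(extract_with_retries(lambda messages: next(replies), "John is 30."))
```

If the model keeps making the same mistake on every pass, more retries just cost more tokens; that is usually the cue to revisit the schema or switch models instead.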
## Fix 4: Implement Chain of Thought (CoT)

Complexity is the enemy of accuracy. By adding a reasoning field (here called `explanation`) ahead of the data fields, you force the model to reason through the extraction process before it writes the final values. This simple step can reduce validation errors by over 30% in complex tasks.

```python
from pydantic import BaseModel, Field


class ComplexExtraction(BaseModel):
    # Declared first so the model writes its reasoning before the data
    explanation: str = Field(description="A step-by-step logic of how you found the data.")
    data_points: list[str]
    confidence: float
```
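Because the reasoning field exists only to steer generation, downstream code usually does not need it. The sketch below, assuming Pydantic v2, checks that `explanation` is declared first and then drops it before passing the result on. The instance is built by hand to stand in for an LLM result, and its values are made up for illustration.

```python
from pydantic import BaseModel, Field


class ComplexExtraction(BaseModel):
    explanation: str = Field(description="A step-by-step logic of how you found the data.")
    data_points: list[str]
    confidence: float


# Declaration order is preserved, so the reasoning field comes first.
field_order = list(ComplexExtraction.model_fields)

# Hand-built instance standing in for an LLM result (values are made up).
result = ComplexExtraction(
    explanation="The text mentions two dates, so both are listed.",
    data_points=["2024-01-05", "2024-02-10"],
    confidence=0.8,
)

# Keep only the payload; the explanation was scaffolding for the model.
payload = result.model_dump(exclude={"explanation"})

if __name__ == "__main__":
    print(field_order)
    print(payload)
```

Keeping the explanation in your logs while excluding it from the payload gives you a trail to read when an extraction looks wrong.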

