How to Fix LangChain OutputParserException: Could not parse LLM output

The Error

If you have spent more than an hour building with LangChain, you have probably hit this error. You expect a clean Python object, but your console fills with red text instead:

OutputParserException: Could not parse LLM output

This happens when the Large Language Model (LLM) gets "chatty." Instead of returning a raw JSON object, it might wrap the data in conversational filler or markdown code blocks that your parser doesn't know how to handle.

Why this happens

LLMs are designed to be helpful assistants, not rigid data exporters. Even when you explicitly demand "JSON only," models often fail in predictable ways. In our testing with Llama 3, we found that without strict prompting, models fail to return valid JSON roughly 15-20% of the time. Common culprits include:

Conversational prefixes: "Sure! Here is the data you requested: { ... }"
Markdown formatting: Wrapping the response in ```json ... ``` blocks.
Syntax errors: Trailing commas (which JSON does not allow), missing commas between fields, or unescaped double quotes within strings.

Step-by-Step Fix

1. Inspect the Raw Output

You cannot fix what you cannot see. Before you start tweaking your prompt, capture the exact string the LLM sent back. Wrap your chain in a try-except block to reveal the culprit.

from langchain_core.exceptions import OutputParserException

try:
    # `chain` is the prompt | model | parser pipeline built in the steps below
    response = chain.invoke({"query": "Get user data for John Doe"})
except OutputParserException as e:
    # llm_output holds the raw string the model returned
    print(f"The LLM sent this back: {e.llm_output}")
    raise  # re-raise with the original traceback intact

Checking e.llm_output is the fastest way to tell if the model is hallucinating the schema or just adding unnecessary markdown tags.
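
Once you have the raw string, it helps to sort it into one of the failure modes listed earlier. The following is a minimal sketch using only the standard library; `diagnose` and its labels are hypothetical names for illustration, not part of LangChain.

```python
import json

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def diagnose(raw: str) -> str:
    """Return a rough label for why `raw` is or is not directly parseable as JSON."""
    text = raw.strip()
    try:
        json.loads(text)
        return "valid JSON"
    except json.JSONDecodeError:
        pass

    if FENCE in text:
        return "markdown fence"

    # If the outermost {...} span parses on its own, the problem is the text around it.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidate = text[start:end + 1]
        if candidate != text:
            try:
                json.loads(candidate)
                return "conversational filler"
            except json.JSONDecodeError:
                pass

    return "syntax error"


print(diagnose('Sure! Here is the data you requested: {"name": "John", "age": 30}'))
print(diagnose('{"name": "John", "age": 30,}'))
```

In a real chain you would pass e.llm_output to a helper like this inside the except block and log the label alongside the raw text.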

2. Inject Format Instructions

Don't make the LLM guess the structure. LangChain's parsers come with a built-in method to generate the exact technical requirements for the model. If your prompt doesn't include these, the model is essentially flying blind.

from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from pydantic import BaseModel, Field

class UserInfo(BaseModel):
    name: str = Field(description="User's name")
    age: int = Field(description="User's age")

parser = PydanticOutputParser(pydantic_object=UserInfo)

# parser.get_format_instructions() generates the specific JSON schema
prompt = PromptTemplate(
    template="Answer the query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)
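
To make the idea concrete, here is a rough sketch of what "format instructions" amount to: a schema rendered as text and spliced into the prompt. The schema dict and `build_format_instructions` are hand-written assumptions for illustration; the wording LangChain actually generates differs.

```python
import json

# Hand-written schema mirroring the UserInfo model (an assumption, not LangChain output).
USER_INFO_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "User's name"},
        "age": {"type": "integer", "description": "User's age"},
    },
    "required": ["name", "age"],
}


def build_format_instructions(schema: dict) -> str:
    """Render a schema as plain-text instructions for the prompt (illustrative wording)."""
    return (
        "Return a single JSON object that conforms to the JSON schema below. "
        "Do not add commentary or markdown.\n"
        + json.dumps(schema, indent=2)
    )


template = "Answer the query.\n{format_instructions}\n{query}\n"
prompt_text = template.format(
    format_instructions=build_format_instructions(USER_INFO_SCHEMA),
    query="I'm John, and I just turned 30!",
)
print(prompt_text)
```

Printing the final prompt like this is also a quick way to confirm that the instructions actually reached the model.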

3. Switch to JsonOutputParser

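The PydanticOutputParser is strict: it validates every field against your schema, and in some LangChain versions it fails when the LLM wraps the JSON in triple backticks. If you are hitting this wall, switch to the JsonOutputParser. It is more tolerant and can usually strip away markdown wrappers automatically without crashing. Keep in mind that it returns a plain dict rather than a UserInfo instance.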

from langchain_core.output_parsers import JsonOutputParser

# This parser is more 'forgiving' with markdown blocks; it returns a dict
parser = JsonOutputParser(pydantic_object=UserInfo)
chain = prompt | model | parser
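
For intuition, the kind of cleanup a forgiving parser performs can be sketched in a few lines of standard-library Python. This is not LangChain's implementation; `parse_lenient` and the regular expression are assumptions made for illustration.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable

# Opening fence, optional "json" tag, captured body, closing fence.
FENCED_BLOCK = re.compile(
    FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def parse_lenient(raw: str) -> dict:
    """Parse JSON, unwrapping a markdown code fence first if one is present."""
    match = FENCED_BLOCK.search(raw)
    payload = match.group(1) if match else raw
    return json.loads(payload.strip())


wrapped = "Sure!\n" + FENCE + 'json\n{"name": "John", "age": 30}\n' + FENCE
print(parse_lenient(wrapped))
```

Note that this sketch only handles fenced output. Unfenced chatter around the JSON, or broken syntax inside it, still raises json.JSONDecodeError.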

4. Implement the OutputFixingParser

Sometimes the LLM makes a tiny, fixable mistake like forgetting a closing brace }. Instead of letting the application crash, use the OutputFixingParser. It catches the parsing error and sends the bad output back to an LLM with instructions to correct it. This adds a second API call whenever the first pass fails, but it can recover many responses that would otherwise raise an exception.

from langchain.output_parsers import OutputFixingParser
from langchain_openai import ChatOpenAI

# If the first pass fails, this sends the bad output back to gpt-4o to fix the syntax
new_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI(model="gpt-4o"))
chain = prompt | model | new_parser
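
The parse, fail, repair, retry pattern behind this approach can be sketched without any LLM at all. In the example below, `parse_with_fix` and `fake_fixer` are hypothetical names; the fixer is a stub standing in for the second model call and simply appends the missing brace.

```python
import json
from typing import Callable


def parse_with_fix(
    raw: str,
    fix: Callable[[str, str], str],
    max_retries: int = 1,
) -> dict:
    """Try to parse `raw`; on failure, ask `fix` for a corrected string and retry."""
    attempt = raw
    for retries_left in range(max_retries, -1, -1):
        try:
            return json.loads(attempt)
        except json.JSONDecodeError as err:
            if retries_left == 0:
                raise ValueError(f"Could not repair output: {err}") from err
            # Hand the bad output and the error message to the fixer.
            attempt = fix(attempt, str(err))
    raise ValueError("max_retries must be zero or greater")


def fake_fixer(bad_output: str, error: str) -> str:
    # Stand-in for a second model call: it only knows how to add a closing brace.
    return bad_output + "}"


print(parse_with_fix('{"name": "John", "age": 30', fake_fixer))
```

Capping the retries matters in production: each repair attempt is a paid API call, and a model that cannot fix the output once is unlikely to fix it on the fifth try.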

Verify the Results

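Test your chain with an ambiguous query. With PydanticOutputParser (alone or wrapped in OutputFixingParser), a successful fix should return a clean UserInfo object, even if the model tries to be helpful by adding extra text. With JsonOutputParser, the result is a dict, so read the value with result["age"] instead.

result = chain.invoke({"query": "I'm John, and I just turned 30!"})
print(type(result))
# Expected with PydanticOutputParser: <class '__main__.UserInfo'>
# Expected with JsonOutputParser: <class 'dict'>
print(result.age)  # use result["age"] if result is a dict
# Expected: 30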

Pro-Tips for Production

Force Zero Temperature: Set temperature=0 for structured tasks. Raising it to 0.7 can increase formatting hallucinations by 30% in smaller models.
Use Native Tool Calling: If you are using OpenAI, Anthropic, or Gemini, stop using manual parsers. Use model.with_structured_output(UserInfo). It uses the model's native API for tool calling, which is far more reliable than text-based parsing.
Validate your JSON: If you are manually crafting prompts, use a tool like the JSON Formatter & Validator. It helps ensure your few-shot examples don't have hidden syntax errors that might confuse the LLM.
Few-Shot Examples: For smaller models like Llama 3 (8B) or Mistral, include two examples of the expected JSON in your prompt. This is often more effective than any parser setting.
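
The last two tips combine naturally: if you generate the few-shot examples with json.dumps instead of typing them by hand, they cannot contain hidden syntax errors. A minimal sketch follows; the example inputs and `build_few_shot_prompt` are made up for illustration.

```python
import json

# Hypothetical examples; replace them with inputs that resemble your real traffic.
EXAMPLES = [
    ("My name is Alice and I'm 41.", {"name": "Alice", "age": 41}),
    ("Bob here, 27 years old.", {"name": "Bob", "age": 27}),
]


def build_few_shot_prompt(query: str) -> str:
    """Build a prompt containing two worked examples followed by the real query."""
    lines = ["Extract the user's name and age. Respond with JSON only.", ""]
    for text, expected in EXAMPLES:
        lines.append(f"Input: {text}")
        # json.dumps guarantees the example is syntactically valid JSON.
        lines.append(f"Output: {json.dumps(expected)}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


print(build_few_shot_prompt("I'm John, and I just turned 30!"))
```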