Fix LlamaIndex IndexError: list index out of range from Empty Retrieval Response

The Error

You're running a RAG pipeline with LlamaIndex and suddenly hit this:

Traceback (most recent call last):
  File "query.py", line 14, in <module>
    response = query_engine.query("What is the refund policy?")
  File ".../llama_index/core/query_engine/retriever_query_engine.py", line 190, in query
    ...
  File ".../llama_index/core/response_synthesizers/base.py", line 102, in synthesize
    text_chunks = [node.get_content() for node in nodes]
IndexError: list index out of range

The retriever returned zero nodes, and something downstream tried to index into an empty list. This shows up often in LlamaIndex RAG setups, and it is almost always a silent failure: nothing warns you before the crash.

Why It Happens

Every case traces back to one thing: the retriever found no matching nodes for the query. Here are the usual culprits:

The index was built from an empty document list, or documents weren't chunked properly
Similarity threshold is set too high, so no chunk scores above the cutoff (0.85+ is often too strict)
similarity_top_k is set to 0 (asking for more nodes than the index holds is harmless; you just get fewer back)
Embedding mismatch: the index was built with one model and queried with another
Vector store connection issue (Pinecone, Weaviate, Chroma) silently returning empty results
The index file got corrupted or wasn't persisted correctly before loading

Step-by-Step Fix

Step 1 — Verify the retriever actually returns nodes

Start by isolating the retriever. Call it directly, bypassing the query engine entirely:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Build or load your index
index = VectorStoreIndex.from_documents(documents)
retriever = index.as_retriever(similarity_top_k=5)

# Direct retriever call — bypass the query engine
nodes = retriever.retrieve("What is the refund policy?")
print(f"Retrieved {len(nodes)} nodes")
for node in nodes:
    print(node.score, node.get_content()[:100])

If len(nodes) == 0, the bug is in retriever configuration or index data — not the query engine. That narrows things down fast.
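A single query can be unlucky, so it may help to push a handful of probe queries through the retriever and see which ones come back empty. The sketch below assumes only that the retriever exposes retrieve(query) and returns a list, as in the snippet above. The helper names probe_retriever and empty_queries are made up for illustration and are not part of LlamaIndex.

```python
from typing import Dict, Iterable, List


def probe_retriever(retriever, queries: Iterable[str]) -> Dict[str, int]:
    """Run each probe query and record how many nodes came back.

    `retriever` is any object with a `retrieve(query)` method that returns
    a list, e.g. the result of `index.as_retriever(...)`.
    """
    counts: Dict[str, int] = {}
    for query in queries:
        counts[query] = len(retriever.retrieve(query))
    return counts


def empty_queries(counts: Dict[str, int]) -> List[str]:
    """Return the probe queries that retrieved nothing."""
    return [query for query, n in counts.items() if n == 0]
```

If every probe query is empty, suspect the index or the vector store. If only some are, the cutoff or the wording of those queries is the more likely cause.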

Step 2 — Check your index actually has data

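# For an in-memory VectorStoreIndex (docstore entries are nodes, i.e. chunks)
print(f"Index has {len(index.docstore.docs)} nodes")

# For a persisted index
from llama_index.core import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
print(f"Loaded index with {len(index.docstore.docs)} nodes")

Count is 0? Your documents weren't indexed. Re-run the ingestion pipeline and confirm documents load correctly before indexing. One caveat: with an external vector store that keeps the node text itself (Chroma, for example), the local docstore is typically empty even when the index is healthy, so use the store-side count from Step 6 instead.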
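Before re-running ingestion, it can be worth checking that the input directory contains what you think it does. This is a minimal sketch using only the standard library. The function name summarize_input_dir and the ./data path in the usage note are assumptions, not anything LlamaIndex defines.

```python
from pathlib import Path


def summarize_input_dir(input_dir: str) -> dict:
    """Count the files under input_dir and list the ones that are empty."""
    root = Path(input_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Input directory not found: {input_dir}")

    # Walk the tree recursively and keep regular files only.
    files = [p for p in root.rglob("*") if p.is_file()]
    empty = sorted(str(p) for p in files if p.stat().st_size == 0)
    return {"total": len(files), "empty": empty}
```

Calling summarize_input_dir("./data") before indexing shows whether the reader has anything to load. A total of 0, or a list of empty files as long as the total, means the index will be empty too.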

Step 3 — Lower or remove the similarity threshold

A node postprocessor with an aggressive similarity cutoff can silently filter out every result:

from llama_index.core.postprocessor import SimilarityPostprocessor

# Too aggressive — likely cutting all results
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.85)

# More reasonable starting point
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.5)

Disable the postprocessor first and confirm retrieval works. Then add the filter back and lower the cutoff by 0.1 at a time until you reach a value that still lets results through. The scores printed in Step 1 show what range your data actually produces.
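To see why a strict cutoff empties the result list, here is a plain-Python sketch of the stepping-down procedure. It is a simplified stand-in for a similarity filter, not the SimilarityPostprocessor implementation, and both function names are hypothetical.

```python
from typing import List, Optional


def apply_cutoff(scores: List[float], cutoff: float) -> List[float]:
    """Keep only the scores at or above the cutoff."""
    return [s for s in scores if s >= cutoff]


def strictest_usable_cutoff(
    scores: List[float], start: float = 0.9, step: float = 0.1
) -> Optional[float]:
    """Lower the cutoff from `start` in `step` increments.

    Returns the first (highest) cutoff that keeps at least one score,
    or None if nothing survives even at the lowest value tried.
    """
    steps = int(round(start / step))
    for i in range(steps + 1):
        # Round to avoid float drift such as 0.7000000000000001.
        cutoff = round(start - i * step, 2)
        if apply_cutoff(scores, cutoff):
            return cutoff
    return None
```

With scores of 0.62, 0.48 and 0.31, a cutoff of 0.85 keeps nothing, and 0.6 is the highest stepped value that keeps a result.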

Step 4 — Fix embedding model mismatch

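Built the index with one model and queried with another? The query vector and the stored vectors then live in different embedding spaces, so similarity scores are meaningless and may all fall below your cutoff. (Models with different vector dimensions usually fail with a shape error instead.) Always pin the embedding model explicitly: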

from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Set globally — applies to both indexing and querying
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Now build index
index = VectorStoreIndex.from_documents(documents)

# And query — same embedding model used automatically
query_engine = index.as_query_engine()

Changed embedding models after the index was already built? You have to rebuild from scratch. There's no shortcut here.
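One way to catch a mismatch early is to record the embedding model name next to the persisted index and check it at load time. The sketch below is one possible approach using only the standard library. The embed_meta.json file name and both helper functions are assumptions for illustration, not a LlamaIndex convention.

```python
import json
from pathlib import Path

META_FILE = "embed_meta.json"  # hypothetical file name, not created by LlamaIndex


def record_embed_model(persist_dir: str, model_name: str) -> None:
    """Write the embedding model name next to the persisted index."""
    path = Path(persist_dir)
    path.mkdir(parents=True, exist_ok=True)
    (path / META_FILE).write_text(json.dumps({"embed_model": model_name}))


def check_embed_model(persist_dir: str, model_name: str) -> None:
    """Raise if the index was built with a different embedding model."""
    meta_path = Path(persist_dir) / META_FILE
    if not meta_path.exists():
        raise FileNotFoundError(
            f"No {META_FILE} in {persist_dir}; cannot verify the embedding model"
        )
    built_with = json.loads(meta_path.read_text())["embed_model"]
    if built_with != model_name:
        raise ValueError(
            f"Index was built with {built_with!r} but the query side uses "
            f"{model_name!r}; rebuild the index"
        )
```

Call record_embed_model right after persisting the index and check_embed_model right before loading it, passing the same model string you give to the embedding class.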

Step 5 — Guard against empty results in your code

Either way, your app shouldn't crash on empty retrieval. Add a guard:

from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import get_response_synthesizer

# `index` is the index built or loaded in the earlier steps
query = "What is the refund policy?"
retriever = index.as_retriever(similarity_top_k=5)
query_engine = RetrieverQueryEngine.from_args(retriever)

# Option A: check nodes before synthesizing
nodes = retriever.retrieve(query)
if not nodes:
    print("No relevant documents found for this query.")
else:
    synth = get_response_synthesizer()
    response = synth.synthesize(query, nodes=nodes)
    print(response)

# Option B: wrap query engine call
try:
    response = query_engine.query(query)
    if not str(response).strip():
        print("Empty response — no matching content found.")
except IndexError:
    print("Retriever returned no results for this query.")
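Option A can be packaged as a small reusable function so every call site gets the same behavior. This is a sketch, assuming the retriever has retrieve(query) and the synthesizer has synthesize(query, nodes=...), as in the snippet above. The name answer_or_fallback and the fallback message constant are hypothetical.

```python
from typing import Any

NO_RESULTS_MESSAGE = "No relevant documents found for this query."


def answer_or_fallback(
    retriever: Any,
    synthesizer: Any,
    query: str,
    fallback: str = NO_RESULTS_MESSAGE,
) -> str:
    """Retrieve first, and only synthesize when at least one node came back."""
    nodes = retriever.retrieve(query)
    if not nodes:
        # Nothing retrieved: skip synthesis entirely.
        return fallback

    response = synthesizer.synthesize(query, nodes=nodes)
    text = str(response).strip()
    # Treat a blank answer the same way as an empty retrieval.
    return text or fallback
```

Returning a string in both cases keeps the caller free of try/except blocks, and the empty case becomes an explicit, testable branch.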

Step 6 — For external vector stores (Pinecone, Chroma, Weaviate)

External stores can return empty results with no error at all. Check the collection count first:

# Chroma example
import chromadb

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_collection("my_collection")
print(f"Chroma collection has {collection.count()} entries")

# If 0, re-run ingestion: from_documents() below embeds `documents` and writes them into the collection
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext, VectorStoreIndex

vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

Zero entries means the ingestion never ran, or wrote to a different collection. Double-check that the collection name and the database path match between your ingestion and query scripts.
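The two checks above (right collection, non-zero count) can be turned into a guard that fails loudly at startup instead of at query time. This sketch assumes the collection object has a name attribute and a count() method, which Chroma collections provide. The function require_populated is a made-up helper.

```python
def require_populated(collection, expected_name: str) -> int:
    """Return the entry count, or raise if the collection is wrong or empty.

    `collection` is any object with a `name` attribute and a `count()`
    method, such as the Chroma collection from the snippet above.
    """
    if collection.name != expected_name:
        raise ValueError(
            f"Expected collection {expected_name!r} but got {collection.name!r}"
        )
    count = collection.count()
    if count == 0:
        raise RuntimeError(
            f"Collection {expected_name!r} is empty; re-run ingestion before querying"
        )
    return count
```

Running this once when the query service starts turns a silent empty result into an immediate, descriptive error.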

Verify the Fix

Run this sanity check after applying any fix:

import logging
logging.basicConfig(level=logging.DEBUG)

retriever = index.as_retriever(similarity_top_k=3)
nodes = retriever.retrieve("test query matching your documents")

assert len(nodes) > 0, "Still returning empty — check index and embeddings"
print(f"OK: retrieved {len(nodes)} nodes")
print("Top result score:", nodes[0].score)
print("Top result preview:", nodes[0].get_content()[:200])

You should see at least one node with a non-zero score. Score comes back as None? Your vector store isn't returning similarity scores — dig into the store's configuration, not the retriever.
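To check scores across all retrieved nodes rather than only the first, a small summary helper can be used. This sketch assumes each node exposes a score attribute that may be None, as in the check above. The name score_report is hypothetical.

```python
from typing import Any, Dict, Iterable


def score_report(nodes: Iterable[Any]) -> Dict[str, Any]:
    """Summarize how many nodes carry a score and what the best score is."""
    scores = [getattr(node, "score", None) for node in nodes]
    present = [s for s in scores if s is not None]
    return {
        "total": len(scores),
        "missing_score": len(scores) - len(present),
        "max_score": max(present) if present else None,
    }
```

A missing_score count equal to total points at the vector store configuration. A low max_score points at the cutoff or the embeddings.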

Quick Tips

Always rebuild the index after changing chunk size or embedding model — stale vectors cause subtle retrieval failures that are hard to trace
Use similarity_top_k=10 during debugging — narrow it down after confirming retrieval actually works
Enable LlamaIndex debug events to trace exactly what the retriever is doing: from llama_index.core import set_global_handler; set_global_handler("simple")
Very short chunks (under 100 tokens) often score too low to pass a cutoff. If you reduced Settings.chunk_size below 256, raise it to at least 256 and re-index; if the source documents themselves are that short, lower the cutoff instead
In production, always handle the empty-nodes case explicitly. Don't let the query engine decide what error to raise.