The Error
You're running a RAG pipeline with LlamaIndex and suddenly hit this:
Traceback (most recent call last):
  File "query.py", line 14, in <module>
    response = query_engine.query("What is the refund policy?")
  File ".../llama_index/core/query_engine/retriever_query_engine.py", line 190, in query
    ...
  File ".../llama_index/core/response_synthesizers/base.py", line 102, in synthesize
    text_chunks = [node.get_content() for node in nodes]
IndexError: list index out of range
The retriever returned zero nodes, and something downstream tried to index into an empty list. It shows up constantly in LlamaIndex RAG setups, and it's almost always a silent failure with no warning before the crash.
Why It Happens
Every case traces back to one thing: the retriever found no matching nodes for the query. Here are the usual culprits:
- The index was built from an empty document list, or documents weren't chunked properly
- Similarity threshold is set too high: no chunk scores above the cutoff (0.85+ is often too strict)
- similarity_top_k is set to 0, or the index has fewer nodes than requested
- Embedding mismatch: index was built with one model, queried with another
- Vector store connection issue (Pinecone, Weaviate, Chroma) silently returning empty results
- The index file got corrupted or wasn't persisted correctly before loading
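The list above can be turned into a rough triage order. The sketch below is plain Python and does not call LlamaIndex; the function name diagnose_empty_retrieval and its parameters are hypothetical, and the inputs are values you would collect yourself (node count, the top-k setting, raw retriever scores, and any cutoff in use).

```python
def diagnose_empty_retrieval(index_node_count, top_k, raw_scores, cutoff=None):
    """Return a list of likely causes, given what the checks turned up.

    index_node_count: number of nodes stored in the index
    top_k: the similarity_top_k value in use
    raw_scores: scores from a direct retriever call, before any postprocessor
    cutoff: similarity cutoff in use, or None if there is no postprocessor
    """
    causes = []
    if index_node_count == 0:
        causes.append("index is empty: re-run ingestion")
    if top_k <= 0:
        causes.append("similarity_top_k is 0: nothing can be returned")
    if index_node_count > 0 and top_k > 0 and not raw_scores:
        causes.append("retriever returned nothing: check the vector store connection")
    # Ignore missing scores when comparing against the cutoff
    scored = [s for s in raw_scores if s is not None]
    if cutoff is not None and scored and max(scored) < cutoff:
        causes.append(f"best score {max(scored):.2f} is below cutoff {cutoff}")
    return causes


# Hypothetical numbers: a populated index whose best match scores 0.61
print(diagnose_empty_retrieval(index_node_count=120, top_k=5,
                               raw_scores=[0.61, 0.55], cutoff=0.85))
```

An empty list back from this helper means none of the cheap checks fired, which points toward an embedding mismatch or a corrupted index.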
Step-by-Step Fix
Step 1: Verify the retriever actually returns nodes
Start by isolating the retriever. Call it directly, bypassing the query engine entirely:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
# Load your documents ("./data" is a placeholder path; point it at your own files)
documents = SimpleDirectoryReader("./data").load_data()
# Build or load your index
index = VectorStoreIndex.from_documents(documents)
retriever = index.as_retriever(similarity_top_k=5)
# Direct retriever call: bypass the query engine
nodes = retriever.retrieve("What is the refund policy?")
print(f"Retrieved {len(nodes)} nodes")
for node in nodes:
    print(node.score, node.get_content()[:100])
If len(nodes) == 0, the bug is in retriever configuration or index data, not the query engine. That narrows things down fast.
Step 2: Check your index actually has data
# For an in-memory VectorStoreIndex
print(f"Index has {len(index.docstore.docs)} documents")
# For a persisted index
from llama_index.core import StorageContext, load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
print(f"Loaded index with {len(index.docstore.docs)} docs")
Count is 0? Your documents weren't indexed. Re-run the ingestion pipeline and confirm documents load correctly before indexing.
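One way to confirm that documents loaded correctly is a fail-fast check on their text before any indexing happens. This is a minimal sketch in plain Python: check_documents is a hypothetical helper, and it assumes you have already extracted each document's text into a list of strings.

```python
def check_documents(texts):
    """Fail fast before indexing if there is nothing usable to embed.

    Returns the positions of blank documents so they can be inspected.
    """
    if not texts:
        raise ValueError("No documents loaded: check the input path and file types")
    blank = [i for i, text in enumerate(texts) if not text or not text.strip()]
    if len(blank) == len(texts):
        raise ValueError("All documents are blank: check the parser or file encoding")
    return blank


# Hypothetical input: one real document and two blank ones
texts = ["Refunds are accepted within 30 days.", "   ", ""]
print(check_documents(texts))
```

With LlamaIndex documents, the list could be built with something like [doc.get_content() for doc in documents] before calling the helper.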
Step 3: Lower or remove the similarity threshold
A node postprocessor with an aggressive similarity cutoff can silently filter out every result:
from llama_index.core.postprocessor import SimilarityPostprocessor
# Too aggressive: likely cutting all results
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.85)
# More reasonable starting point
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.5)
Disable the postprocessor first. Confirm retrieval works. Then add filtering back in small steps: drop the cutoff by 0.1 at a time until you find a value that doesn't kill all results.
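The step-down procedure can be tried offline on the raw scores printed by a direct retriever call. The sketch below is plain Python, not a LlamaIndex API: find_working_cutoff and the sample scores are hypothetical.

```python
def find_working_cutoff(scores, start=0.85, step=0.1):
    """Lower the cutoff in fixed steps until at least one score passes.

    Returns (cutoff, surviving_scores), or (None, []) if nothing passes
    before the cutoff reaches zero.
    """
    valid = [s for s in scores if s is not None]
    cutoff = start
    while cutoff > 0:
        survivors = [s for s in valid if s >= cutoff]
        if survivors:
            return round(cutoff, 2), survivors
        # Round to avoid float drift accumulating across steps
        cutoff = round(cutoff - step, 10)
    return None, []


# Hypothetical scores from a retriever call with no postprocessor attached
sample_scores = [0.62, 0.58, 0.41]
cutoff, kept = find_working_cutoff(sample_scores)
print(cutoff, kept)
```

A result of (None, []) suggests the scores themselves are the problem, so the cutoff should stay disabled while you check the embeddings.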
Step 4: Fix embedding model mismatch
Built the index with one model and queried with another? Cosine similarity scores will be meaningless, so relevant chunks rank low or fall below any cutoff. Always pin the embedding model explicitly:
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
# Set globally: applies to both indexing and querying
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
# Now build index
index = VectorStoreIndex.from_documents(documents)
# And query: same embedding model used automatically
query_engine = index.as_query_engine()
Changed embedding models after the index was already built? You have to rebuild from scratch. There's no shortcut here.
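To catch a model change before it shows up as bad retrieval, one option is to record the embedding model name next to the persisted index and compare it at load time. This is a sketch under assumptions: the file name embed_meta.json and both helper functions are hypothetical and not a LlamaIndex convention.

```python
import json
import tempfile
from pathlib import Path

META_FILE = "embed_meta.json"  # hypothetical file name, not a LlamaIndex convention


def record_embed_model(persist_dir, model_name):
    """Write the embedding model name next to the persisted index."""
    path = Path(persist_dir)
    path.mkdir(parents=True, exist_ok=True)
    (path / META_FILE).write_text(json.dumps({"embed_model": model_name}))


def check_embed_model(persist_dir, model_name):
    """Raise if the index was built with a different embedding model."""
    meta_path = Path(persist_dir) / META_FILE
    if not meta_path.exists():
        raise FileNotFoundError(f"No {META_FILE} in {persist_dir}; cannot verify model")
    built_with = json.loads(meta_path.read_text())["embed_model"]
    if built_with != model_name:
        raise ValueError(
            f"Index built with {built_with!r} but querying with {model_name!r}; "
            "rebuild the index"
        )


# Usage: record at ingestion time, check at load time
with tempfile.TemporaryDirectory() as storage_dir:
    record_embed_model(storage_dir, "text-embedding-3-small")
    check_embed_model(storage_dir, "text-embedding-3-small")  # same model: no error
    try:
        check_embed_model(storage_dir, "text-embedding-3-large")
    except ValueError as err:
        mismatch_message = str(err)
        print(mismatch_message)
```

In a real pipeline the directory would be the same persist_dir used for the index, so the metadata travels with the vectors it describes.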
Step 5: Guard against empty results in your code
Either way, your app shouldn't crash on empty retrieval. Add a guard:
from llama_index.core.response_synthesizers import get_response_synthesizer
retriever = index.as_retriever(similarity_top_k=5)
query = "What is the refund policy?"
# Option A: check nodes before synthesizing
nodes = retriever.retrieve(query)
if not nodes:
    print("No relevant documents found for this query.")
else:
    synth = get_response_synthesizer()
    response = synth.synthesize(query, nodes=nodes)
    print(response)
# Option B: wrap the query engine call
query_engine = index.as_query_engine()
try:
    response = query_engine.query(query)
    if not str(response).strip():
        print("Empty response: no matching content found.")
except IndexError:
    print("Retriever returned no results for this query.")
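The same guard can be packaged as a small reusable function so every call site handles empty retrieval the same way. The sketch below uses stand-in callables so it runs without an index; answer_or_fallback and the fake functions are hypothetical names.

```python
def answer_or_fallback(query, retrieve, synthesize,
                       fallback="No relevant documents found for this query."):
    """Run retrieval first and only synthesize when nodes came back."""
    nodes = retrieve(query)
    if not nodes:
        return fallback
    return str(synthesize(query, nodes))


# Stand-in callables so the sketch runs without an index
def empty_retrieve(query):
    return []


def fake_retrieve(query):
    return ["Refunds are accepted within 30 days."]


def fake_synthesize(query, nodes):
    return f"Answer based on {len(nodes)} node(s)"


print(answer_or_fallback("What is the refund policy?", empty_retrieve, fake_synthesize))
print(answer_or_fallback("What is the refund policy?", fake_retrieve, fake_synthesize))
```

With the objects from Option A, the arguments would be retriever.retrieve and a small wrapper such as lambda q, n: synth.synthesize(q, nodes=n).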
Step 6: For external vector stores (Pinecone, Chroma, Weaviate)
External stores can return empty results with no error at all. Check the collection count first:
# Chroma example
import chromadb
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_collection("my_collection")
print(f"Chroma collection has {collection.count()} entries")
# If 0, re-run your ingestion script
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext, VectorStoreIndex
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
Zero entries means the ingestion never ran, or wrote to a different collection name. Double-check the collection_name matches between your ingestion and query scripts.
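A near-miss in the collection name (a hyphen instead of an underscore, for example) is easy to spot if you compare the expected name against the non-empty collections in the store. This sketch uses only the standard library; suggest_collection is a hypothetical helper, and the name-to-count mapping is assumed to be gathered however your store exposes it.

```python
import difflib


def suggest_collection(expected_name, counts_by_name):
    """Return non-empty collections whose names resemble the expected one."""
    non_empty = [
        name for name, count in counts_by_name.items()
        if count > 0 and name != expected_name
    ]
    return difflib.get_close_matches(expected_name, non_empty, n=3, cutoff=0.6)


# Hypothetical counts: the expected collection is empty, a look-alike is not
counts = {"my_collection": 0, "my-collection": 412, "invoices": 90}
print(suggest_collection("my_collection", counts))
```

If a look-alike turns up with data in it, the ingestion script most likely wrote there.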
Verify the Fix
Run this sanity check after applying any fix:
import logging
logging.basicConfig(level=logging.DEBUG)
retriever = index.as_retriever(similarity_top_k=3)
nodes = retriever.retrieve("test query matching your documents")
assert len(nodes) > 0, "Still returning empty: check index and embeddings"
print(f"OK: retrieved {len(nodes)} nodes")
print("Top result score:", nodes[0].score)
print("Top result preview:", nodes[0].get_content()[:200])
You should see at least one node with a non-zero score. Score comes back as None? Your vector store isn't returning similarity scores, so dig into the store's configuration, not the retriever.
Quick Tips
- Always rebuild the index after changing chunk size or embedding model: stale vectors cause subtle retrieval failures that are hard to trace
- Use similarity_top_k=10 during debugging, then narrow it down after confirming retrieval actually works
- Enable LlamaIndex debug events to trace exactly what the retriever is doing: from llama_index.core import set_global_handler; set_global_handler("simple")
- Very short documents (under 100 tokens) often produce nodes that score too low to pass any threshold. Bump Settings.chunk_size to at least 256 and re-index
- In production, always handle the empty-nodes case explicitly. Don't let the query engine decide what error to raise.

