Metadata Filtering
Filter search results based on document metadata to narrow down your queries and get more relevant results.
Basic Metadata Filters
Use MetadataFilters to restrict search results based on document metadata:
from llama_index.core.vector_stores.types import MetadataFilters, MetadataFilter, FilterOperator
# Create a metadata filter that matches only documents in the "ai" category
ai_filter = MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ)
ai_filters = MetadataFilters(filters=[ai_filter])
# Create a filtered query engine
filtered_query_engine = index.as_query_engine(filters=ai_filters)
# Ask a general question but only using AI documents
response = filtered_query_engine.query("What is learning from data?")
print("Filtered Query (AI category only): What is learning from data?")
print("Response:")
print(response)
Available Filter Operators
| Operator | Description |
|---|---|
| FilterOperator.EQ | Equal to |
| FilterOperator.NE | Not equal to |
| FilterOperator.GT | Greater than |
| FilterOperator.GTE | Greater than or equal to |
| FilterOperator.LT | Less than |
| FilterOperator.LTE | Less than or equal to |
| FilterOperator.IN | In list |
| FilterOperator.NIN | Not in list |
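To make the operator semantics concrete, here is a minimal pure-Python sketch of how each comparison could be evaluated against one document's metadata dictionary. It is an illustration only, not LlamaIndex's implementation: the `matches` helper, the `OPERATOR_CHECKS` table, and the `year` metadata key are hypothetical, and the rule that a missing key never matches is an assumption (actual behavior depends on the vector store).

```python
import operator

# Hypothetical lookup table (not part of LlamaIndex): operator name -> Python check.
OPERATOR_CHECKS = {
    "EQ": operator.eq,
    "NE": operator.ne,
    "GT": operator.gt,
    "GTE": operator.ge,
    "LT": operator.lt,
    "LTE": operator.le,
    "IN": lambda actual, allowed: actual in allowed,
    "NIN": lambda actual, allowed: actual not in allowed,
}


def matches(metadata: dict, key: str, op: str, value) -> bool:
    """Return True if metadata[key] satisfies the comparison against value."""
    if key not in metadata:
        # Assumption for this sketch: a document without the key never matches.
        return False
    return OPERATOR_CHECKS[op](metadata[key], value)


# Example metadata; the "year" key is made up for the numeric comparison.
doc_metadata = {"category": "ai", "difficulty": "advanced", "year": 2023}

print(matches(doc_metadata, "category", "EQ", "ai"))                         # True
print(matches(doc_metadata, "year", "GTE", 2020))                            # True
print(matches(doc_metadata, "category", "IN", ["programming", "database"]))  # False
print(matches(doc_metadata, "category", "NIN", ["programming", "database"])) # True
```

The ordering operators (GT, GTE, LT, LTE) only make sense for comparable values such as numbers, while IN and NIN expect a list as the filter value.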
Advanced Filtering with Multiple Conditions
Combine multiple metadata filters for precise results:
# Create a more complex filter: database category AND intermediate difficulty
category_filter = MetadataFilter(key="category", value="database", operator=FilterOperator.EQ)
difficulty_filter = MetadataFilter(key="difficulty", value="intermediate", operator=FilterOperator.EQ)
complex_filters = MetadataFilters(filters=[category_filter, difficulty_filter])
# Create a query engine with the complex filters
complex_filtered_engine = index.as_query_engine(filters=complex_filters)
# Query with the complex filters
response = complex_filtered_engine.query("Tell me about databases")
print("Complex Filtered Query (database category AND intermediate difficulty): Tell me about databases")
print("Response:")
print(response)
Note: Multiple filters are combined with AND logic by default.
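The difference between AND and OR combination can be sketched in plain Python. This is an illustration of the logic only, with hypothetical helpers (`matches_all`, `matches_any`) and made-up sample metadata; it does not call LlamaIndex. Recent LlamaIndex versions also accept a `condition` argument on `MetadataFilters` (for example `FilterCondition.OR`), but check your installed version and vector store before relying on it.

```python
def matches_all(metadata: dict, conditions: dict) -> bool:
    """AND semantics: every key/value pair must match exactly."""
    return all(metadata.get(key) == value for key, value in conditions.items())


def matches_any(metadata: dict, conditions: dict) -> bool:
    """OR semantics: at least one key/value pair must match exactly."""
    return any(metadata.get(key) == value for key, value in conditions.items())


# Made-up metadata for four documents.
docs = [
    {"category": "database", "difficulty": "intermediate"},
    {"category": "database", "difficulty": "beginner"},
    {"category": "ai", "difficulty": "intermediate"},
    {"category": "ai", "difficulty": "advanced"},
]

conditions = {"category": "database", "difficulty": "intermediate"}

and_hits = [d for d in docs if matches_all(d, conditions)]
or_hits = [d for d in docs if matches_any(d, conditions)]

print(len(and_hits))  # 1: only the document that is both database and intermediate
print(len(or_hits))   # 3: every document that is database or intermediate
```

With AND, each added filter can only shrink the result set, which is why the default suits narrowing a search.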
Filter Examples
Exact Match (EQ)
# Filter for documents with category = "programming"
eq_filter = MetadataFilter(key="category", value="programming", operator=FilterOperator.EQ)
Not Equal (NE)
# Filter for documents NOT in the "ai" category
ne_filter = MetadataFilter(key="category", value="ai", operator=FilterOperator.NE)
In List (IN)
# Filter for documents whose category is any of the listed values
in_filter = MetadataFilter(
    key="category",
    value=["programming", "database"],
    operator=FilterOperator.IN
)
Combining Multiple Filters
# Combine multiple conditions
filters = MetadataFilters(
    filters=[
        MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ),
        MetadataFilter(key="difficulty", value="advanced", operator=FilterOperator.EQ),
        MetadataFilter(key="field", value="deep_learning", operator=FilterOperator.EQ)
    ]
)
# Create query engine with combined filters
query_engine = index.as_query_engine(filters=filters)
Using Filters with Custom Retrievers
Apply filters when creating a custom retriever:
from llama_index.core.retrievers import VectorIndexRetriever
# Create a retriever with custom parameters
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=3,  # Return top 3 most similar results
    filters=ai_filters  # Apply metadata filters
)
# Retrieve nodes for a query
nodes = retriever.retrieve("What is deep learning?")
print(f"Retrieved {len(nodes)} nodes for query: 'What is deep learning?'")
for i, node in enumerate(nodes):
    print(f"\nNode {i+1}:")
    print(f"Text: {node.node.text}")
    print(f"Metadata: {node.node.metadata}")
    print(f"Score: {node.score:.4f}")
Example Output:
Retrieved 2 nodes for query: 'What is deep learning?'
Node 1:
Text: Deep learning is part of a broader family of machine learning methods...
Metadata: {'category': 'ai', 'field': 'deep_learning', 'difficulty': 'advanced'}
Score: 0.8934
Node 2:
Text: Machine learning is a subset of artificial intelligence...
Metadata: {'category': 'ai', 'field': 'machine_learning', 'difficulty': 'advanced'}
Score: 0.7821