Metadata Filtering

Filter search results based on document metadata to narrow down your queries and get more relevant results.

Basic Metadata Filters

Use MetadataFilters to restrict search results based on document metadata:

    from llama_index.core.vector_stores.types import (
        FilterOperator,
        MetadataFilter,
        MetadataFilters,
    )

    # `index` is assumed to be an existing index built earlier.

    # Create a filter that only matches AI-related documents
    ai_filter = MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ)
    ai_filters = MetadataFilters(filters=[ai_filter])

    # Create a filtered query engine
    filtered_query_engine = index.as_query_engine(filters=ai_filters)

    # Ask a general question, answered using only AI documents
    response = filtered_query_engine.query("What is learning from data?")
    print("Filtered Query (AI category only): What is learning from data?")
    print("Response:")
    print(response)

Available Filter Operators

Operator              Description
FilterOperator.EQ     Equal to
FilterOperator.NE     Not equal to
FilterOperator.GT     Greater than
FilterOperator.GTE    Greater than or equal to
FilterOperator.LT     Less than
FilterOperator.LTE    Less than or equal to
FilterOperator.IN     In list
FilterOperator.NIN    Not in list
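
To make the operator semantics concrete, the following pure-Python sketch evaluates each operator against a single metadata dictionary. It does not use LlamaIndex and is not how any vector store implements filtering. The `matches` helper, the operator strings, and the `year` key are hypothetical names chosen for illustration, and treating a missing key as a non-match is an assumption that real backends may not share.

```python
import operator

# Hypothetical mapping from operator names to plain Python comparisons.
# Illustrative only; this is not LlamaIndex's implementation.
_COMPARATORS = {
    "EQ": operator.eq,
    "NE": operator.ne,
    "GT": operator.gt,
    "GTE": operator.ge,
    "LT": operator.lt,
    "LTE": operator.le,
    "IN": lambda actual, allowed: actual in allowed,
    "NIN": lambda actual, allowed: actual not in allowed,
}


def matches(metadata, key, op, value):
    """Return True if metadata[key] satisfies the comparison `op` against `value`.

    Assumption: a document that lacks `key` never matches.
    """
    if key not in metadata:
        return False
    return _COMPARATORS[op](metadata[key], value)


doc = {"category": "ai", "year": 2023}

print(matches(doc, "category", "EQ", "ai"))                    # True
print(matches(doc, "category", "IN", ["programming", "ai"]))   # True
print(matches(doc, "category", "NIN", ["programming", "ai"]))  # False
print(matches(doc, "year", "GTE", 2024))                       # False
```

The comparison operators (GT, GTE, LT, LTE) only make sense for values that can be ordered, such as numbers, while IN and NIN expect a list as the filter value.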

Advanced Filtering with Multiple Conditions

Combine multiple metadata filters for precise results:

    # Create a more complex filter: database category AND intermediate difficulty
    category_filter = MetadataFilter(
        key="category", value="database", operator=FilterOperator.EQ
    )
    difficulty_filter = MetadataFilter(
        key="difficulty", value="intermediate", operator=FilterOperator.EQ
    )
    complex_filters = MetadataFilters(filters=[category_filter, difficulty_filter])

    # Create a query engine with the complex filters
    complex_filtered_engine = index.as_query_engine(filters=complex_filters)

    # Query with the complex filters
    response = complex_filtered_engine.query("Tell me about databases")
    print(
        "Complex Filtered Query (database category AND intermediate difficulty): "
        "Tell me about databases"
    )
    print("Response:")
    print(response)

Note: Multiple filters are combined with AND logic by default.
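
The AND behavior can be pictured as requiring every condition to hold for a document's metadata. Here is a minimal pure-Python sketch, independent of LlamaIndex; `matches_all` and the sample documents are hypothetical, and only equality conditions are modeled.

```python
def matches_all(metadata, conditions):
    """Return True only if every (key, value) equality condition holds (AND)."""
    return all(metadata.get(key) == value for key, value in conditions)


# Hypothetical sample metadata, shaped like the examples in this guide.
docs = [
    {"category": "database", "difficulty": "intermediate"},
    {"category": "database", "difficulty": "beginner"},
    {"category": "ai", "difficulty": "intermediate"},
]

conditions = [("category", "database"), ("difficulty", "intermediate")]
selected = [d for d in docs if matches_all(d, conditions)]
print(selected)
# [{'category': 'database', 'difficulty': 'intermediate'}]
```

Each added condition can only shrink the result set, so a filter list that is too specific may match no documents at all.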

Filter Examples

Exact Match (EQ)

    # Filter for documents with category = "programming"
    # (named to avoid shadowing Python's built-in `filter`)
    programming_filter = MetadataFilter(
        key="category", value="programming", operator=FilterOperator.EQ
    )

Not Equal (NE)

    # Filter for documents NOT in the "ai" category
    # (named to avoid shadowing Python's built-in `filter`)
    not_ai_filter = MetadataFilter(
        key="category", value="ai", operator=FilterOperator.NE
    )

In List (IN)

    # Filter for documents whose category is any of the listed values
    # (named to avoid shadowing Python's built-in `filter`)
    multi_category_filter = MetadataFilter(
        key="category",
        value=["programming", "database"],
        operator=FilterOperator.IN,
    )

Combining Multiple Filters

    # Combine multiple conditions
    filters = MetadataFilters(
        filters=[
            MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ),
            MetadataFilter(key="difficulty", value="advanced", operator=FilterOperator.EQ),
            MetadataFilter(key="field", value="deep_learning", operator=FilterOperator.EQ),
        ]
    )

    # Create query engine with combined filters
    query_engine = index.as_query_engine(filters=filters)

Using Filters with Custom Retrievers

Apply filters when creating a custom retriever:

    from llama_index.core.retrievers import VectorIndexRetriever

    # Create a retriever with custom parameters
    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=3,  # Return top 3 most similar results
        filters=ai_filters,  # Apply metadata filters
    )

    # Retrieve nodes for a query
    nodes = retriever.retrieve("What is deep learning?")
    print(f"Retrieved {len(nodes)} nodes for query: 'What is deep learning?'")
    for i, node in enumerate(nodes):
        print(f"\nNode {i+1}:")
        print(f"Text: {node.node.text}")
        print(f"Metadata: {node.node.metadata}")
        print(f"Score: {node.score:.4f}")

Example Output:

    Retrieved 2 nodes for query: 'What is deep learning?'

    Node 1:
    Text: Deep learning is part of a broader family of machine learning methods...
    Metadata: {'category': 'ai', 'field': 'deep_learning', 'difficulty': 'advanced'}
    Score: 0.8934

    Node 2:
    Text: Machine learning is a subset of artificial intelligence...
    Metadata: {'category': 'ai', 'field': 'machine_learning', 'difficulty': 'advanced'}
    Score: 0.7821
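
The example output lists two nodes even though `similarity_top_k=3`. One plausible reading is that only two indexed documents satisfy the filter, so the top-k cut has fewer than three candidates to choose from. The sketch below illustrates that interaction in plain Python, assuming filtering happens before ranking; whether a given vector store filters before or after the similarity search is backend-specific. The `filtered_top_k` helper, the third candidate, and its score are hypothetical.

```python
def filtered_top_k(candidates, key, value, k):
    """Keep candidates whose metadata[key] == value, then return the k best by score."""
    kept = [c for c in candidates if c["metadata"].get(key) == value]
    return sorted(kept, key=lambda c: c["score"], reverse=True)[:k]


# Hypothetical scored candidates; the first two mirror the example output above.
candidates = [
    {"score": 0.8934, "metadata": {"category": "ai", "field": "deep_learning"}},
    {"score": 0.7821, "metadata": {"category": "ai", "field": "machine_learning"}},
    {"score": 0.8105, "metadata": {"category": "programming"}},
]

top = filtered_top_k(candidates, key="category", value="ai", k=3)
print(len(top))                   # 2
print([c["score"] for c in top])  # [0.8934, 0.7821]
```

The programming document is excluded despite scoring higher than one of the AI documents, which is the intended effect of a metadata filter: it constrains eligibility, not ranking.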
