Metadata Filtering
Filter search results based on document metadata for more precise and relevant results.
Basic Metadata Filter
Use filters to restrict search results based on metadata:
# Search with a filter
query = "Tell me about programming languages"
filter_array = [{"category": {"$eq": "programming"}}]
filtered_results = vector_store.similarity_search(
query=query,
k=3,
filter=filter_array
)
print(f"Query: '{query}' with filter: {filter_array}")
print(f"\nFound {len(filtered_results)} filtered results:")
for i, doc in enumerate(filtered_results):
    print(f"\nResult {i+1}:")
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")
Filter Operators
| Operator | Description | Example |
|---|---|---|
| $eq | Equal to | {"category": {"$eq": "programming"}} |
| $in | In list | {"tags": {"$in": ["ai", "ml"]}} |
| $range | Numeric range | {"score": {"$range": [70, 95]}} |
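The clause shapes in the table can be built with small helper functions, which helps avoid typos in operator names. This is a minimal sketch: the helper names `eq`, `in_`, and `range_` are hypothetical and not part of the library. They only produce the dictionaries shown in the Example column.

```python
from typing import Any


def eq(field: str, value: Any) -> dict:
    """Build an equality clause, e.g. {"category": {"$eq": "programming"}}."""
    return {field: {"$eq": value}}


def in_(field: str, values: list) -> dict:
    """Build a membership clause, e.g. {"tags": {"$in": ["ai", "ml"]}}."""
    return {field: {"$in": list(values)}}


def range_(field: str, low: int, high: int) -> dict:
    """Build a numeric range clause, e.g. {"score": {"$range": [70, 95]}}."""
    if low > high:
        raise ValueError("low must not be greater than high")
    return {field: {"$range": [low, high]}}


# Hypothetical usage: the resulting list has the same shape as filter_array
# in the examples and could be passed as the filter argument.
filter_array = [
    eq("category", "programming"),
    in_("tags", ["ai", "ml"]),
    range_("score", 70, 95),
]
print(filter_array)
```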
Multiple Metadata Filters
Combine multiple filters for more specific results:
# Search with multiple filters
query = "Tell me about AI"
filter_array = [{"category": {"$eq": "ai"}}, {"difficulty": {"$eq": "advanced"}}]
multi_filtered_results = vector_store.similarity_search(
query=query,
k=3,
filter=filter_array
)
print(f"Query: '{query}' with filter: {filter_array}")
print(f"\nFound {len(multi_filtered_results)} filtered results:")
for i, doc in enumerate(multi_filtered_results):
    print(f"\nResult {i+1}:")
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")
Note: Multiple filters are combined with AND logic: documents must match all conditions.
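To illustrate the AND semantics, the sketch below evaluates a filter list against plain metadata dictionaries on the client side. It is not how the vector store implements filtering. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: `$range` bounds are treated as inclusive, and `$in` on a list-valued field matches if any element is listed.

```python
from typing import Any


def clause_matches(metadata: dict[str, Any], clause: dict[str, dict[str, Any]]) -> bool:
    """Evaluate one {field: {operator: operand}} clause against a metadata dict."""
    for field, condition in clause.items():
        if field not in metadata:
            return False
        value = metadata[field]
        for operator, operand in condition.items():
            if operator == "$eq":
                ok = value == operand
            elif operator == "$in":
                # Assumption: a list-valued field matches if any element is listed.
                values = value if isinstance(value, list) else [value]
                ok = any(v in operand for v in values)
            elif operator == "$range":
                # Assumption: both bounds are inclusive.
                low, high = operand
                ok = low <= value <= high
            else:
                raise ValueError(f"Unsupported operator: {operator!r}")
            if not ok:
                return False
    return True


def matches_all(metadata: dict[str, Any], filter_array: list) -> bool:
    """AND logic: every clause in the list must match."""
    return all(clause_matches(metadata, clause) for clause in filter_array)


docs = [
    {"category": "ai", "difficulty": "advanced"},
    {"category": "ai", "difficulty": "beginner"},
    {"category": "database", "difficulty": "advanced"},
]
filter_array = [{"category": {"$eq": "ai"}}, {"difficulty": {"$eq": "advanced"}}]

# Only the first document satisfies both conditions.
kept = [m for m in docs if matches_all(m, filter_array)]
print(kept)
```

A document that matches only one of the two clauses is excluded, which is why adding clauses can only narrow the result set.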
Filter Examples
Exact Match
# Filter for programming category only
filter_array = [{"category": {"$eq": "programming"}}]
results = vector_store.similarity_search(
query="What languages are popular?",
k=3,
filter=filter_array
)
Multiple Conditions
# Filter for AI category AND advanced difficulty
filter_array = [
{"category": {"$eq": "ai"}},
{"difficulty": {"$eq": "advanced"}}
]
results = vector_store.similarity_search(
query="Tell me about neural networks",
k=3,
filter=filter_array
)
Using $in Operator
# Filter for documents in multiple categories
filter_array = [{"category": {"$in": ["programming", "database"]}}]
results = vector_store.similarity_search(
query="What tools do developers use?",
k=5,
filter=filter_array
)
Combining Different Operators
# Complex filter with multiple operators
filter_array = [
{"category": {"$eq": "database"}},
{"type": {"$eq": "vector"}},
{"difficulty": {"$in": ["beginner", "intermediate"]}}
]
results = vector_store.similarity_search(
query="What are vector databases?",
k=3,
filter=filter_array
)
Filtered Search with Scores
Combine filtering with similarity scores:
# Filtered search with scores
results_with_scores = vector_store.similarity_search_with_score(
query="Tell me about databases",
k=3,
filter=[{"category": {"$eq": "database"}}]
)
for doc, score in results_with_scores:
    print(f"Score: {score:.4f} - {doc.page_content}")
Important Notes
- Filter operators are case-sensitive and must be prefixed with $
- Filters operate on fields provided in the metadatas during document insertion
- The $range operator supports values only within [0, 999]; normalize larger values before inserting
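The notes above can be checked before a query is sent. The following is a minimal sketch, not part of the library: `validate_filter` and `normalize_to_range` are hypothetical helpers, the allowed set contains only the three operators listed in the operators table, and the linear scaling is one possible normalization, assuming the minimum and maximum of the original values are known.

```python
ALLOWED_OPERATORS = {"$eq", "$in", "$range"}  # operators listed in this guide
RANGE_MIN, RANGE_MAX = 0, 999  # supported bounds for $range


def normalize_to_range(value: float, source_min: float, source_max: float) -> int:
    """Linearly scale value from [source_min, source_max] into [0, 999].

    Values outside the source interval are clamped to its ends.
    """
    if source_max <= source_min:
        raise ValueError("source_max must be greater than source_min")
    clamped = min(max(value, source_min), source_max)
    fraction = (clamped - source_min) / (source_max - source_min)
    return int(fraction * RANGE_MAX)


def validate_filter(filter_array: list) -> None:
    """Raise ValueError if a clause breaks one of the rules in the notes."""
    for clause in filter_array:
        for field, condition in clause.items():
            for operator, operand in condition.items():
                if not operator.startswith("$"):
                    raise ValueError(f"Operator {operator!r} must start with '$'")
                if operator not in ALLOWED_OPERATORS:
                    # Case-sensitive check: "$EQ" is rejected.
                    raise ValueError(f"Unknown operator {operator!r} on {field!r}")
                if operator == "$range":
                    low, high = operand
                    if not (RANGE_MIN <= low <= high <= RANGE_MAX):
                        raise ValueError(
                            f"$range bounds for {field!r} must lie within "
                            f"[{RANGE_MIN}, {RANGE_MAX}]"
                        )


# Hypothetical usage: a price between 0 and 100000 is scaled before insertion,
# and the query bounds are scaled the same way.
low = normalize_to_range(20_000, 0, 100_000)
high = normalize_to_range(80_000, 0, 100_000)
filter_array = [{"price": {"$range": [low, high]}}]
validate_filter(filter_array)  # passes silently
```

The same scaling has to be applied to the stored metadata values and to the query bounds, otherwise the range comparison is meaningless.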