Filtering
Use filters to restrict search results to vectors that match specific conditions. All filter conditions are combined with logical AND — a vector must satisfy every condition to be returned.
Operators
| Operator | Description | Example |
|---|---|---|
| $eq | Exact match | {"status": {"$eq": "published"}} |
| $in | Match any value in a list | {"tags": {"$in": ["ai", "ml"]}} |
| $range | Numeric range (inclusive on both ends) | {"score": {"$range": [70, 95]}} |
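The operator semantics in the table can be sketched as a local predicate. This is only an illustration of how the three operators and the AND rule combine, not the engine's implementation; the function name `matches` is hypothetical, and it assumes each metadata field holds a single scalar value.

```python
def matches(metadata, conditions):
    """Return True if `metadata` satisfies every condition (logical AND)."""
    for condition in conditions:
        for field, spec in condition.items():
            value = metadata.get(field)
            for op, operand in spec.items():
                if op == "$eq":
                    ok = value == operand
                elif op == "$in":
                    ok = value in operand
                elif op == "$range":
                    low, high = operand
                    # Inclusive on both ends; non-numeric or missing values never match.
                    ok = isinstance(value, (int, float)) and low <= value <= high
                else:
                    # Operators are case-sensitive, so "$EQ" ends up here.
                    raise ValueError(f"unknown operator: {op!r}")
                if not ok:
                    return False
    return True


doc = {"category": "tech", "score": 80}
print(matches(doc, [{"category": {"$eq": "tech"}}, {"score": {"$range": [80, 100]}}]))
```

A predicate like this can be handy in unit tests for checking that a filter expresses what you intend before sending it with a query.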
Python
# $eq: exact match
results = index.query(
    vector=[...], top_k=5,
    filter=[{"category": {"$eq": "tech"}}]
)
# $in: match any value in a list
results = index.query(
    vector=[...], top_k=5,
    filter=[{"tags": {"$in": ["ai", "ml", "nlp"]}}]
)
# $range: numeric range (inclusive, values 0–999)
results = index.query(
    vector=[...], top_k=5,
    filter=[{"score": {"$range": [80, 100]}}]
)
# combined: all conditions ANDed
results = index.query(
    vector=[...], top_k=5,
    filter=[
        {"category": {"$eq": "tech"}},
        {"score": {"$range": [80, 100]}}
    ]
)
Notes:
- Operators are case-sensitive
- The $range operator only supports values within [0, 999]; normalize larger values before upserting
- Multiple conditions are ANDed together
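One possible way to normalize larger values before upserting is a linear rescale into the supported interval. The sketch below is an assumption about how you might do this on the client side; the helper name `normalize_for_range` and the choice to clamp and round to an integer are illustrative, not part of the API.

```python
RANGE_MAX = 999  # upper bound supported by $range; the lower bound is 0


def normalize_for_range(value, source_min, source_max):
    """Linearly map `value` from [source_min, source_max] into 0..999."""
    if source_max <= source_min:
        raise ValueError("source_max must be greater than source_min")
    # Clamp first so out-of-range inputs still land inside 0..999.
    clamped = min(max(value, source_min), source_max)
    fraction = (clamped - source_min) / (source_max - source_min)
    return round(fraction * RANGE_MAX)


# Example: a price between 0 and 100,000 stored as a filterable field.
price_bucket = normalize_for_range(25_000, 0, 100_000)
print(price_bucket)
```

Query bounds need the same mapping as stored values, so apply the helper to both ends of the `$range` pair as well. Rescaling loses resolution: distinct source values can share a bucket.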
Filter Tuning
When using filtered queries, two optional parameters let you tune the trade-off between search speed and recall.
Prefilter Cardinality Threshold
Controls when the search strategy switches from HNSW filtered search to brute-force prefiltering.
When very few vectors match your filter, HNSW may struggle to find enough valid candidates through graph traversal. In that case, scanning the matched subset directly (prefiltering) is faster and more accurate.
- Default: 10,000
- Valid range: 1,000–1,000,000
- Raising the threshold → prefiltering kicks in more often (favors exhaustive scan)
- Lowering the threshold → HNSW graph search is used more (favors speed on large datasets)
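The effect of the threshold can be pictured with a small decision function. This is a conceptual sketch only: the name `choose_strategy` is hypothetical, and whether the engine compares with "less than" or "less than or equal" is an assumption.

```python
DEFAULT_THRESHOLD = 10_000
MIN_THRESHOLD, MAX_THRESHOLD = 1_000, 1_000_000


def choose_strategy(matching_vectors, threshold=DEFAULT_THRESHOLD):
    """Pick a search strategy from the number of vectors that pass the filter."""
    if not MIN_THRESHOLD <= threshold <= MAX_THRESHOLD:
        raise ValueError("threshold must be within 1,000 and 1,000,000")
    # Assumption: prefiltering is chosen strictly below the threshold.
    if matching_vectors < threshold:
        return "prefilter"
    return "hnsw_filtered"


print(choose_strategy(500))                        # few matches
print(choose_strategy(50_000))                     # many matches, default threshold
print(choose_strategy(50_000, threshold=100_000))  # same filter, raised threshold
```

The third call shows the first bullet in action: raising the threshold moves a filter that matches 50,000 vectors from graph search to prefiltering.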
Filter Boost Percentage
When using HNSW filtered search, candidates explored during graph traversal that fail the filter are discarded — this can leave you with fewer results than top_k. This parameter expands the internal candidate pool before filtering is applied to compensate.
- Default: 0 (no boost)
- Maximum: 100 (doubles the candidate pool)
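As a rough illustration of the arithmetic, the boost can be read as a percentage added on top of the base candidate pool. The function below is a sketch under that reading; `base_pool` is a hypothetical stand-in for whatever internal pool size the engine uses, and rounding up is an assumption.

```python
def boosted_pool_size(base_pool, boost_percentage=0):
    """Return the candidate pool size after applying the boost percentage."""
    if not 0 <= boost_percentage <= 100:
        raise ValueError("boost_percentage must be between 0 and 100")
    # Integer ceiling of base_pool * (1 + boost_percentage / 100).
    return (base_pool * (100 + boost_percentage) + 99) // 100


for boost in (0, 25, 100):
    print(boost, boosted_pool_size(200, boost))
```

With a boost of 100 the pool is twice the base size, matching the "doubles the candidate pool" note above.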
Python
results = index.query(
    vector=[...],
    top_k=10,
    filter=[{"category": {"$eq": "rare"}}],
    prefilter_cardinality_threshold=5000,
    filter_boost_percentage=25,
)
Start with the defaults. If filtered queries return fewer results than expected, increase filter_boost_percentage. If filtered queries are slow on selective filters, lower the cardinality threshold.
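The "start with the defaults, then increase the boost" advice can be automated with a small retry loop. This is a sketch, not a built-in feature: the helper name and the boost schedule are hypothetical, and it assumes the object returned by `index.query` supports `len()`.

```python
def query_with_escalating_boost(index, vector, top_k, filter, boosts=(0, 25, 50, 100)):
    """Retry a filtered query with larger boosts until top_k results come back.

    Returns (results, boost_used). Assumes query results support len().
    """
    results, used = [], None
    for boost in boosts:
        used = boost
        results = index.query(
            vector=vector,
            top_k=top_k,
            filter=filter,
            filter_boost_percentage=boost,
        )
        if len(results) >= top_k:
            break  # enough results; stop escalating
    return results, used
```

Each retry is a full extra query, so a loop like this suits offline tuning better than a latency-sensitive path. If the filter matches fewer than top_k vectors in total, no boost will fill the result list.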
Updating Filters
You can update the filter fields of existing vectors by providing their IDs and a new filter object.
| Parameter | Required | Description |
|---|---|---|
| ID | Yes | ID of the vector to update |
| Filter | Yes | New filter object — replaces the existing filters entirely |
Python
index.update_filters([
    {"id": "doc1", "filter": {"category": "science", "year": 2024}},
    {"id": "doc2", "filter": {"category": "tech"}},
])
Filter updates are destructive replacements. Any filter keys not included in the new filter object will be removed from the vector. There is no partial-merge option.
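Because updates replace the whole filter object, a partial change has to be merged on the client before it is sent. The helper below is a minimal sketch of that merge; its name is hypothetical, and it assumes you keep the vector's current filter values in your own records, since this section does not describe a way to read them back.

```python
def build_filter_update(vector_id, current_filter, changes, remove=()):
    """Build a full replacement entry from the current filter plus changes."""
    merged = {**current_filter, **changes}  # changed keys override current ones
    for key in remove:
        merged.pop(key, None)  # drop keys explicitly, ignoring missing ones
    return {"id": vector_id, "filter": merged}


current = {"category": "science", "year": 2024}
entry = build_filter_update("doc1", current, {"year": 2025})
print(entry)
```

The resulting entry has the same shape as the items passed to `index.update_filters([...])` above, so untouched keys such as `category` survive the replacement.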