Python SDK Usage

This guide covers the core operations for working with vectors in Endee: upserting, querying, and deleting data.

Setting Up Your Domain

By default the Endee client connects on port 8080. To point it at a custom domain or port, set the base URL:


from endee import Endee
 
# Initialize the client (this example passes no API token)
client = Endee()
 
# Set custom base URL
client.set_base_url('http://0.0.0.0:8081/api/v1')

Call set_base_url() whenever the server runs on a custom domain or a non-default port.
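
When the host and port come from configuration, it can help to assemble the base URL in one place. The sketch below is a minimal illustration: `build_base_url` is a hypothetical helper, not part of the SDK, and it assumes the `/api/v1` suffix from the example URL above is always required.

```python
def build_base_url(host: str, port: int = 8080, scheme: str = "http") -> str:
    """Assemble a base URL shaped like the one in the example above.

    Hypothetical helper, not part of the endee package. The "/api/v1"
    suffix is copied from the example URL and assumed to be required.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"{scheme}://{host}:{port}/api/v1"


# The result would then be passed to client.set_base_url(...)
url = build_base_url("localhost", 8081)
print(url)
```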

Upserting Vectors

The index.upsert() method adds or updates vectors in an existing index.


from endee import Endee
 
client = Endee()
index = client.get_index(name="my_index")
 
index.upsert([
    {
        "id": "vec1",
        "vector": [...],
        "meta": {"title": "First document"},
        "filter": {"category": "tech"}
    },
    {
        "id": "vec2",
        "vector": [...],
        "meta": {"title": "Second document"},
        "filter": {"category": "science"}
    }
])

Vector Object Fields:

Field	Required	Description
`id`	Yes	Unique identifier for the vector
`vector`	Yes	Array of floats representing the embedding
`meta`	No	Arbitrary metadata object
`filter`	No	Key-value pairs for filtering during queries
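
Checking records on the client before calling upsert can surface malformed input early. The following is a sketch under assumptions: `validate_record` is a hypothetical helper, it assumes `id` is a non-empty string and that the vector length must equal the index dimension, and the server may enforce different or stricter rules.

```python
from numbers import Real


def validate_record(record: dict, dimension: int) -> None:
    """Raise ValueError if a record does not fit the field table above.

    Hypothetical client-side check, not part of the endee package.
    """
    vec_id = record.get("id")
    if not isinstance(vec_id, str) or not vec_id:
        raise ValueError("'id' is required and assumed to be a non-empty string")

    vector = record.get("vector")
    if not isinstance(vector, (list, tuple)) or len(vector) != dimension:
        raise ValueError(f"'vector' must contain exactly {dimension} numbers")
    # bool is a subclass of int, so exclude it explicitly
    if not all(isinstance(x, Real) and not isinstance(x, bool) for x in vector):
        raise ValueError("'vector' must contain only numbers")

    # 'meta' and 'filter' are optional, but must be dicts when present
    for optional in ("meta", "filter"):
        if optional in record and not isinstance(record[optional], dict):
            raise ValueError(f"'{optional}' must be a dict when present")
```

Running each record through the check before `index.upsert(...)` keeps one bad record from hiding inside a larger batch.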

Querying the Index

The index.query() method performs a similarity search using a query vector.


results = index.query(
    vector=[...],
    top_k=5,
    ef=128,
    include_vectors=True
)
 
for item in results:
    print(f"ID: {item['id']}, Similarity: {item['similarity']}")

Query Parameters:

Parameter	Description
`vector`	Query vector (must match index dimension)
`top_k`	Number of results to return (default: 10, max: 512)
`ef`	Search quality parameter (default: 128, max: 1024)
`include_vectors`	Include vector data in results (default: False)
`prefilter_cardinality_threshold`	Controls when search switches from HNSW filtered search to brute-force prefiltering on the matched subset (default: 10,000, range: 1,000–1,000,000)
`filter_boost_percentage`	Expands the internal HNSW candidate pool by this percentage when a filter is active, compensating for filtered-out results (default: 0, range: 0–100)
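
The documented limits can be checked locally before a query is sent. This is a rough sketch, not SDK behavior: `check_query_params` is a hypothetical helper, the upper bounds and the threshold and boost ranges come from the table above, and the lower bound of 1 for `top_k` and `ef` is an assumption.

```python
def check_query_params(top_k=10, ef=128,
                       prefilter_cardinality_threshold=10_000,
                       filter_boost_percentage=0):
    """Return a list of problems; an empty list means all values are in range."""
    limits = {
        # name: (value, lowest allowed, highest allowed)
        "top_k": (top_k, 1, 512),          # lower bound of 1 is assumed
        "ef": (ef, 1, 1024),               # lower bound of 1 is assumed
        "prefilter_cardinality_threshold":
            (prefilter_cardinality_threshold, 1_000, 1_000_000),
        "filter_boost_percentage": (filter_boost_percentage, 0, 100),
    }
    return [
        f"{name}={value} is outside [{low}, {high}]"
        for name, (value, low, high) in limits.items()
        if not low <= value <= high
    ]


print(check_query_params(top_k=600))
```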

Filtered Querying

Use the filter parameter to restrict results to vectors whose filter fields satisfy the given conditions. All conditions are combined with logical AND.


results = index.query(
    vector=[...],
    top_k=5,
    filter=[
        {"category": {"$eq": "tech"}},
        {"score": {"$range": [80, 100]}}
    ]
)

Filtering Operators

Operator	Description	Example
`$eq`	Exact match	`{"status": {"$eq": "published"}}`
`$in`	Match any in list	`{"tags": {"$in": ["ai", "ml"]}}`
`$range`	Numeric range (inclusive)	`{"score": {"$range": [70, 95]}}`
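
To make the operator semantics concrete, the sketch below evaluates a list of conditions against one vector's filter fields in plain Python. It only illustrates the documented behavior (AND across conditions, inclusive `$range`) and is not the engine's implementation. `matches` is a hypothetical name, and treating a missing field as a non-match is an assumption.

```python
def matches(filter_fields: dict, conditions: list) -> bool:
    """Return True if filter_fields satisfies every condition (logical AND)."""
    for condition in conditions:
        for field, clause in condition.items():
            if field not in filter_fields:
                return False  # assumption: a missing field never matches
            value = filter_fields[field]
            for op, operand in clause.items():
                if op == "$eq":
                    ok = value == operand
                elif op == "$in":
                    ok = value in operand
                elif op == "$range":
                    low, high = operand
                    # inclusive on both ends, numeric values only
                    ok = isinstance(value, (int, float)) and low <= value <= high
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
    return True


fields = {"category": "tech", "score": 85}
print(matches(fields, [{"category": {"$eq": "tech"}},
                       {"score": {"$range": [80, 100]}}]))
```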

The $range operator supports values from 0 to 999 only. Normalize larger values into this range before upserting.
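
One possible way to normalize is a linear rescale with clamping. This is a sketch, not a prescribed method: `scale_to_range` is a hypothetical helper, and rounding to an integer is an assumption about what the filter field should hold. The known minimum and maximum of the raw values have to be supplied by the caller.

```python
def scale_to_range(value: float, lo: float, hi: float, out_max: int = 999) -> int:
    """Linearly map value from [lo, hi] onto the integers 0..out_max.

    Values outside [lo, hi] are clamped to the nearest bound first.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * out_max)


# Example: prices known to lie between 0 and 100,000
print(scale_to_range(25_000, 0, 100_000))
```

Query bounds passed to `$range` need the same mapping as the stored values, otherwise the comparison happens on different scales. The rescale is lossy, so nearby raw values can land on the same integer.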

Filter Tuning

When using filtered queries, two optional parameters let you tune the trade-off between search speed and recall.

Prefilter Cardinality Threshold

Controls when the search strategy switches from HNSW filtered search (fast, graph-based) to brute-force prefiltering (exhaustive scan on the matched subset).

Value	Behavior
`1,000`	Prefilter only when the filter matches ≤ 1,000 vectors, i.e. only for very selective filters (minimum)
`10,000`	Prefilter only when the filter matches ≤ 10,000 vectors (default)
`1,000,000`	Prefilter whenever the filter matches ≤ 1,000,000 vectors, i.e. for almost all filtered searches (maximum)

When very few vectors match your filter, HNSW may struggle to find enough valid candidates through graph traversal. In that case, scanning the filtered subset directly (prefiltering) is faster and more accurate. Raising the threshold means prefiltering kicks in more often; lowering it favors HNSW graph search.
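
The switching rule described above can be written as a one-line decision. The function below only restates the documented behavior for illustration. It is not the engine's code, and `search_strategy` and its return labels are hypothetical names.

```python
def search_strategy(matched_count: int, threshold: int = 10_000) -> str:
    """Return which strategy the description above implies for a filter.

    matched_count is the number of vectors that satisfy the filter.
    """
    if not 1_000 <= threshold <= 1_000_000:
        raise ValueError("threshold must be within 1,000 to 1,000,000")
    # At or below the threshold: scan the matched subset exhaustively.
    # Above it: traverse the HNSW graph and filter candidates on the way.
    return "prefilter" if matched_count <= threshold else "hnsw_filtered"


print(search_strategy(4_000))                        # small match set
print(search_strategy(50_000))                       # large match set
print(search_strategy(50_000, threshold=1_000_000))  # raised threshold
```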


# Only prefilter when filter matches ≤5,000 vectors
results = index.query(
    vector=[...],
    top_k=10,
    filter=[{"category": {"$eq": "rare"}}],
    prefilter_cardinality_threshold=5000,
)

Filter Boost Percentage

When using HNSW filtered search, some candidates explored during graph traversal are discarded by the filter, which can leave you with fewer results than top_k. filter_boost_percentage compensates by expanding the internal candidate pool before filtering is applied.

0 → no boost, standard candidate pool size (default)
20 → fetch 20% more candidates internally before applying the filter
Maximum: 100 (doubles the candidate pool)
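
The arithmetic behind the boost is a simple percentage increase. The sketch below shows it for a given starting pool size. How the engine derives that starting size, and how it rounds, is not stated here, so `base_size` and the round-up behavior are assumptions, and `boosted_pool_size` is a hypothetical name.

```python
def boosted_pool_size(base_size: int, boost_percentage: int = 0) -> int:
    """Return base_size grown by boost_percentage percent, rounded up."""
    if not 0 <= boost_percentage <= 100:
        raise ValueError("boost_percentage must be within 0 to 100")
    # Ceiling division using integers only, avoiding float rounding issues
    extra = -((-base_size * boost_percentage) // 100)
    return base_size + extra


for pct in (0, 20, 100):
    print(pct, boosted_pool_size(128, pct))
```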


# Fetch 30% more candidates to compensate for aggressive filtering
results = index.query(
    vector=[...],
    top_k=10,
    filter=[{"visibility": {"$eq": "public"}}],
    filter_boost_percentage=30,
)

Using Both Together


results = index.query(
    vector=[...],
    top_k=10,
    filter=[{"category": {"$eq": "rare"}}],
    prefilter_cardinality_threshold=5000,  # switch to brute-force for small match sets
    filter_boost_percentage=25,             # boost candidates for HNSW filtered search
)

Start with the defaults (prefilter_cardinality_threshold=10000, filter_boost_percentage=0). If filtered queries return fewer results than expected, try increasing filter_boost_percentage. If filtered queries are slow on selective filters, try lowering prefilter_cardinality_threshold, staying within its valid range of 1,000–1,000,000.

Updating Filters

The index.update_filters() method replaces the filters of one or more vectors without modifying their vector data or metadata.


index = client.get_index(name="my_index")
 
index.update_filters([
    {"id": "vec1", "filter": {"category": "B", "tags": "updated"}},
    {"id": "vec2", "filter": {"category": "C", "priority": 1}},
    {"id": "vec3", "filter": {"visibility": "private"}}
])

Update Object Fields:

Field	Required	Description
`id`	Yes	Unique identifier of the vector to update
`filter`	Yes	New filter object that replaces the existing filters

Filter updates are destructive replacements. Any filter keys not included in the new filter object will be removed from the vector.
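
Because the new filter object replaces the old one entirely, changing a single key means sending the full set of keys again. A minimal sketch of that merge follows. `build_filter_update` is a hypothetical helper, and it assumes the caller already knows the vector's current filter, since how to read it back is not covered here.

```python
def build_filter_update(vec_id: str, current: dict, changes: dict) -> dict:
    """Return an update entry that keeps existing keys and applies changes."""
    merged = {**current, **changes}  # keys in changes win over current
    return {"id": vec_id, "filter": merged}


current = {"category": "A", "visibility": "public"}
update = build_filter_update("vec1", current, {"category": "B"})
print(update)
# The entry would then be passed in a list to index.update_filters([...])
```

Sending only `{"category": "B"}` instead would drop `visibility` from the vector.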

Hybrid Search

Hybrid indexes combine dense and sparse vector search. Create a hybrid index by specifying sparse_dim:


client.create_index(
    name="hybrid_index",
    dimension=384,
    sparse_dim=30000,
    space_type="cosine"
)

Upserting Hybrid Vectors

Provide a dense vector together with its sparse representation, given as sparse_indices and sparse_values:


index = client.get_index(name="hybrid_index")
 
index.upsert([
    {
        "id": "doc1",
        "vector": [0.1, 0.2, ...],          # Dense vector
        "sparse_indices": [10, 50, 200],     # Non-zero term positions
        "sparse_values": [0.8, 0.5, 0.3],    # Weights for each position
        "meta": {"title": "Document 1"}
    },
    {
        "id": "doc2",
        "vector": [0.3, 0.4, ...],
        "sparse_indices": [15, 100, 500],
        "sparse_values": [0.9, 0.4, 0.6],
        "meta": {"title": "Document 2"}
    }
])

Hybrid Vector Fields:

Field	Required	Description
`id`	Yes	Unique identifier
`vector`	Yes	Dense embedding vector
`sparse_indices`	Yes (hybrid)	Non-zero term positions in the sparse vector
`sparse_values`	Yes (hybrid)	Weights for each sparse index
`meta`	No	Metadata dictionary
`filter`	No	Filter fields

sparse_indices and sparse_values must have the same length. Values in sparse_indices must be within [0, sparse_dim).
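
Both constraints are cheap to verify before upserting. The check below is a sketch that covers only the two rules stated above. `validate_sparse` is a hypothetical helper, and the server may apply further rules that are not checked here.

```python
def validate_sparse(sparse_indices: list, sparse_values: list, sparse_dim: int) -> None:
    """Raise ValueError if the sparse pair breaks either documented rule."""
    if len(sparse_indices) != len(sparse_values):
        raise ValueError(
            f"length mismatch: {len(sparse_indices)} indices, "
            f"{len(sparse_values)} values"
        )
    for idx in sparse_indices:
        # valid positions are 0 up to and including sparse_dim - 1
        if not 0 <= idx < sparse_dim:
            raise ValueError(f"index {idx} is outside [0, {sparse_dim})")


validate_sparse([10, 50, 200], [0.8, 0.5, 0.3], sparse_dim=30000)
print("sparse vector is valid")
```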

Querying a Hybrid Index

Provide both dense and sparse query vectors:


results = index.query(
    vector=[0.15, 0.25, ...],           # Dense query
    sparse_indices=[10, 100, 300],       # Sparse query positions
    sparse_values=[0.7, 0.5, 0.4],       # Sparse query weights
    top_k=5
)
 
for item in results:
    print(f"ID: {item['id']}, Similarity: {item['similarity']}")
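
Sparse encoders often produce a mapping from term position to weight rather than two parallel lists. The sketch below converts such a mapping into the `sparse_indices` and `sparse_values` form used above. `to_sparse_lists` is a hypothetical helper, and dropping zero weights and sorting by index are assumptions, not documented requirements.

```python
def to_sparse_lists(weights: dict) -> tuple:
    """Convert {position: weight} into (sparse_indices, sparse_values)."""
    # Keep only non-zero weights and order the pairs by position
    items = sorted((idx, w) for idx, w in weights.items() if w != 0)
    indices = [idx for idx, _ in items]
    values = [w for _, w in items]
    return indices, values


indices, values = to_sparse_lists({300: 0.4, 10: 0.7, 100: 0.5, 42: 0.0})
print(indices, values)
# These would be passed as sparse_indices=indices, sparse_values=values
```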

Deletion Methods

Delete by ID


index.delete_vector("vec1")

Delete Index


client.delete_index("my_index")

Deletion operations are irreversible.

Additional Operations

Get Vector by ID


vector = index.get_vector("vec1")

Describe Index


info = index.describe()

API Reference

Endee Class

Method	Description
`create_index(...)`	Create a new index (add `sparse_dim` for hybrid)
`list_indexes()`	List all indexes
`delete_index(name)`	Delete an index
`get_index(name)`	Get reference to an index
`set_base_url(url)`	Set a custom base URL

Index Class

Method	Description
`upsert(vectors)`	Insert or update vectors
`query(...)`	Search for similar vectors
`update_filters(updates)`	Replace filters for one or more vectors
`delete_vector(id)`	Delete a vector by ID
`get_vector(id)`	Get a vector by ID
`describe()`	Get index info