Concepts

Understanding the core concepts of Endee Vector Database.

Indexes

Indexes are the vector stores where your data lives. Endee supports two types of indexes:

Dense Indexes: For semantic similarity search using vector embeddings
Hybrid Indexes: For combining dense vector search with sparse vector search

Both index types are optimized for fast Approximate Nearest Neighbor (ANN) searches using the HNSW algorithm.

Dense Indexes

Dense indexes enable semantic search — finding items based on meaning rather than exact keyword matches. Use dense indexes when you want to:

Find semantically similar documents
Power recommendation systems
Enable image/video similarity search

Hybrid Indexes

Hybrid indexes combine dense vector search with sparse vector search. To create a hybrid index, specify the sparse dimension parameter along with the dense dimension.

Use hybrid indexes for document retrieval in RAG pipelines where you need both semantic and keyword matching.

Index Parameters

When creating an index, you can configure the following parameters:

Parameter	Description	Default
Name	Unique name for your index	Required
Dimension	Dense vector dimensionality (max 10000)	Required
Space Type	Distance metric: cosine, L2, or inner product	Required
Sparse Dimension	Sparse vector dimension (enables hybrid search)	None
M	HNSW graph connectivity	16
EF Construction	HNSW construction parameter	128
Precision	Vector quantization precision	INT8
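The required fields, defaults, and limits in the table can be checked on the client before an index is created. The following is a minimal sketch in plain Python: `IndexConfig`, its field names, and the lowercase metric spellings are hypothetical and are not taken from any Endee SDK.

```python
from dataclasses import dataclass
from typing import Optional

MAX_DIMENSION = 10_000  # upper bound on dense dimensionality from the table
SPACE_TYPES = {"cosine", "l2", "ip"}  # assumed spellings of the three metrics


@dataclass
class IndexConfig:
    """Hypothetical container mirroring the index parameters table."""

    name: str
    dimension: int
    space_type: str
    sparse_dimension: Optional[int] = None  # None means a dense-only index
    m: int = 16                 # HNSW graph connectivity
    ef_construction: int = 128  # HNSW construction parameter

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("name is required")
        if not 1 <= self.dimension <= MAX_DIMENSION:
            raise ValueError(f"dimension must be in [1, {MAX_DIMENSION}]")
        if self.space_type not in SPACE_TYPES:
            raise ValueError(f"space_type must be one of {sorted(SPACE_TYPES)}")
        if self.sparse_dimension is not None and self.sparse_dimension < 1:
            raise ValueError("sparse_dimension must be positive when set")

    @property
    def is_hybrid(self) -> bool:
        # Setting a sparse dimension is what turns an index into a hybrid one.
        return self.sparse_dimension is not None
```

For example, `IndexConfig(name="docs", dimension=384, space_type="cosine")` describes a dense index with the default HNSW settings, and adding `sparse_dimension=30000` describes a hybrid one.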

Distance Metrics

Choose the appropriate distance metric based on your use case:

Metric	Description	Best For
Cosine	Measures angle between vectors	Text embeddings, normalized vectors
L2	Euclidean distance	Image embeddings, spatial data
Inner Product	Dot product similarity	Maximum inner product search
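To make the difference between the three metrics concrete, the sketch below computes each one with the Python standard library. These are the textbook definitions; the function names are illustrative, and the code says nothing about how Endee evaluates the metrics internally.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between the vectors: ignores magnitude."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)
```

Scaling a vector changes its inner product and L2 distance to other vectors but leaves cosine similarity unchanged, which is why cosine suits embeddings whose magnitude carries no meaning.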

Precision Options

The precision parameter controls how vectors are stored:

Precision	Description	Storage	Speed	Accuracy
Binary	1-bit quantization	Smallest	Fastest	Lower
INT8	8-bit integer	Small	Fast	Good
INT16	16-bit integer (default)	Medium	Medium	Higher
FLOAT16	16-bit float	Medium	Medium	High
FLOAT32	32-bit float	Largest	Slower	Highest
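The storage column follows directly from the number of bits kept per vector component. As a rough sketch, the helper below estimates the raw size of one stored vector. It counts vector data only and ignores the HNSW graph, metadata, and any per-vector overhead, none of which are specified here; the lowercase precision keys are hypothetical names.

```python
# Bits stored per vector component for each precision option.
BITS_PER_COMPONENT = {
    "binary": 1,
    "int8": 8,
    "int16": 16,
    "float16": 16,
    "float32": 32,
}


def raw_bytes_per_vector(dimension: int, precision: str) -> int:
    """Estimate raw vector payload in bytes, rounded up to a whole byte."""
    if dimension < 1:
        raise ValueError("dimension must be positive")
    total_bits = BITS_PER_COMPONENT[precision] * dimension
    return (total_bits + 7) // 8
```

Under these assumptions, a 768-dimensional vector takes 3,072 bytes at FLOAT32, 768 bytes at INT8, and 96 bytes at Binary.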

Querying

Query Parameters

Parameter	Description	Default
Vector	Query vector (must match index dimension)	Required
Top K	Number of results to return (max 512)	10
EF	Search quality parameter (max 1024)	128
Include Vectors	Include vector data in results	False
Filter	Filter criteria (array of filter objects)	None
Prefilter Cardinality Threshold	Switch to brute-force prefiltering when the filter matches ≤ N vectors (range: 1,000–1,000,000)	10,000
Filter Boost Percentage	Expand the internal candidate pool by this percentage before applying the filter (max: 100)	0
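The limits in the table can be enforced before a query is sent. The sketch below is one way to do that in plain Python. The function `build_query` and the keys of the dictionary it returns are hypothetical and do not represent a confirmed request format.

```python
from typing import Any, Dict, List, Optional, Sequence

MAX_TOP_K = 512
MAX_EF = 1024


def build_query(
    vector: Sequence[float],
    index_dimension: int,
    top_k: int = 10,
    ef: int = 128,
    include_vectors: bool = False,
    filter: Optional[List[Dict[str, Any]]] = None,
    prefilter_cardinality_threshold: int = 10_000,
    filter_boost_percentage: int = 0,
) -> Dict[str, Any]:
    """Validate query parameters against the documented limits."""
    if len(vector) != index_dimension:
        raise ValueError("query vector must match the index dimension")
    if not 1 <= top_k <= MAX_TOP_K:
        raise ValueError(f"top_k must be in [1, {MAX_TOP_K}]")
    if not 1 <= ef <= MAX_EF:
        raise ValueError(f"ef must be in [1, {MAX_EF}]")
    if not 1_000 <= prefilter_cardinality_threshold <= 1_000_000:
        raise ValueError("prefilter threshold must be in [1000, 1000000]")
    if not 0 <= filter_boost_percentage <= 100:
        raise ValueError("filter boost percentage must be in [0, 100]")
    # Key names below are illustrative placeholders only.
    return {
        "vector": list(vector),
        "top_k": top_k,
        "ef": ef,
        "include_vectors": include_vectors,
        "filter": filter,
        "prefilter_cardinality_threshold": prefilter_cardinality_threshold,
        "filter_boost_percentage": filter_boost_percentage,
    }
```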

Hybrid Query Parameters

For hybrid indexes, include sparse query components:

Parameter	Description
Sparse Indices	Non-zero term positions in the sparse query vector
Sparse Values	Weights corresponding to each sparse index position

Query Modes

Hybrid indexes support three query modes:

Dense only: Provide only the dense query vector for semantic search
Sparse only: Provide only sparse indices and values for keyword-based search
Hybrid: Provide both dense and sparse components for combined results
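The mode is determined by which components a query carries. A small sketch of that decision follows; the function name and the returned mode labels are hypothetical. It also rejects a query that supplies only one half of the sparse pair, on the assumption that sparse indices and sparse values always travel together.

```python
from typing import Optional, Sequence


def query_mode(
    dense: Optional[Sequence[float]] = None,
    sparse_indices: Optional[Sequence[int]] = None,
    sparse_values: Optional[Sequence[float]] = None,
) -> str:
    """Return 'dense', 'sparse', or 'hybrid' based on the components given."""
    if (sparse_indices is None) != (sparse_values is None):
        raise ValueError("sparse indices and sparse values must be given together")
    has_dense = dense is not None
    has_sparse = sparse_indices is not None
    if has_dense and has_sparse:
        return "hybrid"
    if has_dense:
        return "dense"
    if has_sparse:
        return "sparse"
    raise ValueError("provide a dense vector, a sparse vector, or both")
```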

Result Fields

Query results include the following fields:

Field	Description
ID	Vector identifier
Similarity	Similarity score (higher is more similar)
Distance	Distance score (lower is more similar)
Metadata	Metadata dictionary attached to the vector
Norm	Vector norm value
Vector	Vector data (if include vectors is enabled)

Vectors

Vectors are the fundamental data units stored in an index.

Dense Vector Fields

Field	Required	Description
ID	Yes	Unique identifier for the vector
Vector	Yes	Dense embedding array
Metadata	No	Arbitrary metadata object for storing additional information
Filter	No	Key-value pairs for filtering during queries

Hybrid Vector Fields

For hybrid indexes, include sparse vector components alongside dense vectors:

Field	Required	Description
Sparse Indices	Yes	Non-zero term positions in the sparse vector
Sparse Values	Yes	Weights for each sparse index position

Sparse indices and sparse values must have the same length. Sparse index values must be within the range [0, sparse dimension).
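Both constraints are easy to verify before upserting. A minimal sketch, with a hypothetical function name:

```python
from typing import Sequence


def validate_sparse(
    indices: Sequence[int],
    values: Sequence[float],
    sparse_dimension: int,
) -> None:
    """Raise ValueError if the sparse vector violates either constraint."""
    if len(indices) != len(values):
        raise ValueError(
            f"length mismatch: {len(indices)} indices, {len(values)} values"
        )
    for index in indices:
        # Valid positions form the half-open range [0, sparse_dimension).
        if not 0 <= index < sparse_dimension:
            raise ValueError(
                f"index {index} is outside [0, {sparse_dimension})"
            )
```

Note that the upper bound is exclusive: with a sparse dimension of 30,000, the largest valid index is 29,999.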

Maximum batch size is 1000 vectors per upsert call.
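Larger datasets therefore need to be split into several upsert calls. The sketch below shows one way to chunk a list of vectors; `batched` is an illustrative helper, and the upsert call itself is left out because its signature depends on the SDK in use.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 1000  # documented per-call upsert limit


def batched(items: Sequence[T], batch_size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive slices of items, each at most batch_size long."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be in [1, {MAX_BATCH_SIZE}]")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])
```

Each yielded batch can then be passed to a single upsert call.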

Filtering

Use filters to restrict search results based on conditions. All filters are combined with logical AND.

Operators

Operator	Description	Example
Equals	Exact match	Match status equal to “published”
In	Match any value in a list	Match tags containing “ai” or “ml”
Range	Numeric range (inclusive)	Match score between 70 and 95

Notes:

Operators are case-sensitive in SDK implementations
The range operator supports values from 0 to 999; normalize larger values into that range before upserting
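The AND semantics and the three operators can be illustrated with a small local evaluator. This is only a sketch of the behavior described above, not Endee's implementation. The `$eq`, `$in`, and `$range` spellings and the list-of-objects layout are assumptions, so check the SDK documentation for the exact syntax. The `scale_to_range` helper shows one possible way to normalize a larger numeric field into 0 to 999.

```python
from typing import Any, Dict, List

RANGE_MAX = 999  # upper bound accepted by the range operator


def scale_to_range(value: float, low: float, high: float) -> int:
    """Linearly map value from [low, high] onto the integers 0..999."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clamped = min(max(value, low), high)
    return int((clamped - low) / (high - low) * RANGE_MAX)


def matches(fields: Dict[str, Any], conditions: List[Dict[str, Dict[str, Any]]]) -> bool:
    """Return True only if every condition holds (logical AND)."""
    for condition in conditions:
        for key, clause in condition.items():
            if key not in fields:
                return False
            value = fields[key]
            for operator, operand in clause.items():
                if operator == "$eq":        # exact match
                    ok = value == operand
                elif operator == "$in":      # value is one of the listed options
                    ok = value in operand
                elif operator == "$range":   # inclusive numeric range
                    low, high = operand
                    ok = low <= value <= high
                else:
                    raise ValueError(f"unknown operator: {operator}")
                if not ok:
                    return False
    return True
```

For instance, a price between 0 and 1,000,000 could be stored as `scale_to_range(price, 0, 1_000_000)` and queried with the same scaling applied to the range bounds.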

Filter Tuning

When using filtered queries, two optional parameters let you tune the trade-off between search speed and recall.

Prefilter Cardinality Threshold

Controls when the search strategy switches from HNSW filtered search to brute-force prefiltering. When very few vectors match your filter, HNSW may struggle to find enough valid candidates through graph traversal — in that case, scanning the filtered subset directly (prefiltering) is faster and more accurate.

Default: 10,000
Valid range: 1,000 – 1,000,000
Raising the threshold means prefiltering kicks in more often; lowering it favors HNSW graph search.
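The switch described above amounts to a single comparison. The sketch below models it for illustration only: the function name and the strategy labels are hypothetical, and how the server counts the vectors that match a filter is not specified here.

```python
def choose_strategy(matching_vectors: int, threshold: int = 10_000) -> str:
    """Pick brute-force prefiltering when the filter matches few vectors."""
    if not 1_000 <= threshold <= 1_000_000:
        raise ValueError("threshold must be in [1000, 1000000]")
    # Prefiltering applies when the filter matches at most `threshold` vectors.
    return "prefilter" if matching_vectors <= threshold else "hnsw_filtered"
```

With the default threshold, a filter matching 8,000 vectors is scanned directly, while one matching 50,000 goes through HNSW filtered search. Raising the threshold to 100,000 moves the second case to prefiltering as well.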

Filter Boost Percentage

When using HNSW filtered search, some candidates explored during graph traversal are discarded by the filter, which can leave you with fewer results than requested. This parameter expands the internal candidate pool before filtering is applied to compensate.

Default: 0 (no boost)
Maximum: 100 (doubles the candidate pool)
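The effect of the boost is simple percentage arithmetic. In the sketch below, `base_pool` is a hypothetical stand-in for the size of the internal candidate pool; which quantity the server actually uses as the base is not stated here.

```python
def boosted_pool_size(base_pool: int, boost_percentage: int = 0) -> int:
    """Grow the candidate pool by boost_percentage, rounding up."""
    if base_pool < 1:
        raise ValueError("base_pool must be positive")
    if not 0 <= boost_percentage <= 100:
        raise ValueError("boost_percentage must be in [0, 100]")
    # Integer ceiling of base_pool * (1 + boost_percentage / 100).
    return (base_pool * (100 + boost_percentage) + 99) // 100
```

A pool of 128 candidates stays at 128 with no boost, grows to 192 at 50 percent, and reaches 256 at the maximum of 100 percent.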

Start with the defaults. If filtered queries return fewer results than expected, increase the boost percentage. If filtered queries are slow on selective filters, lower the cardinality threshold.

Updating Filters

You can update the filters of existing vectors by providing their IDs and new filter objects. The update replaces the entire filter object; it does not merge with existing filters.

Parameter	Required	Description
ID	Yes	Unique identifier of the vector to update
Filter	Yes	New filter object that replaces the existing filters

Filter updates are destructive replacements. Any filter keys not included in the new filter object will be removed from the vector.
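The difference between replacing and merging is easiest to see with plain dictionaries. The sketch below models the documented behavior locally; `apply_filter_update` is a hypothetical name and is not an SDK call.

```python
from typing import Any, Dict


def apply_filter_update(stored: Dict[str, Any], new_filter: Dict[str, Any]) -> Dict[str, Any]:
    """Model a filter update: the new object replaces the stored one entirely."""
    return dict(new_filter)


stored = {"status": "draft", "score": 80}

# Sending only the changed key drops every other key.
replaced = apply_filter_update(stored, {"status": "published"})

# To keep existing keys, merge on the client and send the full object.
preserved = apply_filter_update(stored, {**stored, "status": "published"})
```

Here `replaced` contains only `status`, while `preserved` keeps `score` alongside the new `status`.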

Authentication

By default, Endee runs without authentication for easy local development. To secure your instance, set the NDD_AUTH_TOKEN environment variable when starting the server.

Enabling Authentication

In your docker-compose.yml:


environment:
  NDD_AUTH_TOKEN: "your-secret-token"

Using Authentication

When authentication is enabled, you must provide the same token in the following places:

1. Dashboard Access

Enter the token on the dashboard login screen to access the web interface at http://localhost:8080.

2. SDK Clients

Pass the token when initializing the client. See the respective SDK’s documentation for details.

3. REST API

Include the token in the Authorization header:


curl -X GET http://localhost:8080/api/v1/index/list \
  -H "Authorization: Bearer your-secret-token"

All API requests will fail if the token doesn’t match the server’s NDD_AUTH_TOKEN. Keep your token secure and never expose it in client-side code.
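The curl request above can also be issued from Python using only the standard library. The sketch below builds the same GET request with the Bearer header. The helper names are illustrative, and because the response format is not described here, the body is returned as raw bytes rather than parsed.

```python
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8080"


def build_request(path: str, token: Optional[str] = None) -> urllib.request.Request:
    """Create a GET request, adding the Authorization header when a token is set."""
    request = urllib.request.Request(f"{BASE_URL}{path}", method="GET")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


def list_indexes(token: Optional[str] = None) -> bytes:
    """Call the index list endpoint and return the raw response body."""
    request = build_request("/api/v1/index/list", token)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

In server-side code, read the token from an environment variable or a secrets store rather than hard-coding it.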