Concepts
Understanding the core concepts of Endee Vector Database.
Indexes
Indexes are the vector stores where your data lives. Endee supports two types of indexes:
- Dense Indexes: For semantic similarity search using vector embeddings
- Hybrid Indexes: Combine dense vectors with sparse vector search
Both index types are optimized for fast Approximate Nearest Neighbor (ANN) searches using the HNSW algorithm.
Dense Indexes
Dense indexes enable semantic search — finding items based on meaning rather than exact keyword matches. Use dense indexes when you want to:
- Find semantically similar documents
- Power recommendation systems
- Enable image/video similarity search
Hybrid Indexes
Hybrid indexes combine dense vector search with sparse vector search. To create a hybrid index, specify the sparse dimension parameter along with the dense dimension.
Use hybrid indexes for document retrieval in RAG pipelines where you need both semantic and keyword matching.
Index Parameters
When creating an index, you can configure the following parameters:
| Parameter | Description | Default |
|---|---|---|
| Name | Unique name for your index | Required |
| Dimension | Dense vector dimensionality (max 10000) | Required |
| Space Type | Distance metric: cosine, L2, or inner product | Required |
| Sparse Dimension | Sparse vector dimension (enables hybrid search) | None |
| M | Maximum number of connections per node in the HNSW graph | 16 |
| EF Construction | Candidate list size used while building the HNSW graph | 128 |
| Precision | Vector quantization precision | INT8 |
Distance Metrics
Choose the appropriate distance metric based on your use case:
| Metric | Description | Best For |
|---|---|---|
| Cosine | Measures angle between vectors | Text embeddings, normalized vectors |
| L2 | Euclidean distance | Image embeddings, spatial data |
| Inner Product | Dot product similarity | Maximum inner product search |
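To make the differences between the three metrics concrete, here is a minimal pure-Python sketch of the standard formulas. The function names are illustrative and are not part of any Endee SDK.

```python
import math

def inner_product(a, b):
    """Dot product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores vector length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

query = [1.0, 0.0]
same_direction = [3.0, 0.0]  # same direction as the query, larger magnitude
orthogonal = [0.0, 1.0]

print(cosine_similarity(query, same_direction))  # 1.0
print(l2_distance(query, same_direction))        # 2.0
print(inner_product(query, same_direction))      # 3.0
print(cosine_similarity(query, orthogonal))      # 0.0
```

Cosine treats the two parallel vectors as identical, while L2 and inner product are both sensitive to magnitude. This is why cosine pairs well with normalized embeddings.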
Precision Options
The precision parameter controls how vectors are stored:
| Precision | Description | Storage | Speed | Accuracy |
|---|---|---|---|---|
| Binary | 1-bit quantization | Smallest | Fastest | Lower |
| INT8 | 8-bit integer | Small | Fast | Good |
| INT16 | 16-bit integer (default) | Medium | Medium | Higher |
| FLOAT16 | 16-bit float | Medium | Medium | High |
| FLOAT32 | 32-bit float | Largest | Slower | Highest |
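As a rough guide to the storage column, the sketch below estimates the raw size of one vector's components from the bit widths in the table. It assumes storage is simply dimension times bits per component and ignores graph and metadata overhead; the actual on-disk layout is not described here, and the string labels are illustrative.

```python
# Bits per dimension implied by the table above (labels are illustrative).
BITS_PER_DIMENSION = {
    "BINARY": 1,
    "INT8": 8,
    "INT16": 16,
    "FLOAT16": 16,
    "FLOAT32": 32,
}

def raw_vector_bytes(dimension, precision):
    """Bytes needed for one vector's components, rounded up to whole bytes."""
    bits = dimension * BITS_PER_DIMENSION[precision]
    return (bits + 7) // 8

for precision in BITS_PER_DIMENSION:
    print(precision, raw_vector_bytes(768, precision))
# BINARY 96
# INT8 768
# INT16 1536
# FLOAT16 1536
# FLOAT32 3072
```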
Querying
Query Parameters
| Parameter | Description | Default |
|---|---|---|
| Vector | Query vector (must match index dimension) | Required |
| Top K | Number of results to return (max 512) | 10 |
| EF | Candidate list size during search; higher values improve recall at the cost of latency (max 1024) | 128 |
| Include Vectors | Include vector data in results | False |
| Filter | Filter criteria (array of filter objects) | None |
| Prefilter Cardinality Threshold | Switch to brute-force prefiltering when the filter matches ≤ N vectors (range: 1,000–1,000,000) | 10,000 |
| Filter Boost Percentage | Expand the internal candidate pool by this percentage before applying the filter (max: 100) | 0 |
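The limits in this table can be checked on the client before a query is sent. The following is a sketch only: `build_query` and the dictionary keys are hypothetical names rather than an SDK API, and the lower bounds of 1 for Top K and EF and 0 for the boost percentage are assumptions.

```python
def build_query(vector, index_dimension, top_k=10, ef=128,
                include_vectors=False, filters=None,
                prefilter_cardinality_threshold=10_000,
                filter_boost_percentage=0):
    """Validate query settings against the documented limits.

    The returned keys are hypothetical; check your SDK for the real names.
    """
    if len(vector) != index_dimension:
        raise ValueError("query vector must match the index dimension")
    if not 1 <= top_k <= 512:
        raise ValueError("top_k must be between 1 and 512")
    if not 1 <= ef <= 1024:
        raise ValueError("ef must be between 1 and 1024")
    if not 1_000 <= prefilter_cardinality_threshold <= 1_000_000:
        raise ValueError("threshold must be between 1,000 and 1,000,000")
    if not 0 <= filter_boost_percentage <= 100:
        raise ValueError("filter boost percentage must be between 0 and 100")
    return {
        "vector": list(vector),
        "top_k": top_k,
        "ef": ef,
        "include_vectors": include_vectors,
        "filter": filters,
        "prefilter_cardinality_threshold": prefilter_cardinality_threshold,
        "filter_boost_percentage": filter_boost_percentage,
    }

query = build_query([0.1, 0.2, 0.3, 0.4], index_dimension=4, top_k=5)
print(query["top_k"], query["ef"])  # 5 128
```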
Hybrid Query Parameters
For hybrid indexes, include sparse query components:
| Parameter | Description |
|---|---|
| Sparse Indices | Non-zero term positions in the sparse query vector |
| Sparse Values | Weights corresponding to each sparse index position |
Query Modes
Hybrid indexes support three query modes:
- Dense only: Provide only the dense query vector for semantic search
- Sparse only: Provide only sparse indices and values for keyword-based search
- Hybrid: Provide both dense and sparse components for combined results
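The three modes follow from which query components are present. A small sketch of that decision, using a hypothetical helper name and assuming sparse indices and sparse values are always supplied together:

```python
def query_mode(vector=None, sparse_indices=None, sparse_values=None):
    """Return 'dense', 'sparse', or 'hybrid' based on the supplied components."""
    has_dense = vector is not None
    has_sparse = sparse_indices is not None or sparse_values is not None
    # Assumption: a sparse query needs both its indices and its values.
    if has_sparse and (sparse_indices is None or sparse_values is None):
        raise ValueError("sparse indices and sparse values must be provided together")
    if has_dense and has_sparse:
        return "hybrid"
    if has_dense:
        return "dense"
    if has_sparse:
        return "sparse"
    raise ValueError("provide a dense vector, a sparse vector, or both")

print(query_mode(vector=[0.1, 0.2]))                                   # dense
print(query_mode(sparse_indices=[3, 17], sparse_values=[0.5, 1.2]))    # sparse
print(query_mode([0.1, 0.2], [3, 17], [0.5, 1.2]))                     # hybrid
```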
Result Fields
Query results include the following fields:
| Field | Description |
|---|---|
| ID | Vector identifier |
| Similarity | Similarity score (higher is more similar) |
| Distance | Distance score (lower is more similar) |
| Metadata | Metadata dictionary attached to the vector |
| Norm | Vector norm value |
| Vector | Vector data (if include vectors is enabled) |
Vectors
Vectors are the fundamental data units stored in an index.
Dense Vector Fields
| Field | Required | Description |
|---|---|---|
| ID | Yes | Unique identifier for the vector |
| Vector | Yes | Dense embedding array |
| Metadata | No | Arbitrary metadata object for storing additional information |
| Filter | No | Key-value pairs for filtering during queries |
Hybrid Vector Fields
For hybrid indexes, include sparse vector components alongside dense vectors:
| Field | Required | Description |
|---|---|---|
| Sparse Indices | Yes | Non-zero term positions in the sparse vector |
| Sparse Values | Yes | Weights for each sparse index position |
Sparse indices and sparse values must have the same length. Sparse index values must be within the range [0, sparse dimension).
Maximum batch size is 1000 vectors per upsert call.
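Both constraints above are easy to enforce before calling the server. The helpers below are a client-side sketch with hypothetical names; the record layout is illustrative, and the actual upsert call is left to your SDK.

```python
MAX_BATCH_SIZE = 1000  # documented limit per upsert call

def validate_sparse(indices, values, sparse_dimension):
    """Check the two documented sparse-vector constraints."""
    if len(indices) != len(values):
        raise ValueError("sparse indices and values must have the same length")
    for i in indices:
        if not 0 <= i < sparse_dimension:
            raise ValueError(f"sparse index {i} is outside [0, {sparse_dimension})")

def batches(items, size=MAX_BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

validate_sparse([0, 42, 29999], [0.3, 1.1, 0.7], sparse_dimension=30000)

records = [{"id": f"doc-{n}"} for n in range(2500)]
print([len(batch) for batch in batches(records)])  # [1000, 1000, 500]
```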
Filtering
Use filters to restrict search results based on conditions. All filters are combined with logical AND.
Operators
| Operator | Description | Example |
|---|---|---|
| Equals | Exact match | Match status equal to “published” |
| In | Match any value in a list | Match tags containing “ai” or “ml” |
| Range | Numeric range (inclusive) | Match score between 70 and 95 |
Notes:
- Operators are case-sensitive in SDK implementations
- The range operator supports values from 0 to 999; normalize larger values before upserting
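The semantics of the three operators and the AND combination can be modeled locally. In this sketch the filter shape and the operator spellings `$eq`, `$in`, and `$range` are assumptions for illustration; consult your SDK's documentation for the exact syntax. The stored values are assumed to be scalars.

```python
def matches(stored, condition):
    """Evaluate one {operator: expected} condition against a stored value."""
    (op, expected), = condition.items()
    if op == "$eq":
        return stored == expected
    if op == "$in":
        return stored in expected
    if op == "$range":
        low, high = expected
        return low <= stored <= high  # both bounds are inclusive
    raise ValueError(f"unknown operator: {op}")  # operators are case-sensitive

def passes(vector_filter, filters):
    """True only if every clause matches (logical AND)."""
    return all(
        field in vector_filter and matches(vector_filter[field], condition)
        for clause in filters
        for field, condition in clause.items()
    )

filters = [
    {"status": {"$eq": "published"}},
    {"tag": {"$in": ["ai", "ml"]}},
    {"score": {"$range": [70, 95]}},
]

print(passes({"status": "published", "tag": "ai", "score": 95}, filters))  # True
print(passes({"status": "published", "tag": "ai", "score": 96}, filters))  # False
print(passes({"status": "draft", "tag": "ml", "score": 80}, filters))      # False
```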
Filter Tuning
When using filtered queries, two optional parameters let you tune the trade-off between search speed and recall.
Prefilter Cardinality Threshold
Controls when the search strategy switches from HNSW filtered search to brute-force prefiltering. When very few vectors match your filter, HNSW may struggle to find enough valid candidates through graph traversal — in that case, scanning the filtered subset directly (prefiltering) is faster and more accurate.
- Default: 10,000
- Valid range: 1,000–1,000,000
- Raising the threshold means prefiltering kicks in more often; lowering it favors HNSW graph search.
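The switch described above reduces to a single comparison. The sketch below models it with hypothetical strategy labels and assumes the engine knows how many vectors match the filter before it searches.

```python
def choose_strategy(matching_vectors, prefilter_cardinality_threshold=10_000):
    """Pick a search strategy from the number of vectors that match the filter."""
    if not 1_000 <= prefilter_cardinality_threshold <= 1_000_000:
        raise ValueError("threshold must be between 1,000 and 1,000,000")
    if matching_vectors <= prefilter_cardinality_threshold:
        return "prefilter"      # scan the filtered subset directly
    return "hnsw_filtered"      # traverse the graph and filter candidates

print(choose_strategy(5_000))            # prefilter
print(choose_strategy(50_000))           # hnsw_filtered
print(choose_strategy(50_000, 100_000))  # prefilter
```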
Filter Boost Percentage
When using HNSW filtered search, some candidates explored during graph traversal are discarded by the filter, which can leave you with fewer results than requested. This parameter expands the internal candidate pool before filtering is applied to compensate.
- Default: 0 (no boost)
- Maximum: 100 (doubles the candidate pool)
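The effect of the boost can be expressed as simple arithmetic. This sketch assumes the pool grows linearly with the percentage and rounds down; how the engine derives the base pool size and rounds the result is not specified here, so treat both as assumptions.

```python
def boosted_pool_size(base_pool_size, filter_boost_percentage=0):
    """Candidate pool size after applying the boost percentage."""
    if not 0 <= filter_boost_percentage <= 100:
        raise ValueError("filter boost percentage must be between 0 and 100")
    return base_pool_size * (100 + filter_boost_percentage) // 100

print(boosted_pool_size(128))       # 128 (no boost)
print(boosted_pool_size(128, 50))   # 192
print(boosted_pool_size(128, 100))  # 256 (doubled)
```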
Start with the defaults. If filtered queries return fewer results than expected, increase the boost percentage. If filtered queries are slow on selective filters, lower the cardinality threshold.
Updating Filters
You can update the filters of existing vectors by providing their IDs and new filter objects. Each update replaces the vector's entire filter object; it does not merge with the existing filters.
| Parameter | Required | Description |
|---|---|---|
| ID | Yes | Unique identifier of the vector to update |
| Filter | Yes | New filter object that replaces the existing filters |
Filter updates are destructive replacements. Any filter keys not included in the new filter object will be removed from the vector.
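The difference between replacing and merging is easy to see with plain dictionaries. This is a local model of the documented behavior, not an SDK call; `replace_filter` and the in-memory store are hypothetical.

```python
def replace_filter(store, vector_id, new_filter):
    """Model of a filter update: the new object fully replaces the old one."""
    store[vector_id] = dict(new_filter)

store = {"doc-1": {"status": "draft", "lang": "en"}}

replace_filter(store, "doc-1", {"status": "published"})
print(store["doc-1"])  # {'status': 'published'}  ('lang' has been removed)

# To keep existing keys, merge on the client and send the full object.
existing = {"status": "draft", "lang": "en"}
merged = {**existing, "status": "published"}
print(merged)  # {'status': 'published', 'lang': 'en'}
```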
Authentication
By default, Endee runs without authentication for easy local development. To secure your instance, set the NDD_AUTH_TOKEN environment variable when starting the server.
Enabling Authentication
In your docker-compose.yml:
environment:
  NDD_AUTH_TOKEN: "your-secret-token"
Using Authentication
When authentication is enabled, you must provide the same token in the following places:
1. Dashboard Access
Enter the token in the dashboard login to access the web interface at http://localhost:8080.
2. SDK Clients
Pass the token when initializing the client. See the respective SDK’s documentation for details.
3. REST API
Include the token in the Authorization header:
curl -X GET http://localhost:8080/api/v1/index/list \
  -H "Authorization: Bearer your-secret-token"
All API requests will fail if the token doesn’t match the server’s NDD_AUTH_TOKEN. Keep your token secure and never expose it in client-side code.
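The same request can be built from Python with only the standard library. This sketch constructs the request without sending it; the helper name is hypothetical, and reading the token from a client-side NDD_AUTH_TOKEN environment variable is a convenience assumption rather than a documented requirement.

```python
import os
import urllib.request

def build_list_indexes_request(token, base_url="http://localhost:8080"):
    """Build a GET request for the index list endpoint with a bearer token."""
    return urllib.request.Request(
        f"{base_url}/api/v1/index/list",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )

# Read the token from the environment instead of hard-coding it.
token = os.environ.get("NDD_AUTH_TOKEN", "your-secret-token")
request = build_list_indexes_request(token)
print(request.full_url)  # http://localhost:8080/api/v1/index/list

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

Keeping the token in an environment variable or secret store on the server side of your application avoids embedding it in source code.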