Concepts

Understanding the core concepts of Endee Vector Database.

Indexes

Indexes are the vector stores where your data lives. Endee supports two types of indexes:

  • Dense Indexes: For semantic similarity search using vector embeddings
  • Hybrid Indexes: Combine dense vectors with sparse vector search

Both index types are optimized for fast Approximate Nearest Neighbor (ANN) searches using the HNSW algorithm.

Dense Indexes

Dense indexes enable semantic search — finding items based on meaning rather than exact keyword matches. Use dense indexes when you want to:

  • Find semantically similar documents
  • Power recommendation systems
  • Enable image/video similarity search

Hybrid Indexes

Hybrid indexes combine dense vector search with sparse vector search. To create a hybrid index, specify the sparse dimension parameter along with the dense dimension.

Use hybrid indexes for document retrieval in RAG pipelines where you need both semantic and keyword matching.

Index Parameters

When creating an index, you can configure the following parameters:

| Parameter | Description | Default |
|---|---|---|
| Name | Unique name for your index | Required |
| Dimension | Dense vector dimensionality (max 10000) | Required |
| Space Type | Distance metric: cosine, L2, or inner product | Required |
| Sparse Dimension | Sparse vector dimension (enables hybrid search) | None |
| M | HNSW graph connectivity | 16 |
| EF Construction | HNSW construction parameter | 128 |
| Precision | Vector quantization precision | INT8 |
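The constraints in this table can be checked on the client before an index is created. The following is a minimal sketch, not the SDK's API: the `IndexConfig` class, its field names, and the metric spellings `"cosine"`, `"l2"`, and `"ip"` are hypothetical. Only the limits and defaults (max dimension 10000, M = 16, EF Construction = 128) come from the table.

```python
from dataclasses import dataclass
from typing import Optional

MAX_DIMENSION = 10_000                # documented maximum dense dimensionality
SPACE_TYPES = {"cosine", "l2", "ip"}  # hypothetical spellings of the three metrics


@dataclass
class IndexConfig:
    """Hypothetical container for the documented index parameters."""
    name: str
    dimension: int
    space_type: str
    sparse_dimension: Optional[int] = None  # setting this makes the index hybrid
    m: int = 16                             # documented default for M
    ef_construction: int = 128              # documented default for EF Construction

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("name is required")
        if not 1 <= self.dimension <= MAX_DIMENSION:
            raise ValueError(f"dimension must be in [1, {MAX_DIMENSION}]")
        if self.space_type not in SPACE_TYPES:
            raise ValueError(f"space_type must be one of {sorted(SPACE_TYPES)}")
        if self.sparse_dimension is not None and self.sparse_dimension < 1:
            raise ValueError("sparse_dimension must be positive when set")

    @property
    def is_hybrid(self) -> bool:
        # A sparse dimension alongside the dense one means a hybrid index.
        return self.sparse_dimension is not None
```

Leaving `sparse_dimension` unset describes a dense index; setting it describes a hybrid one.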

Distance Metrics

Choose the appropriate distance metric based on your use case:

| Metric | Description | Best For |
|---|---|---|
| Cosine | Measures angle between vectors | Text embeddings, normalized vectors |
| L2 | Euclidean distance | Image embeddings, spatial data |
| Inner Product | Dot product similarity | Maximum inner product search |
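To make the three metrics concrete, here is a small sketch of their textbook definitions in plain Python. It illustrates the math only and says nothing about how Endee computes scores internally; the function names are chosen for this example.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; ignores vector length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)
```

Two vectors that point in the same direction but differ in length have a cosine similarity of 1 and a non-zero L2 distance, which is why cosine suits embeddings where only direction carries meaning.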

Precision Options

The precision parameter controls how vectors are stored:

| Precision | Description | Storage | Speed | Accuracy |
|---|---|---|---|---|
| Binary | 1-bit quantization | Smallest | Fastest | Lower |
| INT8 | 8-bit integer | Small | Fast | Good |
| INT16 | 16-bit integer (default) | Medium | Medium | Higher |
| FLOAT16 | 16-bit float | Medium | Medium | High |
| FLOAT32 | 32-bit float | Largest | Slower | Highest |
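The storage and accuracy trade-off can be illustrated with a generic symmetric 8-bit scalar quantizer. This is a textbook sketch, offered on the assumption that it helps build intuition; the source does not describe Endee's actual quantization scheme, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize_int8(vector: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats to integer codes in [-127, 127] using one scale per vector."""
    max_abs = max((abs(x) for x in vector), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Clamp defensively so rounding can never leave the 8-bit range.
    codes = [max(-127, min(127, round(x / scale))) for x in vector]
    return codes, scale


def dequantize(codes: Sequence[int], scale: float) -> List[float]:
    """Recover approximate floats; the per-component error is about scale / 2."""
    return [c * scale for c in codes]
```

Each component now needs one byte instead of four, at the cost of a small reconstruction error bounded by the scale.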

Querying

Query Parameters

| Parameter | Description | Default |
|---|---|---|
| Vector | Query vector (must match index dimension) | Required |
| Top K | Number of results to return (max 512) | 10 |
| EF | Search quality parameter (max 1024) | 128 |
| Include Vectors | Include vector data in results | False |
| Filter | Filter criteria (array of filter objects) | None |
| Prefilter Cardinality Threshold | Switch to brute-force prefiltering when the filter matches ≤ N vectors (range: 1,000–1,000,000) | 10,000 |
| Filter Boost Percentage | Expand the internal candidate pool by this percentage before applying the filter (max: 100) | 0 |
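The limits in this table lend themselves to a client-side check before a query is sent. The sketch below is illustrative: `validate_query` and its keyword names are hypothetical, and the lower bound of 1 for Top K and EF is an assumption. The upper bounds, the range, and the defaults come from the table.

```python
from typing import Any, Dict, Sequence


def validate_query(
    vector: Sequence[float],
    index_dimension: int,
    top_k: int = 10,
    ef: int = 128,
    prefilter_cardinality_threshold: int = 10_000,
    filter_boost_percentage: int = 0,
) -> Dict[str, Any]:
    """Return the query parameters as a dict, or raise ValueError if out of range."""
    if len(vector) != index_dimension:
        raise ValueError("query vector must match the index dimension")
    if not 1 <= top_k <= 512:  # lower bound of 1 is an assumption
        raise ValueError("top_k must be in [1, 512]")
    if not 1 <= ef <= 1024:    # lower bound of 1 is an assumption
        raise ValueError("ef must be in [1, 1024]")
    if not 1_000 <= prefilter_cardinality_threshold <= 1_000_000:
        raise ValueError("prefilter_cardinality_threshold must be in [1000, 1000000]")
    if not 0 <= filter_boost_percentage <= 100:
        raise ValueError("filter_boost_percentage must be in [0, 100]")
    return {
        "vector": list(vector),
        "top_k": top_k,
        "ef": ef,
        "prefilter_cardinality_threshold": prefilter_cardinality_threshold,
        "filter_boost_percentage": filter_boost_percentage,
    }
```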

Hybrid Query Parameters

For hybrid indexes, include sparse query components:

| Parameter | Description |
|---|---|
| Sparse Indices | Non-zero term positions in the sparse query vector |
| Sparse Values | Weights corresponding to each sparse index position |

Query Modes

Hybrid indexes support three query modes:

  • Dense only: Provide only the dense query vector for semantic search
  • Sparse only: Provide only sparse indices and values for keyword-based search
  • Hybrid: Provide both dense and sparse components for combined results
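The three modes follow from which components a query carries. A minimal sketch of that decision, using a hypothetical `query_mode` helper and argument names that are not taken from any SDK:

```python
from typing import Optional, Sequence


def query_mode(
    vector: Optional[Sequence[float]] = None,
    sparse_indices: Optional[Sequence[int]] = None,
    sparse_values: Optional[Sequence[float]] = None,
) -> str:
    """Classify a query as 'dense', 'sparse', or 'hybrid' from its components."""
    has_dense = vector is not None and len(vector) > 0
    has_sparse = bool(sparse_indices) and bool(sparse_values)
    if has_dense and has_sparse:
        return "hybrid"
    if has_dense:
        return "dense"
    if has_sparse:
        return "sparse"
    raise ValueError("provide a dense vector, sparse components, or both")
```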

Result Fields

Query results include the following fields:

| Field | Description |
|---|---|
| ID | Vector identifier |
| Similarity | Similarity score (higher is more similar) |
| Distance | Distance score (lower is more similar) |
| Metadata | Metadata dictionary attached to the vector |
| Norm | Vector norm value |
| Vector | Vector data (if include vectors is enabled) |

Vectors

Vectors are the fundamental data units stored in an index.

Dense Vector Fields

| Field | Required | Description |
|---|---|---|
| ID | Yes | Unique identifier for the vector |
| Vector | Yes | Dense embedding array |
| Metadata | No | Arbitrary metadata object for storing additional information |
| Filter | No | Key-value pairs for filtering during queries |

Hybrid Vector Fields

For hybrid indexes, include sparse vector components alongside dense vectors:

| Field | Required | Description |
|---|---|---|
| Sparse Indices | Yes | Non-zero term positions in the sparse vector |
| Sparse Values | Yes | Weights for each sparse index position |

Sparse indices and sparse values must have the same length. Sparse index values must be within the range [0, sparse dimension).
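Both rules are easy to check before upserting. The helper below is a sketch with a hypothetical name and signature; it only encodes the two constraints stated above.

```python
from typing import Sequence


def validate_sparse(
    indices: Sequence[int], values: Sequence[float], sparse_dimension: int
) -> None:
    """Raise ValueError if the sparse components break either documented rule."""
    if len(indices) != len(values):
        raise ValueError("sparse indices and sparse values must have the same length")
    for i in indices:
        # Valid positions form the half-open range [0, sparse_dimension).
        if not 0 <= i < sparse_dimension:
            raise ValueError(f"sparse index {i} is outside [0, {sparse_dimension})")
```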

Maximum batch size is 1000 vectors per upsert call.
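Larger datasets therefore need to be split across several upsert calls. A minimal chunking sketch follows; `batched` is a hypothetical helper, and the actual upsert call is left out because its signature depends on the SDK.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")
MAX_BATCH = 1000  # documented maximum number of vectors per upsert call


def batched(items: Sequence[T], size: int = MAX_BATCH) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each yielded list can then be passed to one upsert call.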

Filtering

Use filters to restrict search results based on conditions. All filters are combined with logical AND.

Operators

| Operator | Description | Example |
|---|---|---|
| Equals | Exact match | Match status equal to “published” |
| In | Match any value in a list | Match tags containing “ai” or “ml” |
| Range | Numeric range (inclusive) | Match score between 70 and 95 |
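The semantics of the three operators and the AND combination can be sketched as a small evaluator. The filter shape used here, a list of `{field: {operator: operand}}` objects with the operator names `$eq`, `$in`, and `$range`, is an assumption made for illustration; consult the SDK documentation for the exact syntax.

```python
from typing import Any, Dict, List


def matches(record_filter: Dict[str, Any], conditions: List[Dict[str, Dict[str, Any]]]) -> bool:
    """Return True only if the record satisfies every condition (logical AND)."""
    for condition in conditions:
        for field, clause in condition.items():
            value = record_filter.get(field)
            for op, operand in clause.items():
                if op == "$eq":       # exact match
                    ok = value == operand
                elif op == "$in":     # value is any member of the list
                    ok = value in operand
                elif op == "$range":  # inclusive numeric range [low, high]
                    low, high = operand
                    ok = value is not None and low <= value <= high
                else:
                    raise ValueError(f"unknown operator: {op}")
                if not ok:
                    return False
    return True
```

A record passes only when all conditions hold, so adding a condition can only narrow the result set.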

Notes:

  • Operators are case-sensitive in SDK implementations
  • The range operator supports values within [0 – 999] — normalize larger values before upserting
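One way to normalize larger values is to rescale them linearly into the supported interval before upserting. This is a sketch under the assumption that the source bounds are known in advance; `normalize_to_range` is a hypothetical helper, and other schemes such as bucketing or logarithmic scaling may suit a given field better.

```python
def normalize_to_range(
    value: float, source_min: float, source_max: float, target_max: int = 999
) -> int:
    """Linearly rescale value from [source_min, source_max] to an int in [0, target_max]."""
    if source_max <= source_min:
        raise ValueError("source_max must be greater than source_min")
    # Clamp first so out-of-range inputs still land inside the target interval.
    clamped = min(max(value, source_min), source_max)
    fraction = (clamped - source_min) / (source_max - source_min)
    return round(fraction * target_max)
```

The mapping is lossy: nearby source values can collapse onto the same integer, and query bounds must be rescaled with the same function before they are used in a range filter.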

Filter Tuning

When using filtered queries, two optional parameters let you tune the trade-off between search speed and recall.

Prefilter Cardinality Threshold

Controls when the search strategy switches from HNSW filtered search to brute-force prefiltering. When very few vectors match your filter, HNSW may struggle to find enough valid candidates through graph traversal — in that case, scanning the filtered subset directly (prefiltering) is faster and more accurate.

  • Default: 10,000
  • Valid range: 1,000–1,000,000
  • Raising the threshold means prefiltering kicks in more often; lowering it favors HNSW graph search.
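The switch described above reduces to a single comparison. The sketch below illustrates the documented rule only; the function name and the strategy labels are hypothetical, and the real decision happens inside the server.

```python
def choose_strategy(matching_count: int, threshold: int = 10_000) -> str:
    """Pick brute-force prefiltering when the filter matches <= threshold vectors."""
    if not 1_000 <= threshold <= 1_000_000:
        raise ValueError("threshold must be in [1000, 1000000]")
    return "prefilter" if matching_count <= threshold else "hnsw_filtered"
```

With the default threshold, a filter matching 500 vectors is scanned directly, while one matching 50,000 vectors goes through HNSW filtered search. Raising the threshold to 100,000 would move the second case to prefiltering as well.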

Filter Boost Percentage

When using HNSW filtered search, some candidates explored during graph traversal are discarded by the filter, which can leave you with fewer results than requested. This parameter expands the internal candidate pool before filtering is applied to compensate.

  • Default: 0 (no boost)
  • Maximum: 100 (doubles the candidate pool)
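The effect of the boost is simple arithmetic on the candidate pool. In the sketch below, `base_candidates` stands for whatever pool size the search would use without a boost; which internal quantity that is, and how fractional results are rounded, are assumptions not specified in the source.

```python
def boosted_pool_size(base_candidates: int, boost_percentage: int = 0) -> int:
    """Expand the candidate pool by boost_percentage percent (0 to 100)."""
    if not 0 <= boost_percentage <= 100:
        raise ValueError("boost_percentage must be in [0, 100]")
    # Integer arithmetic; rounding down is an assumption.
    return base_candidates + (base_candidates * boost_percentage) // 100
```

A boost of 50 turns a pool of 128 candidates into 192, and a boost of 100 doubles it to 256.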

Start with the defaults. If filtered queries return fewer results than expected, increase the boost percentage. If filtered queries are slow on selective filters, lower the cardinality threshold.

Updating filters

You can update the filters of existing vectors by providing their IDs and a new filter object for each. The update replaces the entire filter object; it does not merge with existing filters.

| Parameter | Required | Description |
|---|---|---|
| ID | Yes | Unique identifier of the vector to update |
| Filter | Yes | New filter object that replaces the existing filters |

Filter updates are destructive replacements. Any filter keys not included in the new filter object will be removed from the vector.
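The difference between replacing and merging is easy to see with plain dictionaries. The sketch below models the documented behavior locally; `replace_filter` and `merged_filter` are hypothetical helpers, not SDK calls.

```python
from typing import Any, Dict


def replace_filter(existing: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """Model the documented update: the new object replaces the old one entirely."""
    return dict(new)


def merged_filter(existing: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Build a full replacement client-side that keeps the keys you did not change."""
    return {**existing, **changes}


current = {"status": "draft", "score": 80}
after_update = replace_filter(current, {"status": "published"})  # "score" is gone
safe_payload = merged_filter(current, {"status": "published"})   # "score" is kept
```

To preserve existing keys, read the current filter, merge your changes on the client, and send the merged object as the new filter.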

Authentication

By default, Endee runs without authentication for easy local development. To secure your instance, set the NDD_AUTH_TOKEN environment variable when starting the server.

Enabling Authentication

In your docker-compose.yml:

    environment:
      NDD_AUTH_TOKEN: "your-secret-token"

Using Authentication

When authentication is enabled, you must provide the same token in the following places:

1. Dashboard Access

Enter the token in the dashboard login to access the web interface at http://localhost:8080.

2. SDK Clients

Pass the token when initializing the client. See the respective SDK’s documentation for details.

3. REST API

Include the token in the Authorization header:

    curl -X GET http://localhost:8080/api/v1/index/list \
      -H "Authorization: Bearer your-secret-token"

All API requests will fail if the token doesn’t match the server’s NDD_AUTH_TOKEN. Keep your token secure and never expose it in client-side code.
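The same request can be made from Python with only the standard library. This sketch mirrors the curl command above; `build_list_request` is a hypothetical helper, and the shape of the response is not covered here because the source does not describe it.

```python
import urllib.request


def build_list_request(token: str, base_url: str = "http://localhost:8080") -> urllib.request.Request:
    """Build a GET request for the index list endpoint with a Bearer token."""
    return urllib.request.Request(
        f"{base_url}/api/v1/index/list",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# To send it against a running server:
# with urllib.request.urlopen(build_list_request("your-secret-token")) as response:
#     body = response.read()
```

In practice, load the token from an environment variable or a secret store rather than writing it into source code.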