Vectors

Vectors are the fundamental data units stored in an index. Each vector has a unique ID, an embedding array, and optional metadata and filter fields.

Dense Vector Fields

Field	Required	Description
ID	Yes	Unique string identifier for the vector
Vector	Yes	Dense embedding array (must match the index dimension)
Meta	No	Arbitrary metadata object (returned in query results)
Filter	No	Key-value pairs used to filter results during queries

meta vs filter: Use meta for data you want returned with results (titles, URLs, display text). Use filter for fields you intend to query against. Any field not declared in filter at upsert time cannot be used in a filter expression later.
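As a rough illustration of this split, the sketch below builds an upsert record from a flat document, routing a chosen set of keys into filter and everything else into meta. The helper name build_record and its filter_keys argument are hypothetical and not part of the Endee client; only the id, vector, meta, and filter keys come from the record shape described on this page.

```python
from typing import Any


def build_record(doc_id: str, vector: list[float], fields: dict[str, Any],
                 filter_keys: set[str]) -> dict[str, Any]:
    """Split flat fields into `filter` (queryable) and `meta` (returned only)."""
    record: dict[str, Any] = {"id": doc_id, "vector": vector}
    filter_part = {k: v for k, v in fields.items() if k in filter_keys}
    meta_part = {k: v for k, v in fields.items() if k not in filter_keys}
    # Both fields are optional, so leave them out when they would be empty.
    if meta_part:
        record["meta"] = meta_part
    if filter_part:
        record["filter"] = filter_part
    return record


record = build_record(
    "doc1",
    [0.12, -0.34, 0.89],  # toy 3-dimensional vector, for illustration only
    {"title": "Introduction to ML", "category": "tech", "year": 2024},
    filter_keys={"category", "year"},
)
print(record["meta"])    # {'title': 'Introduction to ML'}
print(record["filter"])  # {'category': 'tech', 'year': 2024}
```

Deciding the filter keys up front matters because a field that was not stored in filter at upsert time cannot be queried against afterwards.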

Python


index = client.get_index(name="my_index")
 
index.upsert([
    {
        "id": "doc1",
        "vector": [0.12, -0.34, 0.89, ...],
        "meta": {"title": "Introduction to ML"},
        "filter": {"category": "tech", "year": 2024}
    },
    {
        "id": "doc2",
        "vector": [0.55, 0.21, -0.10, ...],
        "meta": {"title": "Deep Learning Basics"},
        "filter": {"category": "tech", "year": 2023}
    }
])

TypeScript


const index = await client.getIndex('my_index');
 
await index.upsert([
  {
    id: 'doc1',
    vector: [0.12, -0.34, 0.89, ...],
    meta: { title: 'Introduction to ML' },
    filter: { category: 'tech', year: 2024 },
  },
  {
    id: 'doc2',
    vector: [0.55, 0.21, -0.10, ...],
    meta: { title: 'Deep Learning Basics' },
    filter: { category: 'tech', year: 2023 },
  },
]);

Java


import java.util.List;
import java.util.Map;
import io.endee.client.Index;
import io.endee.client.types.VectorItem;
 
Index index = client.getIndex("my_index");
 
index.upsert(List.of(
    VectorItem.builder("doc1", new double[]{ 0.12, -0.34, 0.89 /* ... */ })
        .meta(Map.of("title", "Introduction to ML"))
        .filter(Map.<String, Object>of("category", "tech", "year", 2024))
        .build(),
    VectorItem.builder("doc2", new double[]{ 0.55, 0.21, -0.10 /* ... */ })
        .meta(Map.of("title", "Deep Learning Basics"))
        .filter(Map.<String, Object>of("category", "tech", "year", 2023))
        .build()
));
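Because each vector must match the index dimension, it can help to check records on the client before calling upsert. The following is a minimal sketch of such a check; check_records is a hypothetical helper, not an Endee API, and the dimension passed in is assumed to be the one used when the index was created.

```python
def check_records(records: list[dict], dimension: int) -> None:
    """Raise ValueError if a record lacks a string ID or has the wrong vector length."""
    for pos, rec in enumerate(records):
        rec_id = rec.get("id")
        if not isinstance(rec_id, str) or not rec_id:
            raise ValueError(f"record {pos}: 'id' must be a non-empty string")
        vector = rec.get("vector")
        if not isinstance(vector, (list, tuple)):
            raise ValueError(f"record {pos} ({rec_id!r}): 'vector' is missing")
        if len(vector) != dimension:
            raise ValueError(
                f"record {pos} ({rec_id!r}): 'vector' has {len(vector)} values, "
                f"expected {dimension}"
            )


records = [
    {"id": "doc1", "vector": [0.12, -0.34, 0.89]},
    {"id": "doc2", "vector": [0.55, 0.21]},  # one value short on purpose
]
try:
    check_records(records, dimension=3)
except ValueError as err:
    print(err)  # record 1 ('doc2'): 'vector' has 2 values, expected 3
```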

BM25 sparse vectors are generated from text using term-frequency weighting. Endee’s endee-model library handles this automatically. See the Sparse Vectors (BM25) guide for setup and usage.

Hybrid Vector Fields

For hybrid indexes, include sparse vector components alongside the dense vector:

Field	Required	Description
Sparse Indices	Yes	Non-zero term positions in the sparse vector
Sparse Values	Yes	BM25 weights corresponding to each index position

Python


index.upsert([
    {
        "id": "doc1",
        "vector": [0.12, -0.34, 0.89, ...],
        "sparse_indices": [4821, 19043, 73201],
        "sparse_values":  [1.42,  0.87,  1.15],
        "meta": {"title": "Introduction to ML"}
    }
])

TypeScript


await index.upsert([
  {
    id: 'doc1',
    vector: [0.12, -0.34, 0.89, ...],
    sparseIndices: [4821, 19043, 73201],
    sparseValues:  [1.42,  0.87,  1.15],
    meta: { title: 'Introduction to ML' },
  },
]);

Java


index.upsert(List.of(
    VectorItem.builder("doc1", new double[]{ 0.12, -0.34, 0.89 /* ... */ })
        .sparseIndices(new int[]{ 4821, 19043, 73201 })
        .sparseValues(new double[]{ 1.42, 0.87, 1.15 })
        .meta(Map.of("title", "Introduction to ML"))
        .build()
));

sparse_indices and sparse_values must have the same length. Each position in sparse_indices maps to the weight at the same position in sparse_values.
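One way to keep the two arrays aligned is to build them together from a single mapping of term position to weight. In this sketch, to_sparse_arrays is a hypothetical helper and the mapping is made-up sample data; sorting by index is a convenience here, not a documented requirement.

```python
def to_sparse_arrays(weights: dict[int, float]) -> tuple[list[int], list[float]]:
    """Turn {term_position: weight} into parallel index and value lists."""
    # Only non-zero positions belong in a sparse vector, so drop zero weights.
    items = sorted((i, w) for i, w in weights.items() if w != 0.0)
    indices = [i for i, _ in items]
    values = [w for _, w in items]
    return indices, values


indices, values = to_sparse_arrays({19043: 0.87, 4821: 1.42, 73201: 1.15, 12: 0.0})
record = {
    "id": "doc1",
    "vector": [0.12, -0.34, 0.89],  # toy dense vector, for illustration only
    "sparse_indices": indices,
    "sparse_values": values,
}
print(indices)  # [4821, 19043, 73201]
print(values)   # [1.42, 0.87, 1.15]
```

Building both lists from the same sorted pairs guarantees they have equal length and that each weight stays at the same position as its index.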

The sparse_model parameter

The sparse_model parameter takes one of two values, depending on which model produced your sparse vectors:

sparse_model="endee_bm25": use this when your sparse vectors come from endee/bm25. Endee stores the IDF weights on its server and applies them automatically, so the client only needs to send the TF weights.
sparse_model="default": use this for SPLADE models or for any BM25 implementation other than endee/bm25. Endee treats the values you send as final scores and performs no further calculation, so with another BM25 model you must compute the complete scores, IDF included, on the client before sending them.
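To make the TF and IDF distinction concrete, the sketch below splits a textbook Okapi BM25 term score into its two factors. This is an assumption-laden illustration: the formula is the widely used variant with k1 = 1.2 and b = 0.75 and a non-negative IDF, and this page does not state that endee/bm25 uses exactly these definitions. The function names and corpus numbers are hypothetical. In practice the endee-model library produces the values for you.

```python
import math


def bm25_tf(tf: int, doc_len: int, avg_doc_len: float,
            k1: float = 1.2, b: float = 0.75) -> float:
    """Saturated, length-normalized term-frequency factor of BM25."""
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + norm)


def bm25_idf(num_docs: int, docs_with_term: int) -> float:
    """Inverse document frequency (non-negative variant)."""
    return math.log(1.0 + (num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))


# Hypothetical corpus statistics for one term in one document.
tf_weight = bm25_tf(tf=3, doc_len=100, avg_doc_len=100.0)
idf = bm25_idf(num_docs=1000, docs_with_term=10)

# With "endee_bm25", only the TF part would be sent; the server applies IDF.
value_for_endee_bm25 = tf_weight
# With "default", the client sends the final score, IDF already included.
value_for_default = tf_weight * idf

print(round(value_for_endee_bm25, 3))
print(round(value_for_default, 3))
```

The practical difference is where corpus statistics live: computing IDF on the client requires knowing document counts for every term, which is the work the "endee_bm25" option moves to the server.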

Maximum batch size is 10,000 vectors per upsert call.
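For larger datasets, records can be split into chunks that respect this limit. The sketch below shows one way to do it; batches and MAX_BATCH are hypothetical names, and the commented-out loop assumes an index handle obtained as in the earlier examples.

```python
from collections.abc import Iterator

MAX_BATCH = 10_000  # per-call limit stated above


def batches(records: list[dict], size: int = MAX_BATCH) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]


records = [{"id": f"doc{i}", "vector": [0.0, 0.0, 0.0]} for i in range(25_000)]
sizes = [len(chunk) for chunk in batches(records)]
print(sizes)  # [10000, 10000, 5000]

# With a real index handle, each chunk would go to its own upsert call:
# for chunk in batches(records):
#     index.upsert(chunk)
```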

Precision

The precision parameter controls how vectors are stored internally. Lower precision reduces memory and speeds up search at the cost of some accuracy.

Precision	Bits	Storage	Speed	Accuracy
BINARY	1-bit	Smallest	Fastest	Lower
INT8	8-bit	Small	Fast	Good
INT16	16-bit integer	Medium	Medium	Higher
FLOAT16	16-bit float	Medium	Medium	High
FLOAT32	32-bit float	Largest	Slower	Highest

Python


from endee import Precision
 
# INT16 — recommended for most use cases
client.create_index(name="my_index", dimension=384,
                    space_type="cosine", precision=Precision.INT16)
 
# FLOAT32 — maximum accuracy
client.create_index(name="precise_index", dimension=384,
                    space_type="cosine", precision=Precision.FLOAT32)
 
# BINARY — minimum memory
client.create_index(name="large_index", dimension=384,
                    space_type="cosine", precision=Precision.BINARY)

TypeScript


import { Precision } from 'endee';
 
// INT16 — recommended for most use cases
await client.createIndex({ name: 'my_index', dimension: 384,
  spaceType: 'cosine', precision: Precision.INT16 });
 
// FLOAT32 — maximum accuracy
await client.createIndex({ name: 'precise_index', dimension: 384,
  spaceType: 'cosine', precision: Precision.FLOAT32 });
 
// BINARY — minimum memory
await client.createIndex({ name: 'large_index', dimension: 384,
  spaceType: 'cosine', precision: Precision.BINARY });

Java


import io.endee.client.types.Precision;
import io.endee.client.types.SpaceType;
 
// INT16 — recommended for most use cases
client.createIndex(CreateIndexOptions.builder("my_index", 384)
    .spaceType(SpaceType.COSINE).precision(Precision.INT16).build());
 
// FLOAT32 — maximum accuracy
client.createIndex(CreateIndexOptions.builder("precise_index", 384)
    .spaceType(SpaceType.COSINE).precision(Precision.FLOAT32).build());
 
// BINARY — minimum memory
client.createIndex(CreateIndexOptions.builder("large_index", 384)
    .spaceType(SpaceType.COSINE).precision(Precision.BINARY).build());

Recommendations:

INT16: best balance of speed, memory, and accuracy for most use cases (recommended)
INT8: faster than INT16 with slightly lower accuracy; good for latency-sensitive workloads (default)
FLOAT32: use when maximum recall accuracy is critical and memory is not a concern
BINARY: use for very large indexes where memory is the primary constraint

Precision is set at index creation time and cannot be changed without recreating the index.
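Since the choice is fixed at creation time, a back-of-the-envelope size estimate can help before committing. The sketch below multiplies the bit widths from the table by the dimension. It covers raw vector values only and assumes tightly packed storage; index structures, IDs, meta, and filter data are not included, and Endee's actual on-disk layout may differ. raw_vector_bytes is a hypothetical helper.

```python
# Bits per stored value, taken from the Precision table above.
BITS_PER_VALUE = {
    "BINARY": 1,
    "INT8": 8,
    "INT16": 16,
    "FLOAT16": 16,
    "FLOAT32": 32,
}


def raw_vector_bytes(dimension: int, precision: str, num_vectors: int = 1) -> int:
    """Bytes for the vector values alone, rounding each vector up to whole bytes."""
    bits = BITS_PER_VALUE[precision] * dimension
    bytes_per_vector = (bits + 7) // 8
    return bytes_per_vector * num_vectors


for name in BITS_PER_VALUE:
    per_vector = raw_vector_bytes(384, name)
    mib = raw_vector_bytes(384, name, num_vectors=1_000_000) / 2**20
    print(f"{name:8s} {per_vector:5d} B/vector  {mib:8.1f} MiB per 1M vectors")
```

For a 384-dimensional index this gives 48 bytes per vector at BINARY, 384 at INT8, 768 at INT16 and FLOAT16, and 1,536 at FLOAT32.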

Get Vector by ID

Retrieve a single vector and its metadata by ID.

Python


vector = index.get_vector("doc1")

TypeScript


const vector = await index.getVector('doc1');

Java


import io.endee.client.types.VectorInfo;
 
VectorInfo vector = index.getVector("doc1");

Delete Vector by ID

Delete a single vector by ID. Deletion is irreversible.

Python


index.delete_vector("doc1")

TypeScript


await index.deleteVector('doc1');

Java


index.deleteVector("doc1");

Delete Vectors by Filter

Delete all vectors matching specific filter conditions.

Python


# Pass a filter expression, e.g.:
index.delete_with_filter([{"tags": {"$eq": "important"}}])

TypeScript


await index.deleteWithFilter([/* filter expression */]);
 
// e.g.:
await index.deleteWithFilter([{ tags: { $eq: 'important' } }]);

Java


index.deleteWithFilter(List.of(/* filter expression */));
 
// e.g:
index.deleteWithFilter(List.of(Map.of("tags", Map.of("$eq", "important"))));

For filter expression syntax, see Filtering: Operators.
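Because deletion is irreversible, it can be useful to preview which records a filter would hit before running it. The sketch below evaluates the $eq example against a local copy of the filter fields. It is not how Endee evaluates filters: only $eq is handled, conditions in the list are assumed to combine with AND, and matches and local_records are hypothetical names.

```python
def matches(filter_fields: dict, expression: list[dict]) -> bool:
    """True if every {field: {"$eq": value}} condition holds (AND is assumed)."""
    for condition in expression:
        for field, ops in condition.items():
            for op, expected in ops.items():
                if op != "$eq":
                    raise NotImplementedError(f"operator {op} is not handled in this sketch")
                if field not in filter_fields or filter_fields[field] != expected:
                    return False
    return True


# Local copy of the filter fields that were sent at upsert time (sample data).
local_records = [
    {"id": "doc1", "filter": {"tags": "important", "year": 2024}},
    {"id": "doc2", "filter": {"tags": "draft", "year": 2023}},
]
expression = [{"tags": {"$eq": "important"}}]
to_delete = [r["id"] for r in local_records if matches(r["filter"], expression)]
print(to_delete)  # ['doc1']
```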