Endee + LlamaIndex
Use Endee as a vector store in LlamaIndex for semantic search, metadata filtering, hybrid retrieval, and RAG pipelines.
LlamaIndex is a data framework for building LLM applications over external data. Endee integrates with LlamaIndex through the llama-index-vector-stores-endee package, supporting document indexing, similarity search, metadata filtering, and hybrid search.
Installation
pip install -q llama-index-vector-stores-endee llama-index-embeddings-huggingface
pip install numpy==2.0.0
Authentication
Local server
API_TOKEN = "ndd-auth-token" # same token set in NDD_AUTH_TOKEN on your server
BASE_URL = "http://127.0.0.1:8080/api/v1"
Endee Cloud
- Go to https://app.endee.io
- Create a token
- Paste it below:
API_TOKEN = "your-serverless-token"
BASE_URL = "" # leave empty - Endee figures out the cloud URL
Creating a vector store
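Before creating the store, the token and base URL from the Authentication section could be read from environment variables instead of being hard-coded. The following is a minimal sketch, not part of the Endee package: the variable names ENDEE_API_TOKEN and ENDEE_BASE_URL and the helper load_endee_settings are hypothetical and chosen only for illustration.

```python
import os


def load_endee_settings(env=None):
    """Return (api_token, base_url) from an environment-style mapping.

    ENDEE_API_TOKEN and ENDEE_BASE_URL are hypothetical names used only
    for this sketch; they are not defined by Endee.
    """
    env = os.environ if env is None else env
    api_token = env.get("ENDEE_API_TOKEN", "").strip()
    if not api_token:
        raise ValueError("ENDEE_API_TOKEN is not set")
    # An empty base URL is kept as-is; the guide says to leave it empty for Endee Cloud.
    base_url = env.get("ENDEE_BASE_URL", "").strip()
    return api_token, base_url


# Example with an explicit mapping standing in for the real environment.
API_TOKEN, BASE_URL = load_endee_settings(
    {
        "ENDEE_API_TOKEN": "ndd-auth-token",
        "ENDEE_BASE_URL": "http://127.0.0.1:8080/api/v1",
    }
)
```

Calling load_endee_settings() with no argument reads from the real process environment, which keeps tokens out of notebooks and version control.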
Create a vector store, set the embedding model, and you’re ready to add documents:
from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
from llama_index.core.vector_stores.types import MetadataFilters, MetadataFilter, FilterOperator
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index_endee import EndeeVectorStore
INDEX_NAME = "rag_demo_index"
DIMENSION = 384 # embedding vector length (must match the model below)
vector_store = EndeeVectorStore.from_params(
    api_token=API_TOKEN,
    base_url=BASE_URL,
    index_name=INDEX_NAME,
    dimension=DIMENSION,
    sparse_model="endee_bm25", # also enables keyword matching alongside meaning search
)
embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
Settings.embed_model = embed_model # set globally so all operations use it automatically
Adding documents
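Before adding documents, it can help to confirm that the embedding model's output length matches the dimension the index was created with. The helper check_dimension below is a hypothetical sketch, not an Endee or LlamaIndex function, and fake_embed is a stand-in so the snippet runs without downloading a model.

```python
def check_dimension(embed_text, expected_dim, probe="dimension probe"):
    """Embed a short probe string and compare its length to expected_dim."""
    actual = len(embed_text(probe))
    if actual != expected_dim:
        raise ValueError(
            f"embedding has {actual} dimensions, index expects {expected_dim}"
        )
    return actual


# Stand-in embedder so the sketch runs offline.
# With the real model, pass embed_model.get_text_embedding instead.
def fake_embed(text):
    return [0.0] * 384


print(check_dimension(fake_embed, 384))
```

Running a check like this once at startup surfaces a mismatch as a clear error rather than as a failure during indexing.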
Documents have two parts: the text you want to search, and metadata (tags) you can filter by later.
documents = [
    Document(
        text="Python is a high-level programming language known for readability.",
        metadata={"category": "programming", "language": "python", "level": "beginner"},
    ),
    # Add as many documents as you need...
]
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
Similarity search
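Similarity search compares the query's embedding with each stored embedding and returns the closest ones. This guide does not state which distance metric the index uses, so the sketch below assumes cosine similarity purely for illustration; the three-dimensional vectors and document IDs are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """Return the k (doc_id, score) pairs with the highest similarity."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in stored.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors standing in for real 384-dimensional embeddings.
stored = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.7, 0.7, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
for doc_id, score in top_k([1.0, 0.1, 0.0], stored):
    print(f"{doc_id}: {score:.3f}")
```

A real index avoids this exhaustive comparison by using an approximate search structure, which is what makes the tuning parameters later in this guide relevant.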
Find documents by meaning - you don’t need exact words to match:
retriever = index.as_retriever(similarity_top_k=2) # return the 2 best matches
query = "Tell me about vector databases"
results = retriever.retrieve(query)
print(f"Query: '{query}'\n")
for i, node in enumerate(results, 1):
    print(f"{i}. score={node.get_score():.3f}") # score closer to 1.0 = more relevant
    print(f" {node.text}")
    print(f" {node.metadata}\n")
Search with metadata filters
Narrow results to specific subsets using metadata filters:
# EQ - only return documents tagged as "ai"
eq_filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ)]
)
results = index.as_retriever(similarity_top_k=2, filters=eq_filters).retrieve(
    "How do systems learn from data?"
)
print("EQ filter (category == ai):")
for node in results:
    print(f" [{node.metadata.get('category')}] {node.text[:70]}\n")
# IN - return documents tagged as either "ai" or "database"
in_filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value=["ai", "database"], operator=FilterOperator.IN)]
)
results = index.as_retriever(similarity_top_k=3, filters=in_filters).retrieve(
    "vector search and machine learning"
)
print("IN filter (category in [ai, database]):")
for i, node in enumerate(results, 1):
    print(f" {i}. {node.text[:70]}")
Hybrid search
Hybrid search matches both meaning and exact keywords. Useful when users search for specific names, product codes, or technical terms.
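One common way to blend a meaning-based ranking with a keyword ranking is reciprocal rank fusion (RRF). The parameter name dense_rrf_weight hints at something similar, but Endee's actual fusion formula is not documented in this guide. The sketch below therefore assumes a weighted RRF with the conventional constant k = 60; the function weighted_rrf and both rankings are invented for illustration.

```python
def weighted_rrf(dense_ranking, sparse_ranking, dense_weight=0.5, k=60):
    """Fuse two ranked lists of IDs into one ranking.

    Each list contributes weight / (k + rank) per document, with rank
    starting at 1. A list whose weight is 0.0 is skipped entirely.
    """
    if not 0.0 <= dense_weight <= 1.0:
        raise ValueError("dense_weight must be between 0.0 and 1.0")
    scores = {}
    weighted_lists = (
        (dense_weight, dense_ranking),
        (1.0 - dense_weight, sparse_ranking),
    )
    for weight, ranking in weighted_lists:
        if weight == 0.0:
            continue
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense = ["doc_a", "doc_b", "doc_c"]   # ranked by embedding similarity
sparse = ["doc_c", "doc_a", "doc_d"]  # ranked by keyword match
for w in (1.0, 0.5, 0.0):
    print(w, weighted_rrf(dense, sparse, dense_weight=w))
```

Under these assumptions, a document that ranks well in both lists (doc_a here) rises to the top at the balanced setting, while the extreme weights reproduce one list or the other.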
query = "privacy vector search"
print(f"Query: '{query}'\n")
# Try different blends of meaning vs. keyword matching
for weight, label in [(1.0, "meaning only"), (0.5, "balanced"), (0.0, "keywords only")]:
    results = index.as_retriever(
        similarity_top_k=3,
        vector_store_kwargs={"dense_rrf_weight": weight},
    ).retrieve(query)
    print(f"dense_rrf_weight={weight} ({label}):")
    for i, node in enumerate(results, 1):
        print(f" {i}. score={node.get_score():.4f} {node.text[:75]}\n")
Tuning search parameters
Adjust ef to balance speed vs. accuracy:
# ef controls how thoroughly Endee searches - higher = more accurate but slower
query_text = "Endee features"
print(f"Query: '{query_text}'\n")
for ef_val in [64, 128, 256]:
    results = index.as_retriever(
        similarity_top_k=2,
        vector_store_kwargs={"ef": ef_val},
    ).retrieve(query_text)
    if not results: # nothing matched, so there is no top hit to print
        print(f" ef={ef_val:4d} no results\n")
        continue
    top = results[0]
    print(f" ef={ef_val:4d} score={top.get_score():.4f} {top.text[:70]}\n")
# Combine ef tuning with a metadata filter
db_filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value="database", operator=FilterOperator.EQ)]
)
results = index.as_retriever(
    similarity_top_k=2,
    filters=db_filters,
    vector_store_kwargs={
        "ef": 200,
        "prefilter_cardinality_threshold": 1000, # how many candidates to pre-filter before ranking
        "filter_boost_percentage": 20, # how aggressively to boost filtered results
    },
).retrieve("vector database search")
print("Fine-tuned search (category == database):")
for i, node in enumerate(results, 1):
    print(f" {i}. {node.text[:80]}")
Managing documents
Inspect, update tags, or delete individual documents:
# Fetch a stored vector by ID
sample_nodes = index.as_retriever(similarity_top_k=1).retrieve("vector database")
sample_id = sample_nodes[0].node.id_
fetched = vector_store.fetch([sample_id])
vec = fetched[0]
print(f"Fetched: {sample_id}")
print(f" embedding dimensions: {len(vec.get('vector', []))}")
print(f" metadata tags: {vec.get('filter', {})}")
print(f" has keyword data: {'sparse_indices' in vec}\n")
# Update tags without re-uploading the document (fast!)
print(f"Before: {vec.get('filter', {})}")
vector_store.update_filters([
    {"id": sample_id, "filter": {"category": "database", "status": "reviewed", "priority": "high"}}
])
updated = vector_store.fetch([sample_id])
print(f"After : {updated[0].get('filter', {})}\n")
# Delete a single document
vector_store.delete_vector(sample_id)
print(f"Deleted: {sample_id}")
info = vector_store.describe()
print(f"Remaining vectors: {info.get('count')}")
Cleanup
Remove the index when you no longer need it:
vector_store.clear()
print(f"Deleted: {INDEX_NAME}")
API reference
| Method | Description |
|---|---|
| VectorStoreIndex.from_documents() | Embed and store documents |
| as_retriever() | Create a retriever for queries |
| retrieve(query) | Find the most relevant documents |
| fetch(ids) | Get full stored data for specific vectors |
| update_filters() | Update tags without re-uploading the document |
| delete_vector(id) | Remove a single document |
| describe() | View index stats (document count, etc.) |
Configuration
| Parameter | Description |
|---|---|
| similarity_top_k | Number of results to return. 2-3 is usually enough for RAG. |
| dense_rrf_weight | Balance between meaning search (1.0) and keyword search (0.0). Default: 0.5. |
| ef | How many candidates Endee checks before returning results. Higher = more accurate, slower. Default: 128. |
| prefilter_cardinality_threshold | How many candidates to pre-filter before ranking. |
| filter_boost_percentage | How aggressively to boost filtered results. |
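The query-time settings in the table above can be collected and sanity-checked in one place before they are handed to a retriever. The helper build_query_kwargs is a hypothetical sketch: its defaults mirror the table, but the range checks are assumptions made here, not rules documented by Endee, and passing both keys in the same call is likewise assumed to be allowed.

```python
def build_query_kwargs(dense_rrf_weight=0.5, ef=128):
    """Validate two query-time settings and return them as a dict.

    Defaults mirror the configuration table. The range checks are
    assumptions made for this sketch, not rules enforced by Endee.
    """
    if not 0.0 <= dense_rrf_weight <= 1.0:
        raise ValueError("dense_rrf_weight should be between 0.0 and 1.0")
    if not isinstance(ef, int) or ef < 1:
        raise ValueError("ef should be a positive integer")
    return {"dense_rrf_weight": dense_rrf_weight, "ef": ef}


kwargs = build_query_kwargs(dense_rrf_weight=0.7, ef=256)
print(kwargs)
# The dict could then be passed as vector_store_kwargs=kwargs to
# index.as_retriever(...), as in the examples earlier in this guide.
```

Centralizing the values this way makes it easier to compare settings across experiments without editing every retriever call.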
Supported operators
| Operator | Description |
|---|---|
| EQ | Exact match (one value) |
| IN | Match any value in a list |
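The two operators can be modeled in plain Python to make their behavior concrete. This is only an illustration of the semantics described in the table; the function matches is hypothetical, the real filtering is done by Endee, and treating a missing key as a non-match is an assumption of this sketch.

```python
def matches(metadata, key, operator, value):
    """Return True if a metadata dict passes a single EQ or IN filter."""
    if key not in metadata:
        return False  # assumption: documents without the key never match
    if operator == "EQ":
        return metadata[key] == value
    if operator == "IN":
        return metadata[key] in value
    raise ValueError(f"unsupported operator: {operator}")


docs = [
    {"category": "ai"},
    {"category": "database"},
    {"category": "programming"},
]
eq_hits = [d["category"] for d in docs if matches(d, "category", "EQ", "ai")]
in_hits = [
    d["category"]
    for d in docs
    if matches(d, "category", "IN", ["ai", "database"])
]
print(eq_hits)
print(in_hits)
```

EQ takes a single value while IN takes a list, which matches how the value argument is written in the MetadataFilter examples earlier in this guide.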