
Endee + LlamaIndex

Use Endee as a vector store in LlamaIndex for semantic search, metadata filtering, hybrid retrieval, and RAG pipelines.

LlamaIndex is a data framework for building LLM applications over external data. Endee integrates with LlamaIndex through the llama-index-vector-stores-endee package, supporting document indexing, similarity search, metadata filtering, and hybrid search.


Installation

pip install -q llama-index-vector-stores-endee llama-index-embeddings-huggingface
pip install numpy==2.0.0

Authentication

Local server

API_TOKEN = "ndd-auth-token"  # same token set in NDD_AUTH_TOKEN on your server
BASE_URL = "http://127.0.0.1:8080/api/v1"

Endee Cloud

  1. Go to https://app.endee.io 
  2. Create a token
  3. Paste it below:

API_TOKEN = "your-serverless-token"
BASE_URL = ""  # leave empty - Endee figures out the cloud URL
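
Tokens pasted into a notebook are easy to leak when the notebook is shared. A minimal sketch of reading them from the environment instead; the variable names ENDEE_API_TOKEN and ENDEE_BASE_URL are hypothetical choices for this example, not names Endee defines:

```python
import os

def load_endee_credentials(env=os.environ):
    """Read the API token and base URL from environment variables.

    ENDEE_API_TOKEN and ENDEE_BASE_URL are hypothetical names chosen for this
    sketch; substitute whatever names your deployment already uses.
    """
    token = env.get("ENDEE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set ENDEE_API_TOKEN before creating the vector store")
    # An empty base URL is kept as-is, matching the Endee Cloud setup above.
    base_url = env.get("ENDEE_BASE_URL", "").strip()
    return token, base_url
```

With the variables exported in your shell, `API_TOKEN, BASE_URL = load_endee_credentials()` would replace the hard-coded assignments.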

Creating a vector store

Create a vector store, set the embedding model, and you’re ready to add documents:

from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
from llama_index.core.vector_stores.types import MetadataFilters, MetadataFilter, FilterOperator
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index_endee import EndeeVectorStore

INDEX_NAME = "rag_demo_index"
DIMENSION = 384  # size of the meaning fingerprint (must match the model below)

vector_store = EndeeVectorStore.from_params(
    api_token=API_TOKEN,
    base_url=BASE_URL,
    index_name=INDEX_NAME,
    dimension=DIMENSION,
    sparse_model="endee_bm25",  # also enables keyword matching alongside meaning search
)

embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
Settings.embed_model = embed_model  # set globally so all operations use it automatically
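
Because the index dimension must match the embedding model, it can help to verify the match before indexing anything. The helper below is a hypothetical sketch, not part of the Endee package; it uses a stand-in vector so it runs without downloading a model:

```python
def check_dimension(embedding, expected_dimension):
    """Raise ValueError if an embedding's length differs from the index dimension."""
    actual = len(embedding)
    if actual != expected_dimension:
        raise ValueError(
            f"Embedding has {actual} dimensions but the index expects {expected_dimension}"
        )
    return actual

# Stand-in vector so the sketch runs on its own. With the setup above you
# would pass embed_model.get_text_embedding("probe") instead.
probe = [0.0] * 384
print(check_dimension(probe, 384))
```

Running the check once at startup turns a dimension mismatch into an immediate, readable error instead of a failure during indexing.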

Adding documents

Documents have two parts: the text you want to search, and metadata (tags) you can filter by later.

documents = [
    Document(
        text="Python is a high-level programming language known for readability.",
        metadata={"category": "programming", "language": "python", "level": "beginner"},
    ),
    # Add as many documents as you need...
]

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

Find documents by meaning; the query doesn’t need to share exact words with the text:

retriever = index.as_retriever(similarity_top_k=2)  # return the 2 best matches

query = "Tell me about vector databases"
results = retriever.retrieve(query)

print(f"Query: '{query}'\n")
for i, node in enumerate(results, 1):
    print(f"{i}. score={node.get_score():.3f}")  # score closer to 1.0 = more relevant
    print(f"   {node.text}")
    print(f"   {node.metadata}\n")
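
To build intuition for the scores, here is a toy top-k search in plain Python. It assumes cosine similarity as the metric, which this page does not confirm for Endee, and uses hand-made 3-dimensional vectors in place of real embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs with the highest similarity."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for embeddings (illustrative values only)
toy_docs = {
    "vector-db": [0.9, 0.1, 0.0],
    "python": [0.1, 0.9, 0.0],
    "cooking": [0.0, 0.1, 0.9],
}
for doc_id, score in top_k([1.0, 0.2, 0.0], toy_docs, k=2):
    print(f"{doc_id}: {score:.3f}")
```

A real vector store avoids scoring every document by using an approximate index, but the ranking idea is the same.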

Search with metadata filters

Narrow results to specific subsets using metadata filters:

# EQ - only return documents tagged as "ai"
eq_filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ)]
)
results = index.as_retriever(similarity_top_k=2, filters=eq_filters).retrieve(
    "How do systems learn from data?"
)
print("EQ filter (category == ai):")
for node in results:
    # "category" is the metadata key set when the documents were added
    print(f"  [{node.metadata.get('category')}] {node.text[:70]}\n")

# IN - return documents tagged as either "ai" or "database"
in_filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value=["ai", "database"], operator=FilterOperator.IN)]
)
results = index.as_retriever(similarity_top_k=3, filters=in_filters).retrieve(
    "vector search and machine learning"
)
print("IN filter (category in [ai, database]):")
for i, node in enumerate(results, 1):
    print(f"  {i}. {node.text[:70]}")
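
The semantics of the two operators can be sketched in plain Python. This is an illustration of what EQ and IN mean, not how Endee evaluates filters internally; the `matches` and `apply_filters` helpers are hypothetical, and combining several filters with AND is an assumption here:

```python
def matches(metadata, key, operator, value):
    """Return True if one document's metadata satisfies a single filter."""
    if key not in metadata:
        return False
    if operator == "EQ":
        return metadata[key] == value  # exact match on one value
    if operator == "IN":
        return metadata[key] in value  # match any value in the list
    raise ValueError(f"Unsupported operator: {operator}")

def apply_filters(docs, filters):
    """Keep documents that satisfy every filter (AND semantics, assumed)."""
    return [d for d in docs if all(matches(d["metadata"], *f) for f in filters)]

# Toy documents (illustrative data only)
docs = [
    {"id": 1, "metadata": {"category": "ai"}},
    {"id": 2, "metadata": {"category": "database"}},
    {"id": 3, "metadata": {"category": "programming"}},
]

print([d["id"] for d in apply_filters(docs, [("category", "EQ", "ai")])])
print([d["id"] for d in apply_filters(docs, [("category", "IN", ["ai", "database"])])])
```

In the retriever, the filter restricts which documents are eligible, and similarity ranking then applies to whatever remains.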

Hybrid search matches both meaning and exact keywords. Useful when users search for specific names, product codes, or technical terms.

query = "privacy vector search"
print(f"Query: '{query}'\n")

# Try different blends of meaning vs. keyword matching
for weight, label in [(1.0, "meaning only"), (0.5, "balanced"), (0.0, "keywords only")]:
    results = index.as_retriever(
        similarity_top_k=3,
        vector_store_kwargs={"dense_rrf_weight": weight},
    ).retrieve(query)
    print(f"dense_rrf_weight={weight} ({label}):")
    for i, node in enumerate(results, 1):
        print(f"  {i}. score={node.get_score():.4f}  {node.text[:75]}\n")
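
One common way to blend a dense ranking with a keyword ranking is weighted reciprocal rank fusion (RRF). The sketch below is an assumption about what a parameter named dense_rrf_weight might control, not Endee's actual implementation; the constant k=60 and the rule "sparse weight = 1 - dense weight" are illustrative choices:

```python
def weighted_rrf(dense_ranking, sparse_ranking, dense_weight=0.5, k=60):
    """Fuse two ranked lists of document IDs with weighted reciprocal rank fusion.

    Each list contributes weight / (k + rank) per document, with rank starting at 1.
    """
    sparse_weight = 1.0 - dense_weight
    scores = {}
    for rank, doc_id in enumerate(dense_ranking, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + dense_weight / (k + rank)
    for rank, doc_id in enumerate(sparse_ranking, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + sparse_weight / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy rankings (illustrative): best match first in each list
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_c", "doc_a", "doc_d"]

for weight in (1.0, 0.5, 0.0):
    fused = weighted_rrf(dense, sparse, dense_weight=weight)
    print(weight, [doc_id for doc_id, _ in fused])
```

Because RRF works on ranks rather than raw scores, it can combine dense and keyword results even though their scores are on different scales.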

Tuning search parameters

Adjust ef to balance speed vs. accuracy:

# ef controls how thoroughly Endee searches - higher = more accurate but slower
query_text = "Endee features"
print(f"Query: '{query_text}'\n")

for ef_val in [64, 128, 256]:
    results = index.as_retriever(
        similarity_top_k=2,
        vector_store_kwargs={"ef": ef_val},
    ).retrieve(query_text)
    if not results:  # guard: an empty result list has no top hit to print
        print(f"  ef={ef_val:4d}  no results\n")
        continue
    top = results[0]
    print(f"  ef={ef_val:4d}  score={top.get_score():.4f}  {top.text[:70]}\n")

# Combine ef tuning with a metadata filter
db_filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value="database", operator=FilterOperator.EQ)]
)
results = index.as_retriever(
    similarity_top_k=2,
    filters=db_filters,
    vector_store_kwargs={
        "ef": 200,
        "prefilter_cardinality_threshold": 1000,  # how many candidates to pre-filter before ranking
        "filter_boost_percentage": 20,  # how aggressively to boost filtered results
    },
).retrieve("vector database search")

print("Fine-tuned search (category == database):")
for i, node in enumerate(results, 1):
    print(f"  {i}. {node.text[:80]}")

Managing documents

Inspect, update tags, or delete individual documents:

# Fetch a stored vector by ID
sample_nodes = index.as_retriever(similarity_top_k=1).retrieve("vector database")
sample_id = sample_nodes[0].node.id_

fetched = vector_store.fetch([sample_id])
vec = fetched[0]
print(f"Fetched: {sample_id}")
print(f"  embedding dimensions: {len(vec.get('vector', []))}")
print(f"  metadata tags: {vec.get('filter', {})}")
print(f"  has keyword data: {'sparse_indices' in vec}\n")

# Update tags without re-uploading the document (fast!)
print(f"Before: {vec.get('filter', {})}")
vector_store.update_filters([
    {"id": sample_id, "filter": {"category": "database", "status": "reviewed", "priority": "high"}}
])
updated = vector_store.fetch([sample_id])
print(f"After : {updated[0].get('filter', {})}\n")

# Delete a single document
vector_store.delete_vector(sample_id)
print(f"Deleted: {sample_id}")

info = vector_store.describe()
print(f"Remaining vectors: {info.get('count')}")

Cleanup

Remove the index when you no longer need it:

vector_store.clear()
print(f"Deleted: {INDEX_NAME}")

API reference

Method                               Description
VectorStoreIndex.from_documents()    Embed and store documents
as_retriever()                       Create a retriever for queries
retrieve(query)                      Find the most relevant documents
fetch(ids)                           Get full stored data for specific vectors
update_filters()                     Update tags without re-uploading the document
delete_vector(id)                    Remove a single document
describe()                           View index stats (document count, etc.)

Configuration

Parameter                          Description
similarity_top_k                   Number of results to return. 2-3 is usually enough for RAG.
dense_rrf_weight                   Balance between meaning search (1.0) and keyword search (0.0). Default: 0.5.
ef                                 How many candidates Endee checks before returning results. Higher = more accurate, slower. Default: 128.
prefilter_cardinality_threshold    How many candidates to pre-filter before ranking.
filter_boost_percentage            How aggressively to boost filtered results.

Supported operators

Operator    Description
EQ          Exact match (one value)
IN          Match any value in a list