
Endee + LlamaIndex: Quick Start

Use Endee as a LlamaIndex vector store for semantic search and RAG pipelines.


Install & Setup

Install the required libraries for LlamaIndex and Endee integration.

    pip install -q llama-index-vector-stores-endee llama-index-embeddings-huggingface
    pip install numpy==2.0.0

Authentication

Choose your connection method: local server or serverless cloud.

Local Server: If your server has NDD_AUTH_TOKEN set, pass the same token when initializing:

    API_TOKEN = "ndd-auth-token"
    BASE_URL = "http://127.0.0.1:8080/api/v1"

Endee Serverless: Go to https://app.endee.io, create a token, then pass it here:

    API_TOKEN = "your-serverless-token"
    BASE_URL = ""  # leave empty for serverless
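To keep tokens out of notebooks and source control, the two values can be read from the environment instead of being hardcoded. The sketch below is one possible approach; the variable names ENDEE_API_TOKEN and ENDEE_BASE_URL are hypothetical and not defined by Endee.

```python
import os

# Hypothetical variable names (an assumption, not part of Endee);
# pick whatever fits your deployment.
TOKEN_ENV_VAR = "ENDEE_API_TOKEN"
URL_ENV_VAR = "ENDEE_BASE_URL"


def load_endee_config(env=None):
    """Return (api_token, base_url) read from an environment-style mapping."""
    env = os.environ if env is None else env
    api_token = env.get(TOKEN_ENV_VAR, "")
    # An empty base URL selects serverless, matching the example above.
    base_url = env.get(URL_ENV_VAR, "")
    return api_token, base_url


API_TOKEN, BASE_URL = load_endee_config()
```

Passing a plain dict as env makes the helper easy to exercise in tests without touching the real environment.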

Import & Create Vector Store

Import libraries and create a vector store connected to Endee.

    from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
    from llama_index.core.vector_stores.types import MetadataFilters, MetadataFilter, FilterOperator
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index_endee import EndeeVectorStore

    INDEX_NAME = "rag_demo_index"
    DIMENSION = 384

    # Create vector store (use API_TOKEN and BASE_URL from above)
    vector_store = EndeeVectorStore.from_params(
        api_token=API_TOKEN,
        base_url=BASE_URL,
        index_name=INDEX_NAME,
        dimension=DIMENSION,
        sparse_model="endee_bm25",  # enables hybrid search automatically
    )

    # Load embedding model globally
    embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
    Settings.embed_model = embed_model

Add Documents

Create documents with text and metadata, then index them all at once.

    documents = [
        Document(
            text="Python is a high-level programming language known for readability.",
            metadata={"category": "programming", "language": "python", "level": "beginner"},
        ),
        # ... more documents
    ]

    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

Reconnect to an Existing Index

If the index already exists, from_params reconnects to it without losing data. You don't need to pass dimension:

    vector_store = EndeeVectorStore.from_params(
        api_token=API_TOKEN,
        base_url=BASE_URL,
        index_name=INDEX_NAME,
    )

    # Loads the existing index for querying; no re-indexing needed
    index = VectorStoreIndex.from_vector_store(vector_store)

Semantic Search

Find similar documents by semantic meaning using query embeddings.

    retriever = index.as_retriever(similarity_top_k=2)

    query = "Tell me about vector databases"
    results = retriever.retrieve(query)

    print(f"Query: '{query}'\n")
    for i, node in enumerate(results, 1):
        print(f"{i}. score={node.get_score():.3f}")
        print(f"   {node.text}")
        print(f"   {node.metadata}\n")
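To make "similar by semantic meaning" concrete, the sketch below ranks toy 3-dimensional vectors by cosine similarity to a query vector and keeps the top k. It is a brute-force illustration only: the vectors and document IDs are made up, and the use of cosine similarity is an assumption about the index's metric, not a statement of how Endee computes scores.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, vectors, k=2):
    """Return the k (doc_id, score) pairs with the highest similarity."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings standing in for real 384-dimensional ones.
toy_vectors = {
    "vector-db": [0.9, 0.1, 0.0],
    "python": [0.0, 1.0, 0.0],
    "ml": [0.6, 0.0, 0.8],
}
toy_query = [1.0, 0.0, 0.0]

for doc_id, score in top_k(toy_query, toy_vectors, k=2):
    print(f"{doc_id}: {score:.3f}")
```

Here k plays the same role as similarity_top_k in the retriever.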

Filter Results

Restrict search to specific documents using metadata filters.

    # EQ: only AI documents
    eq_filters = MetadataFilters(
        filters=[MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ)]
    )
    results = index.as_retriever(similarity_top_k=2, filters=eq_filters).retrieve(
        "How do systems learn from data?"
    )
    print("EQ filter (category == ai):")
    for node in results:
        print(f"  [{node.metadata.get('field')}] {node.text[:70]}\n")

    # IN: multiple categories
    in_filters = MetadataFilters(
        filters=[MetadataFilter(key="category", value=["ai", "database"], operator=FilterOperator.IN)]
    )
    results = index.as_retriever(similarity_top_k=3, filters=in_filters).retrieve(
        "vector search and machine learning"
    )
    print("IN filter (category in [ai, database]):")
    for i, node in enumerate(results, 1):
        print(f"  {i}. {node.text[:70]}")
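The semantics of the two operators can be sketched in plain Python over metadata dictionaries. This is an illustration of what EQ and IN select, not Endee's filtering code; the helper names and sample documents are hypothetical. Multiple filters are combined with AND here, which mirrors the default condition of LlamaIndex's MetadataFilters.

```python
def matches(metadata, key, operator, value):
    """Evaluate one filter against a metadata dict."""
    if key not in metadata:
        return False
    if operator == "EQ":
        return metadata[key] == value
    if operator == "IN":
        return metadata[key] in value
    raise ValueError(f"unsupported operator: {operator}")


def apply_filters(docs, filters):
    """Keep documents whose metadata satisfies every filter (AND)."""
    return [d for d in docs if all(matches(d["metadata"], *f) for f in filters)]


# Hypothetical sample documents for illustration.
sample_docs = [
    {"id": "d1", "metadata": {"category": "ai", "level": "beginner"}},
    {"id": "d2", "metadata": {"category": "database", "level": "advanced"}},
    {"id": "d3", "metadata": {"category": "programming", "level": "beginner"}},
]

eq_hits = apply_filters(sample_docs, [("category", "EQ", "ai")])
in_hits = apply_filters(sample_docs, [("category", "IN", ["ai", "database"])])
print([d["id"] for d in eq_hits])
print([d["id"] for d in in_hits])
```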

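Hybrid Search

Hybrid search combines dense vectors with BM25 keyword matching. It is enabled automatically when the vector store is created with sparse_model="endee_bm25"; dense_rrf_weight sets the balance between the two.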

    query = "privacy vector search"
    print(f"Query: '{query}'\n")

    # Compare different dense/sparse weights
    for weight, label in [(1.0, "dense only"), (0.5, "balanced"), (0.0, "sparse only")]:
        results = index.as_retriever(
            similarity_top_k=3,
            vector_store_kwargs={"dense_rrf_weight": weight},
        ).retrieve(query)
        print(f"dense_rrf_weight={weight} ({label}):")
        for i, node in enumerate(results, 1):
            print(f"  {i}. score={node.get_score():.4f}  {node.text[:75]}\n")
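The parameter name dense_rrf_weight suggests a weighted form of reciprocal rank fusion (RRF). The sketch below shows one common way to write it: each ranking contributes weight / (k + rank) per document. Whether Endee uses this exact formula, or the conventional constant k=60, is an assumption; the function and document IDs are illustrative only.

```python
def weighted_rrf(dense_ranking, sparse_ranking, dense_weight=0.5, k=60):
    """Fuse two ranked lists of doc IDs into (doc_id, score) pairs, best first.

    dense_weight=1.0 uses only the dense ranking, 0.0 only the sparse one.
    """
    scores = {}
    weighted_lists = (
        (dense_weight, dense_ranking),
        (1.0 - dense_weight, sparse_ranking),
    )
    for weight, ranking in weighted_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Made-up rankings: "a" leads the dense list, "c" leads the sparse list.
dense = ["a", "b", "c"]
sparse = ["c", "a", "d"]

for w in (1.0, 0.5, 0.0):
    order = [doc_id for doc_id, _ in weighted_rrf(dense, sparse, dense_weight=w)]
    print(f"dense_weight={w}: {order}")
```

With a balanced weight, documents that rank well in both lists rise above documents that appear in only one.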

Query Tuning

Adjust search parameters for different performance and recall tradeoffs.

    # Tune ef (higher = better recall, slower)
    query_text = "Endee features"
    print(f"Query: '{query_text}'\n")

    for ef_val in [64, 128, 256]:
        results = index.as_retriever(
            similarity_top_k=2,
            vector_store_kwargs={"ef": ef_val},
        ).retrieve(query_text)
        if not results:
            # Guard against an empty result set before reading the top hit
            print(f"  ef={ef_val:4d}  no results\n")
            continue
        top = results[0]
        print(f"  ef={ef_val:4d}  score={top.get_score():.4f}  {top.text[:70]}\n")

    # Tune prefilter threshold with metadata filter
    db_filters = MetadataFilters(
        filters=[MetadataFilter(key="category", value="database", operator=FilterOperator.EQ)]
    )
    results = index.as_retriever(
        similarity_top_k=2,
        filters=db_filters,
        vector_store_kwargs={
            "ef": 200,
            "prefilter_cardinality_threshold": 1000,
            "filter_boost_percentage": 20,
        },
    ).retrieve("vector database search")

    print("Prefilter tuning (category == database):")
    for i, node in enumerate(results, 1):
        print(f"  {i}. {node.text[:80]}")
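One way to picture prefilter_cardinality_threshold is as a switch on how many vectors pass the metadata filter. The sketch below assumes that a small filtered set is scanned exactly while a large one goes through the HNSW index; the direction of the comparison and the inclusive boundary are assumptions, and choose_strategy is a hypothetical helper, not an Endee function.

```python
def choose_strategy(num_matching, prefilter_cardinality_threshold=1000):
    """Pick a search strategy from the number of vectors passing the filter.

    Assumed behavior: few matches -> exact scan of the filtered subset,
    many matches -> approximate HNSW search with filtering.
    """
    if num_matching <= prefilter_cardinality_threshold:
        return "exact"
    return "hnsw"


for count in (50, 1000, 5000):
    print(f"{count} matching vectors -> {choose_strategy(count)}")
```

Under this reading, raising the threshold trades speed for exactness on selective filters.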

Vector Operations

Work directly with Endee to inspect, update, and delete vectors.

    # Fetch a vector by ID
    sample_nodes = index.as_retriever(similarity_top_k=1).retrieve("vector database")
    sample_id = sample_nodes[0].node.id_

    fetched = vector_store.fetch([sample_id])
    vec = fetched[0]
    print(f"Fetched: {sample_id}")
    print(f"  embedding dim: {len(vec.get('vector', []))}")
    print(f"  metadata: {vec.get('filter', {})}")
    print(f"  has sparse data: {'sparse_indices' in vec}\n")

    # Update metadata without re-embedding
    print(f"Before: {vec.get('filter', {})}")
    vector_store.update_filters([
        {"id": sample_id, "filter": {"category": "database", "status": "reviewed", "priority": "high"}}
    ])
    updated = vector_store.fetch([sample_id])
    print(f"After : {updated[0].get('filter', {})}\n")

    # Delete the vector
    vector_store.delete_vector(sample_id)
    print(f"Deleted: {sample_id}")

    info = vector_store.describe()
    print(f"Remaining vectors: {info.get('count')}")

Cleanup

Delete the index.

    vector_store.clear()
    print(f"Deleted: {INDEX_NAME}")

Key Methods

Each method handles a specific operation in the vector search workflow.

Method                                What it does
------------------------------------  ------------------------------------
VectorStoreIndex.from_documents()     Embed and store documents
as_retriever()                        Create a retriever for queries
retrieve(query)                       Find similar documents
fetch(ids)                            Get full vector data
update_filters()                      Change metadata without re-embedding
delete_vector(id)                     Delete a single vector
describe()                            View index stats

Quick Tips

Simple guidelines for common scenarios.

  • Settings.embed_model - Set globally so all operations use it automatically
  • sparse_model="endee_bm25" - Enables hybrid search automatically
  • similarity_top_k=2 or 3 - Usually enough for RAG context
  • MetadataFilters - Use EQ and IN operators to restrict results
  • dense_rrf_weight - 1.0=dense only, 0.5=balanced, 0.0=sparse only
  • ef parameter - Higher values improve recall but are slower
  • prefilter_cardinality_threshold - Switch point between HNSW and exact search
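As an illustration of the similarity_top_k tip, the sketch below joins a handful of retrieved passages into a single context string for a RAG prompt, best score first, stopping at a character budget. build_context and max_chars are hypothetical; in real code the (text, score) pairs would come from node.text and node.get_score() on the retrieved results.

```python
def build_context(results, max_chars=1000):
    """Join retrieved passages into one context string, best score first.

    results is a list of (text, score) pairs. Passages are added until the
    next one would exceed max_chars (separators are not counted).
    """
    ordered = sorted(results, key=lambda pair: pair[1], reverse=True)
    parts = []
    used = 0
    for text, _score in ordered:
        if used + len(text) > max_chars:
            break
        parts.append(text)
        used += len(text)
    return "\n\n".join(parts)


# Made-up passages and scores for illustration.
retrieved = [
    ("Python is known for readability.", 0.41),
    ("Vector databases store embeddings for similarity search.", 0.87),
]
print(build_context(retrieved, max_chars=200))
```

With only two or three passages, the budget is rarely hit, which is one reason a small similarity_top_k tends to be sufficient.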