
Endee + LangChain


Use Endee as a vector store in LangChain for semantic search, metadata filtering, hybrid retrieval, and RAG pipelines.

LangChain is a framework for building LLM-powered applications. Endee integrates with LangChain through the langchain-endee package, supporting document storage, similarity search, metadata filtering, and retrieval-augmented generation.


Installation

pip install -q langchain-endee langchain-huggingface sentence-transformers
pip install numpy==2.0.0

Authentication

Local server

API_TOKEN = "ndd-auth-token"  # same token set in NDD_AUTH_TOKEN on your server
BASE_URL = "http://127.0.0.1:8080/api/v1"

Endee Cloud

  1. Go to https://app.endee.io 
  2. Create a token
  3. Paste it below:
API_TOKEN = "your-serverless-token"
BASE_URL = ""  # leave empty - Endee figures out the cloud URL
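Hard-coding a token is fine for a notebook, but you may prefer to read it from the environment. The following is a minimal sketch under stated assumptions: it reuses the NDD_AUTH_TOKEN name mentioned above on the client side, and ENDEE_BASE_URL is a hypothetical variable name chosen for this example, not something Endee defines.

```python
import os


def load_endee_settings(env=None):
    """Return (api_token, base_url) read from environment-style variables."""
    env = os.environ if env is None else env
    # NDD_AUTH_TOKEN is the name used for the local server above; reusing it
    # on the client is an assumption. ENDEE_BASE_URL is a hypothetical name.
    api_token = env.get("NDD_AUTH_TOKEN", "")
    base_url = env.get("ENDEE_BASE_URL", "")  # empty string = Endee Cloud
    if not api_token:
        raise RuntimeError("Set NDD_AUTH_TOKEN before creating the vector store")
    return api_token, base_url


# Example with an explicit mapping instead of the real environment
API_TOKEN, BASE_URL = load_endee_settings(
    {
        "NDD_AUTH_TOKEN": "ndd-auth-token",
        "ENDEE_BASE_URL": "http://127.0.0.1:8080/api/v1",
    }
)
```

Calling load_endee_settings() with no argument reads from os.environ, and leaving the base URL unset yields the empty string used for Endee Cloud.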

Creating a vector store

Create a vector store with an embedding model:

from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_endee import EndeeVectorStore

# This model converts text into meaning fingerprints (embeddings)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Create the vector store (uses API_TOKEN and BASE_URL from above)
vector_store = EndeeVectorStore(
    index_name="rag_demo",   # name for this collection of documents
    api_token=API_TOKEN,
    base_url=BASE_URL,
    dimension=384,           # size of the meaning fingerprint (matches the model above)
    embedding=embeddings,
    space_type="cosine",     # how similarity is measured
    precision="int16",       # uses half the memory vs float32, with no noticeable accuracy loss
)
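To make space_type="cosine" concrete, here is a small standalone sketch of cosine similarity on plain Python lists. It illustrates the metric only; it is not Endee's implementation, and the function name is made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score about 1.0, regardless of length
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# Perpendicular vectors score 0.0
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

The dimension check mirrors why dimension=384 has to match the embedding model: vectors of different lengths cannot be compared.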

Adding documents

Documents have two parts: the text you want to search, and metadata (tags) you can filter by later.

documents = [
    Document(
        page_content="Python is a programming language known for readability.",
        metadata={"topic": "programming", "language": "python"},
    ),
    Document(
        page_content="Rust guarantees memory safety without a garbage collector.",
        metadata={"topic": "programming", "language": "rust"},
    ),
    # Add as many documents as you need...
]

vector_store.add_documents(documents)

Similarity search

Search by meaning - you don’t need exact words to match:

# Basic search - returns the 2 most relevant documents
results = vector_store.similarity_search("How to learn Python?", k=2)
for doc in results:
    print(doc.page_content)

# Search with scores - score closer to 1.0 means more relevant
scored = vector_store.similarity_search_with_score("memory safety", k=2)
for doc, score in scored:
    print(f"{score:.3f} - {doc.page_content}")

Search with metadata filters

Narrow results to specific subsets using metadata filters:

# Only return documents tagged as "programming"
results = vector_store.similarity_search(
    "semantic search",
    k=5,
    filter=[{"topic": {"$eq": "programming"}}],
)

# Return documents tagged as either "programming" or "database"
results = vector_store.similarity_search(
    "search algorithms",
    k=5,
    filter=[{"topic": {"$in": ["programming", "database"]}}],
)

Managing documents

Add, fetch, update tags, or delete documents at any time:

# Add more documents later
new_ids = vector_store.add_texts(
    texts=["Go is a concurrent programming language built by Google."],
    metadatas=[{"topic": "programming", "language": "go"}],
)

# Fetch a document by its ID
docs = vector_store.get_by_ids(new_ids)

# Update tags without re-processing the document (fast!)
vector_store.update_filters([
    {"id": new_ids[0], "filter": {"topic": "programming", "difficulty": "intermediate"}},
])

# Remove a document
vector_store.delete(ids=[new_ids[0]])

Use as a retriever (RAG)

RAG (Retrieval-Augmented Generation) means your AI answers questions using your own documents as context:

# Turn your vector store into a "retriever" that LangChain chains can use
retriever = vector_store.as_retriever(search_kwargs={"k": 3})

# Optionally, limit retrieval to a specific topic
retriever = vector_store.as_retriever(
    search_kwargs={"k": 3, "filter": [{"topic": {"$eq": "programming"}}]},
)

# Retrieve relevant docs for a question, then pass them to your LLM
docs = retriever.invoke("How do I learn Python?")
context = "\n\n".join([doc.page_content for doc in docs])
# → Pass `context` as part of your LLM prompt
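The last step, passing the retrieved context to an LLM, could look like the sketch below. It runs offline: Doc is a stand-in for langchain_core.documents.Document, and build_prompt and its prompt wording are illustrative choices, not part of langchain-endee.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document so the sketch runs without a server."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_prompt(question, docs):
    """Join retrieved passages into a context block and wrap it in a prompt."""
    context = "\n\n".join(doc.page_content for doc in docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


retrieved = [
    Doc("Python is a programming language known for readability."),
    Doc("Rust guarantees memory safety without a garbage collector."),
]
prompt = build_prompt("How do I learn Python?", retrieved)
```

In a real pipeline, the list returned by retriever.invoke(...) would take the place of retrieved, and prompt would be sent to the chat model of your choice.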

Hybrid search

Hybrid search matches both meaning and exact keywords. It is useful when users search for specific names, product codes, or technical terms.

from langchain_endee import RetrievalMode, EndeeModelSparse

sparse = EndeeModelSparse()  # handles keyword matching

hybrid_store = EndeeVectorStore(
    index_name="hybrid_demo",
    api_token=API_TOKEN,
    base_url=BASE_URL,
    dimension=384,
    embedding=embeddings,
    retrieval_mode=RetrievalMode.HYBRID,
    sparse_embedding=sparse,
)

hybrid_store.add_documents(documents)

# Now searches use both meaning AND keyword matching
results = hybrid_store.similarity_search("Python programming", k=3)
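This page does not say how Endee combines the dense (meaning) and sparse (keyword) results. As a rough intuition only, the sketch below merges two ranked lists with reciprocal rank fusion, a common generic technique. It is an assumption for illustration, not a description of Endee's internals, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs; items ranked high in any list score well."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings for the query "Python programming"
dense_ranking = ["doc_rust", "doc_python", "doc_go"]    # by meaning
sparse_ranking = ["doc_python", "doc_go", "doc_rust"]   # by keyword overlap

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
```

Here doc_python comes out on top because it ranks well in both lists, which is the behavior hybrid retrieval is meant to provide for queries containing exact terms.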

Cleanup

Remove indexes when you no longer need them:

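# `client` is not defined in the snippets above; it is assumed to be an
# Endee client object created with the same credentials.
client.delete_index("rag_demo")
client.delete_index("hybrid_demo")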

API reference

Method                                    Description
add_documents(docs)                       Upload and index documents
add_texts(texts, metadatas)               Upload plain text strings with tags
similarity_search(query, k)               Find the k most relevant documents
similarity_search_with_score(query, k)    Same as above, but also returns a relevance score (0-1)
get_by_ids(ids)                           Fetch specific documents by their ID
update_filters(updates)                   Update tags without re-uploading the document
delete(ids=...)                           Remove documents
as_retriever(...)                         Wrap as a LangChain retriever for use in AI pipelines

Configuration

Parameter     Description
k             Number of results to return. 3-5 is usually enough for RAG.
precision     Storage format. "int16" cuts memory in half with almost no accuracy loss.
space_type    How similarity is measured. "cosine" works well for most use cases.
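The memory claim for precision can be checked with simple arithmetic: a float32 value takes 4 bytes and an int16 value takes 2. The sketch below uses a simple symmetric scaling scheme to show the idea. How Endee actually quantizes vectors is not described here, so treat the scheme and the function names as assumptions.

```python
from array import array


def quantize_int16(vector):
    """Map floats onto the int16 range using one shared scale factor."""
    max_abs = max(abs(x) for x in vector) or 1.0  # avoid dividing by zero
    scale = max_abs / 32767
    codes = array("h", (round(x / scale) for x in vector))  # "h" = signed 16-bit
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the int16 codes."""
    return [c * scale for c in codes]


vector = [0.12, -0.5, 0.33, 0.9]
float32_bytes = array("f", vector).itemsize * len(vector)  # 4 bytes per value
codes, scale = quantize_int16(vector)
int16_bytes = codes.itemsize * len(codes)                  # 2 bytes per value

restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(vector, restored))
```

For this toy vector the rounding error stays below one scale step, which gives a sense of why the accuracy loss is usually small.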

Supported operators

Operator    Description           Example
$eq         Exact match           {"topic": {"$eq": "programming"}}
$in         One of a list         {"topic": {"$in": ["ai", "database"]}}
$range      Between two values    {"year": {"$range": [2020, 2024]}}
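To clarify what each operator means, here is a small local evaluator that applies a filter list to a metadata dictionary. It is a sketch of the semantics only: treating multiple clauses as AND and treating $range as inclusive on both ends are assumptions, and matches is a hypothetical helper, not part of langchain-endee.

```python
def matches(metadata, filters):
    """Return True if metadata satisfies every clause in the filter list."""
    for clause in filters:
        for key, condition in clause.items():
            value = metadata.get(key)
            for op, operand in condition.items():
                if op == "$eq":
                    ok = value == operand
                elif op == "$in":
                    ok = value in operand
                elif op == "$range":
                    low, high = operand
                    # Inclusive bounds are an assumption of this sketch
                    ok = value is not None and low <= value <= high
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
    return True


doc_meta = {"topic": "programming", "language": "python", "year": 2022}

is_programming = matches(doc_meta, [{"topic": {"$eq": "programming"}}])
is_ai_or_db = matches(doc_meta, [{"topic": {"$in": ["ai", "database"]}}])
in_year_range = matches(doc_meta, [{"year": {"$range": [2020, 2024]}}])
```

With this sample metadata, the $eq and $range checks pass and the $in check fails, since "programming" is not in the listed topics.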