LangChain Integration with Endee
LangChain is a framework for building LLM-powered applications. This integration uses Endee as a LangChain vector store — a drop-in replacement for any VectorStore-compatible backend. You get Endee’s hybrid search, metadata filtering, and persistent indexes while keeping the full LangChain interface: similarity_search(), as_retriever(), LCEL chains, and agents all work without modification.
Install Dependencies
# Core dependencies
pip install langchain-endee endee endee-model
# Pick an embedding model:
# Option A: Local (no API key)
pip install langchain-huggingface sentence-transformers
# Option B: OpenAI
# pip install langchain-openai
# Optional: SPLADE sparse embeddings for hybrid search
# pip install fastembed
Connect to Endee
Serverless: get a token from app.endee.io. Local: run Endee locally; no token needed (GitHub).
See Quick Start for setup details.
import os
from langchain_core.documents import Document
from endee import Endee, Precision
from langchain_endee import (
EndeeVectorStore,
RetrievalMode,
EndeeModelSparse, # native BM25 (server-side IDF via endee-model)
FastEmbedSparse, # SPLADE / BM25 via fastembed (optional)
)
ENDEE_TOKEN = os.environ.get("ENDEE_API_TOKEN", "")
Running on Google Colab? Use the Secrets tab (key icon in left sidebar) to store ENDEE_API_TOKEN. The notebook auto-detects Colab and reads secrets via google.colab.userdata.
Choose an Embedding Model
LangChain’s Embeddings interface provides a standard way to convert text into vectors. Any class implementing embed_documents() and embed_query() works — pick one below:
| Option | Model | Dimension | Needs API key? |
|---|---|---|---|
| A (local) | all-MiniLM-L6-v2 | 384 | No |
| B (cloud) | text-embedding-3-small | 1536 | Yes (OPENAI_API_KEY) |
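Because the interface is just those two methods, any duck-typed object works where `embeddings` is expected. A minimal sketch of a conforming class (deterministic hash-based vectors, illustrative only, with no semantic meaning):

```python
import hashlib

class ToyEmbeddings:
    """Deterministic stand-in with the same shape as a LangChain embedding model."""

    def __init__(self, dimension: int = 384):
        self.dimension = dimension

    def _embed(self, text: str) -> list[float]:
        # Hash the text and repeat the digest bytes to fill the vector.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        raw = (digest * (self.dimension // len(digest) + 1))[: self.dimension]
        return [b / 255.0 for b in raw]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> list[float]:
        return self._embed(text)
```

Real models like `HuggingFaceEmbeddings` and `OpenAIEmbeddings` implement the same pair of methods, which is why either can be swapped in below without touching the rest of the code.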
# Option A — runs locally, no API key
from langchain_huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
DIMENSION = 384
# Option B — OpenAI (uncomment to use)
# from langchain_openai import OpenAIEmbeddings
# embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# DIMENSION = 1536
Create the Vector Store
LangChain’s VectorStore base class provides factory methods for index creation and ingestion. from_documents() embeds and upserts all documents in a single call.
INDEX = "rag_demo"
vector_store = EndeeVectorStore.from_documents(
documents=documents,
embedding=embeddings,
index_name=INDEX,
api_token=ENDEE_TOKEN,
dimension=DIMENSION,
space_type="cosine",
precision=Precision.INT16,
force_recreate=True,
)
Alternative: from_texts() — pass raw strings and metadata lists instead of Document objects.
vector_store = EndeeVectorStore.from_texts(
texts=["Python is great", "Rust is fast"],
metadatas=[{"language": "python"}, {"language": "rust"}],
embedding=embeddings,
index_name="my_index",
api_token=ENDEE_TOKEN,
dimension=DIMENSION,
)
Search Methods
LangChain’s VectorStore defines four standard search methods. Each embeds the query (or accepts a pre-computed vector), runs approximate nearest-neighbour search, and returns Document objects:
| Method | Input | Returns |
|---|---|---|
| similarity_search() | query string | list[Document] |
| similarity_search_with_score() | query string | list[tuple[Document, float]] |
| similarity_search_by_vector() | embedding vector | list[Document] |
| similarity_search_by_vector_with_score() | embedding vector | list[tuple[Document, float]] |
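The score returned by the *_with_score variants depends on the index's space_type; with space_type="cosine" it is natural to read it as cosine similarity (an assumption here; check the behaviour of your Endee version). The underlying computation:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product of the two vectors divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical direction scores 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # 0.0
```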
Similarity Search
results = vector_store.similarity_search(query="How does RAG work?", k=3)
for doc in results:
    print(f" [{doc.metadata.get('topic')}] {doc.page_content[:70]}")
Similarity Search with Score
scored = vector_store.similarity_search_with_score(query="neural networks", k=3)
for doc, score in scored:
    print(f" sim={score:.3f} {doc.page_content[:60]}")
Similarity Search by Vector
query_vec = embeddings.embed_query("programming language safety")
results = vector_store.similarity_search_by_vector(embedding=query_vec, k=2)
Similarity Search by Vector with Score
scored_by_vec = vector_store.similarity_search_by_vector_with_score(
embedding=query_vec,
k=3,
filter=[{"topic": {"$eq": "programming"}}],
)
for doc, score in scored_by_vec:
    print(f" sim={score:.3f} {doc.page_content[:65]}")
Metadata Filters
All LangChain VectorStore search methods accept a filter parameter to narrow results by metadata. Filters are passed as a list of dicts (AND logic).
See Filtering for supported filter operators ($eq, $in, $range).
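The matching semantics can be sketched client-side (illustrative only; Endee evaluates filters server-side, and the $range key names gte/lte used below are an assumption, so see Filtering for the exact shape):

```python
def matches(metadata: dict, filters: list[dict]) -> bool:
    """Return True if the metadata satisfies every condition (AND logic)."""
    for condition in filters:
        for field, op in condition.items():
            value = metadata.get(field)
            if "$eq" in op and value != op["$eq"]:
                return False
            if "$in" in op and value not in op["$in"]:
                return False
            if "$range" in op:
                lo, hi = op["$range"].get("gte"), op["$range"].get("lte")
                if lo is not None and (value is None or value < lo):
                    return False
                if hi is not None and (value is None or value > hi):
                    return False
    return True

doc_meta = {"topic": "ai", "language": "python"}
print(matches(doc_meta, [{"topic": {"$eq": "ai"}}]))                   # True
print(matches(doc_meta, [{"language": {"$in": ["python", "rust"]}}]))  # True
print(matches(doc_meta, [{"topic": {"$eq": "database"}}]))             # False
```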
ai_docs = vector_store.similarity_search(
query="learning from data",
k=5,
filter=[{"topic": {"$eq": "ai"}}],
)
lang_docs = vector_store.similarity_search(
query="memory safety",
k=5,
filter=[{"language": {"$in": ["python", "rust"]}}],
)
Search Tuning Parameters
See Filtering for details on ef, prefilter_cardinality_threshold, and filter_boost_percentage.
advanced = vector_store.similarity_search_with_score(
query="vector search algorithms",
k=3,
ef=256,
filter=[{"topic": {"$eq": "database"}}],
prefilter_cardinality_threshold=5_000,
filter_boost_percentage=20,
include_vectors=False,
)
CRUD Operations
LangChain’s VectorStore interface provides methods for managing documents after initial ingestion: add_texts() to insert, get_by_ids() to fetch, and delete() to remove. Endee also supports update_filters() to modify metadata without re-embedding.
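add_texts() accepts batch_size (vectors per upsert request) and embedding_chunk_size (texts embedded per model call). The chunking pattern behind those parameters can be sketched as follows (an illustration of the idea, not Endee's internals):

```python
def chunked(items: list, size: int) -> list[list]:
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i : i + size] for i in range(0, len(items), size)]

texts = [f"doc {i}" for i in range(250)]
embed_chunks = chunked(texts, 100)     # 3 embedding calls for 250 texts
print([len(c) for c in embed_chunks])  # [100, 100, 50]
```

Smaller embedding chunks reduce peak memory per model call; larger upsert batches reduce round trips to the server.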
Add Texts
new_ids = vector_store.add_texts(
texts=[
"Go is a statically typed language designed at Google for scalable services.",
"TypeScript adds static typing to JavaScript for safer large codebases.",
],
metadatas=[
{"topic": "programming", "language": "go"},
{"topic": "programming", "language": "typescript"},
],
batch_size=1000,
embedding_chunk_size=100,
)
Get by IDs
fetched = vector_store.get_by_ids(new_ids)
for doc in fetched:
    print(f" [{doc.metadata.get('language')}] {doc.page_content[:60]}")
Update Filters
vector_store.update_filters([
{
"id": new_ids[0],
"filter": {"topic": "programming", "language": "go", "difficulty": "intermediate"},
},
])
Delete
# Delete by IDs
vector_store.delete(ids=[new_ids[1]])
# Delete by filter
vector_store.delete(filter=[{"language": {"$eq": "go"}}])
LangChain Retriever
as_retriever() wraps any VectorStore into a LangChain Retriever — the standard interface for plugging search into chains, agents, and RAG pipelines. It implements invoke(query) -> list[Document].
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What are vector databases used for?")
With metadata filters:
retriever_filtered = vector_store.as_retriever(
search_type="similarity",
search_kwargs={"k": 3, "filter": [{"topic": {"$eq": "ai"}}]},
)
docs_filtered = retriever_filtered.invoke("machine learning")
Hybrid Search
Hybrid search combines dense (semantic) and sparse (keyword) retrieval. Pass a sparse_embedding and retrieval_mode=RetrievalMode.HYBRID to enable it. All standard LangChain search methods then automatically fuse both signal types.
See Sparse Vectors (BM25) for sparse model options and Search for RRF tuning parameters.
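The fusion step is Reciprocal Rank Fusion: each ranking contributes weight / (rank_constant + rank) per document, and documents that rank well in both lists rise to the top. A plain-Python sketch (Endee performs this server-side; the rank_constant and dense_weight arguments mirror the rrf_rank_constant and dense_rrf_weight tuning parameters shown later on this page):

```python
def rrf_fuse(dense_ranking: list[str], sparse_ranking: list[str],
             rank_constant: int = 60, dense_weight: float = 0.5) -> list[str]:
    """Fuse two ranked ID lists with weighted Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for weight, ranking in ((dense_weight, dense_ranking),
                            (1.0 - dense_weight, sparse_ranking)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank_constant + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" appears near the top of both rankings, so it wins overall.
print(rrf_fuse(["a", "b", "c"], ["b", "c", "d"]))  # ['b', 'c', 'a', 'd']
```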
# Option A: EndeeModelSparse (recommended)
sparse = EndeeModelSparse()
# Option B: FastEmbedSparse with SPLADE
# sparse = FastEmbedSparse()
# Option C: FastEmbedSparse with BM25
# sparse = FastEmbedSparse(model_name="Qdrant/bm25", batch_size=256)
hybrid_store = EndeeVectorStore.from_documents(
documents=documents,
embedding=embeddings,
index_name="rag_demo_hybrid",
api_token=ENDEE_TOKEN,
dimension=DIMENSION,
space_type="cosine",
retrieval_mode=RetrievalMode.HYBRID,
sparse_embedding=sparse,
force_recreate=True,
)
Compare dense-only vs hybrid:
query = "vector database semantic search"
dense_hits = vector_store.similarity_search_with_score(query, k=3)
hybrid_hits = hybrid_store.similarity_search_with_score(query, k=3)
print("Dense only:")
for doc, score in dense_hits:
    print(f" [{score:.3f}] {doc.page_content[:65]}")
print("\nHybrid (dense + BM25):")
for doc, score in hybrid_hits:
    print(f" [{score:.3f}] {doc.page_content[:65]}")
Tune RRF (Reciprocal Rank Fusion)
rrf_hits = hybrid_store.similarity_search_with_score(
query,
k=3,
rrf_rank_constant=60,
dense_rrf_weight=0.7,
)
Full RAG Chain
This uses LangChain’s LCEL (LangChain Expression Language) to compose a retrieval-augmented generation pipeline:
- Retriever fetches relevant documents
- ChatPromptTemplate formats the context + question into a prompt
- HuggingFacePipeline runs a local LLM (no API key needed)
- RunnablePassthrough passes the user question through unchanged
- StrOutputParser extracts the text from the LLM response
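The | operator works because LangChain Runnables overload it to compose steps, each one's output feeding the next's input. Conceptually (a minimal sketch, not LangChain's actual implementation):

```python
class Pipe:
    """Wrap a callable so that (Pipe(f) | g) composes into f-then-g."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Accept either another Pipe or a plain callable on the right.
        other_func = other.func if isinstance(other, Pipe) else other
        return Pipe(lambda x: other_func(self.func(x)))

    def invoke(self, x):
        return self.func(x)

retrieve = Pipe(lambda q: [f"doc about {q}"])
fmt = lambda docs: "\n".join(docs)
answer = lambda ctx: f"Answer based on: {ctx}"

chain = retrieve | fmt | answer
print(chain.invoke("RAG"))  # Answer based on: doc about RAG
```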
from langchain_huggingface import HuggingFacePipeline
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
retriever = hybrid_store.as_retriever(search_kwargs={"k": 3})
# Runs locally — no API key needed
llm = HuggingFacePipeline.from_model_id(
model_id="google/flan-t5-base",
task="text2text-generation",
pipeline_kwargs={"max_new_tokens": 256},
)
prompt = ChatPromptTemplate.from_template(
"Answer the question based only on the context below.\n\n"
"Context:\n{context}\n\n"
"Question: {question}"
)
rag_chain = (
{"context": retriever | format_docs, "question": RunnablePassthrough()}
| prompt
| llm
| StrOutputParser()
)
answer = rag_chain.invoke("What is deep learning and what does it power?")
print(answer)
Reconnect to an Existing Index
LangChain’s from_existing_index() connects to a previously created index without re-ingesting — ideal for production use.
existing = EndeeVectorStore.from_existing_index(
index_name="rag_demo",
embedding=embeddings,
api_token=ENDEE_TOKEN,
)
docs = existing.similarity_search("Python", k=1)
Key Takeaways
- EndeeVectorStore implements LangChain’s VectorStore interface — all standard methods work.
- Embeddings — any LangChain embedding model plugs in directly.
- as_retriever() — wraps the store into a standard LangChain Retriever for chains and agents.
- LCEL — compose retriever + prompt + LLM into a RAG pipeline with the | operator.
- Hybrid search — combines dense and sparse retrieval; see Sparse Vectors (BM25).
- from_existing_index() — reconnect in production without re-ingesting.