Endee + LangChain
Use Endee as a vector store in LangChain for semantic search, metadata filtering, hybrid retrieval, and RAG pipelines.
LangChain is a framework for building LLM-powered applications. Endee integrates with LangChain through the langchain-endee package, supporting document storage, similarity search, metadata filtering, and retrieval-augmented generation.
Installation
pip install -q langchain-endee langchain-huggingface sentence-transformers
pip install numpy==2.0.0
Authentication
Local server
API_TOKEN = "ndd-auth-token" # same token set in NDD_AUTH_TOKEN on your server
BASE_URL = "http://127.0.0.1:8080/api/v1"
Endee Cloud
- Go to https://app.endee.io
- Create a token
- Paste it below:
API_TOKEN = "your-serverless-token"
BASE_URL = ""  # leave empty - Endee figures out the cloud URL
Creating a vector store
Create a vector store with an embedding model:
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_endee import EndeeVectorStore
# This model converts text into meaning fingerprints (embeddings)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
# Create the vector store (uses API_TOKEN and BASE_URL from above)
vector_store = EndeeVectorStore(
    index_name="rag_demo",  # name for this collection of documents
    api_token=API_TOKEN,
    base_url=BASE_URL,
    dimension=384,  # size of the meaning fingerprint (matches the model above)
    embedding=embeddings,
    space_type="cosine",  # how similarity is measured
    precision="int16",  # half the memory of float32, with almost no accuracy loss
)
Adding documents
Documents have two parts: the text you want to search, and metadata (tags) you can filter by later.
documents = [
    Document(
        page_content="Python is a programming language known for readability.",
        metadata={"topic": "programming", "language": "python"},
    ),
    Document(
        page_content="Rust guarantees memory safety without a garbage collector.",
        metadata={"topic": "programming", "language": "rust"},
    ),
    # Add as many documents as you need...
]
vector_store.add_documents(documents)
Similarity search
Search by meaning - you don’t need exact words to match:
# Basic search - returns the 2 most relevant documents
results = vector_store.similarity_search("How to learn Python?", k=2)
for doc in results:
    print(doc.page_content)
# Search with scores - score closer to 1.0 means more relevant
scored = vector_store.similarity_search_with_score("memory safety", k=2)
for doc, score in scored:
    print(f"{score:.3f} - {doc.page_content}")
Search with metadata filters
Narrow results to specific subsets using metadata filters:
# Only return documents tagged as "programming"
results = vector_store.similarity_search(
    "semantic search",
    k=5,
    filter=[{"topic": {"$eq": "programming"}}],
)
# Return documents tagged as either "programming" or "database"
results = vector_store.similarity_search(
    "search algorithms",
    k=5,
    filter=[{"topic": {"$in": ["programming", "database"]}}],
)
Managing documents
Add, fetch, update tags, or delete documents at any time:
# Add more documents later
new_ids = vector_store.add_texts(
    texts=["Go is a concurrent programming language built by Google."],
    metadatas=[{"topic": "programming", "language": "go"}],
)
# Fetch a document by its ID
docs = vector_store.get_by_ids(new_ids)
# Update tags without re-processing the document (fast!)
vector_store.update_filters([
    {"id": new_ids[0], "filter": {"topic": "programming", "difficulty": "intermediate"}},
])
# Remove a document
vector_store.delete(ids=[new_ids[0]])
Use as a retriever (RAG)
RAG (Retrieval-Augmented Generation) means your AI answers questions using your own documents as context:
# Turn your vector store into a "retriever" that LangChain chains can use
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
# Optionally, limit retrieval to a specific topic
retriever = vector_store.as_retriever(
    search_kwargs={"k": 3, "filter": [{"topic": {"$eq": "programming"}}]},
)
# Retrieve relevant docs for a question, then pass them to your LLM
docs = retriever.invoke("How do I learn Python?")
context = "\n\n".join([doc.page_content for doc in docs])
# → Pass `context` as part of your LLM prompt
Hybrid search
Hybrid search matches both meaning and exact keywords. Useful when users search for specific names, product codes, or technical terms.
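One common way to combine a meaning-based (dense) ranking with a keyword-based (sparse) ranking is reciprocal rank fusion (RRF). The sketch below is a generic illustration of that idea, not a description of how Endee merges results internally. The document IDs, the `rrf_fuse` helper, and the constant `rrf_k=60` are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, rrf_k=60):
    """Fuse several ranked ID lists with reciprocal rank fusion.

    Each document scores sum(1 / (rrf_k + rank)) over the lists it appears
    in, with rank starting at 1. A higher fused score means a better rank.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rrf_k + rank)
    # Best fused score first; ties are broken by ID so the order is stable
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings for the query "Python programming"
dense_ranking = ["doc_python", "doc_rust", "doc_go"]  # ordered by meaning
sparse_ranking = ["doc_go", "doc_python"]             # ordered by keyword overlap

fused = rrf_fuse([dense_ranking, sparse_ranking])
print(fused)
```

A document that ranks well in both lists rises to the top, while one that appears in only a single list drops. In the configuration that follows, the store handles combining the two signals for you once `retrieval_mode` is set to hybrid.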
from langchain_endee import RetrievalMode, EndeeModelSparse
sparse = EndeeModelSparse() # handles keyword matching
hybrid_store = EndeeVectorStore(
    index_name="hybrid_demo",
    api_token=API_TOKEN,
    base_url=BASE_URL,
    dimension=384,
    embedding=embeddings,
    retrieval_mode=RetrievalMode.HYBRID,
    sparse_embedding=sparse,
)
hybrid_store.add_documents(documents)
# Now searches use both meaning AND keyword matching
results = hybrid_store.similarity_search("Python programming", k=3)
Cleanup
Remove indexes when you no longer need them:
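# `client` is not defined earlier on this page; it is assumed here to be an
# Endee SDK client created with the same API token.
client.delete_index("rag_demo")
client.delete_index("hybrid_demo")
API reference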
| Method | Description |
|---|---|
| `add_documents(docs)` | Upload and index documents |
| `add_texts(texts, metadatas)` | Upload plain text strings with tags |
| `similarity_search(query, k)` | Find the k most relevant documents |
| `similarity_search_with_score(query, k)` | Same as above, but also returns a relevance score (0-1) |
| `get_by_ids(ids)` | Fetch specific documents by their ID |
| `update_filters(updates)` | Update tags without re-uploading the document |
| `delete(ids=...)` | Remove documents |
| `as_retriever(...)` | Wrap as a LangChain retriever for use in AI pipelines |
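
The relevance score in the table depends on the index's `space_type`. As a rough illustration of what a cosine score measures, the sketch below computes cosine similarity between small hand-written vectors using only the standard library. The vectors and the `cosine_similarity` helper are made up for the example; real embeddings from all-MiniLM-L6-v2 have 384 dimensions, and how Endee maps raw similarity to the reported score is not covered here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # same direction as the query, different length
unrelated = [0.0, 1.0, 0.0]       # orthogonal to the query

print(round(cosine_similarity(query, same_direction), 3))
print(round(cosine_similarity(query, unrelated), 3))
```

Vectors pointing the same way score near 1 regardless of their length, which is why cosine is a common default for text embeddings.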
Configuration
| Parameter | Description |
|---|---|
| `k` | Number of results to return. 3-5 is usually enough for RAG. |
| `precision` | Storage format. `"int16"` halves memory compared with float32, with almost no accuracy loss. |
| `space_type` | How similarity is measured. `"cosine"` works well for most use cases. |
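
To make the memory claim for `precision` concrete, the sketch below compares the raw storage for one 384-dimensional vector as 32-bit floats and as 16-bit integers, then round-trips a few values through a simple quantization step. The quantization scheme (one scale factor per vector, based on the largest absolute value) is an assumption chosen for illustration; Endee's actual encoding may differ.

```python
import struct

DIMENSION = 384  # matches the embedding model used on this page

BYTES_FLOAT32 = struct.calcsize("<f")  # standard size of a 32-bit float
BYTES_INT16 = struct.calcsize("<h")    # standard size of a 16-bit integer


def quantize_int16(values):
    """Map floats to int16 using one scale factor per vector (illustrative)."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid dividing by zero
    scale = 32767 / max_abs
    return [round(v * scale) for v in values], scale


def dequantize_int16(quantized, scale):
    """Recover approximate floats from the int16 values."""
    return [q / scale for q in quantized]


values = [0.12, -0.5, 0.33, 0.0]
quantized, scale = quantize_int16(values)
restored = dequantize_int16(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(values, restored))

print("float32 bytes per vector:", DIMENSION * BYTES_FLOAT32)
print("int16 bytes per vector:", DIMENSION * BYTES_INT16)
print("largest round-trip error:", max_error)
```

The round-trip error here is tiny relative to the values themselves, which gives some intuition for why a 16-bit format can be a reasonable trade for half the storage.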
Supported operators
| Operator | Description | Example |
|---|---|---|
| `$eq` | Exact match | `{"topic": {"$eq": "programming"}}` |
| `$in` | One of a list | `{"topic": {"$in": ["ai", "database"]}}` |
| `$range` | Between two values | `{"year": {"$range": [2020, 2024]}}` |
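
The operator semantics in the table can be sketched in plain Python. This is an illustration of how such filters might be evaluated against a document's metadata, not Endee's implementation. The `matches` helper is hypothetical, `$range` is assumed to be inclusive at both ends, and a list of conditions is assumed to mean that all of them must hold.

```python
def matches(metadata, filters):
    """Return True if metadata satisfies every condition in filters."""
    for condition in filters:
        for field, spec in condition.items():
            value = metadata.get(field)
            for op, operand in spec.items():
                if op == "$eq":
                    ok = value == operand
                elif op == "$in":
                    ok = value in operand
                elif op == "$range":
                    low, high = operand
                    # Assumed inclusive bounds; a missing field never matches
                    ok = value is not None and low <= value <= high
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
    return True


doc_meta = {"topic": "programming", "language": "rust", "year": 2021}

print(matches(doc_meta, [{"topic": {"$eq": "programming"}}]))
print(matches(doc_meta, [{"topic": {"$in": ["ai", "database"]}}]))
print(matches(doc_meta, [{"year": {"$range": [2020, 2024]}}]))
```

Running a helper like this over a handful of sample metadata dictionaries can be a quick way to sanity-check a filter expression before sending it to the server.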