
LangChain Integration with Endee

Open In Colab

LangChain is a framework for building LLM-powered applications. This integration uses Endee as a LangChain vector store, a drop-in replacement for any VectorStore-compatible backend. You get Endee’s hybrid search, metadata filtering, and persistent indexes while keeping the full LangChain interface: similarity_search(), as_retriever(), LCEL chains, and agents all work without modification.


Install Dependencies

# Core dependencies
pip install langchain-endee endee endee-model

# Pick an embedding model:
# Option A: Local (no API key)
pip install langchain-huggingface sentence-transformers
# Option B: OpenAI
# pip install langchain-openai

# Optional: SPLADE sparse embeddings for hybrid search
# pip install fastembed

Connect to Endee

Serverless: get a token from app.endee.io. Local: run Endee locally; no token needed (GitHub).

See Quick Start for setup details.

import os

from langchain_core.documents import Document
from endee import Endee, Precision
from langchain_endee import (
    EndeeVectorStore,
    RetrievalMode,
    EndeeModelSparse,  # native BM25 (server-side IDF via endee-model)
    FastEmbedSparse,   # SPLADE / BM25 via fastembed (optional)
)

ENDEE_TOKEN = os.environ.get("ENDEE_API_TOKEN", "")

Running on Google Colab? Use the Secrets tab (key icon in left sidebar) to store ENDEE_API_TOKEN. The notebook auto-detects Colab and reads secrets via google.colab.userdata.
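The auto-detection pattern the note describes can be sketched as follows. This is illustrative: the helper name is an assumption, and the fallback to the environment variable mirrors the snippet above.

```python
import os

def get_endee_token() -> str:
    """Read ENDEE_API_TOKEN from Colab secrets when available,
    falling back to the environment variable otherwise."""
    try:
        from google.colab import userdata  # only importable on Colab
        return userdata.get("ENDEE_API_TOKEN")
    except ImportError:
        return os.environ.get("ENDEE_API_TOKEN", "")

ENDEE_TOKEN = get_endee_token()
```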

Choose an Embedding Model

LangChain’s Embeddings interface provides a standard way to convert text into vectors. Any class implementing embed_documents() and embed_query() works — pick one below:

Option      Model                    Dimension   Needs API key?
A (local)   all-MiniLM-L6-v2         384         No
B (cloud)   text-embedding-3-small   1536        Yes (OPENAI_API_KEY)
# Option A — runs locally, no API key
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
DIMENSION = 384

# Option B — OpenAI (uncomment to use)
# from langchain_openai import OpenAIEmbeddings
# embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# DIMENSION = 1536
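To make the two-method contract concrete, here is a toy stand-in for the Embeddings interface. The class name and hashing scheme are invented for illustration; a real model (like either option above) returns semantically meaningful vectors, and in practice you would subclass langchain_core.embeddings.Embeddings.

```python
import hashlib

class ToyHashEmbeddings:
    """Illustrative stand-in for LangChain's Embeddings interface:
    any object with embed_documents() and embed_query() plugs in.
    The vectors here are hash-derived and carry no semantics."""

    def __init__(self, dimension: int = 8):
        self.dimension = dimension

    def _embed(self, text: str) -> list[float]:
        # Deterministic pseudo-embedding from a SHA-256 digest.
        digest = hashlib.sha256(text.encode()).digest()
        return [b / 255.0 for b in digest[: self.dimension]]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> list[float]:
        return self._embed(text)

emb = ToyHashEmbeddings()
vec = emb.embed_query("hello")
```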

Create the Vector Store

LangChain’s VectorStore base class provides factory methods for index creation and ingestion. from_documents() embeds and upserts all documents in a single call.

INDEX = "rag_demo"

vector_store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    index_name=INDEX,
    api_token=ENDEE_TOKEN,
    dimension=DIMENSION,
    space_type="cosine",
    precision=Precision.INT16,
    force_recreate=True,
)

Alternative: from_texts() — pass raw strings and metadata lists instead of Document objects.

vector_store = EndeeVectorStore.from_texts(
    texts=["Python is great", "Rust is fast"],
    metadatas=[{"language": "python"}, {"language": "rust"}],
    embedding=embeddings,
    index_name="my_index",
    api_token=ENDEE_TOKEN,
    dimension=DIMENSION,
)

Search Methods

LangChain’s VectorStore defines four standard search methods. Each embeds the query (or accepts a pre-computed vector), runs approximate nearest-neighbour search, and returns Document objects:

Method                                     Input              Returns
similarity_search()                        query string       list[Document]
similarity_search_with_score()             query string       list[tuple[Document, float]]
similarity_search_by_vector()              embedding vector   list[Document]
similarity_search_by_vector_with_score()   embedding vector   list[tuple[Document, float]]
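Conceptually, each of these methods scores the query vector against stored vectors and returns the top k. A minimal exact-search sketch of that scoring is below; it is illustrative only, since Endee runs approximate nearest-neighbour search server-side rather than a brute-force scan.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def brute_force_search(query_vec, corpus, k=3):
    """corpus: list of (text, vector). Returns top-k (text, score)."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

corpus = [
    ("doc a", [1.0, 0.0]),
    ("doc b", [0.7, 0.7]),
    ("doc c", [0.0, 1.0]),
]
hits = brute_force_search([1.0, 0.1], corpus, k=2)
```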
results = vector_store.similarity_search(query="How does RAG work?", k=3)
for doc in results:
    print(f"  [{doc.metadata.get('topic')}] {doc.page_content[:70]}")

Similarity Search with Score

scored = vector_store.similarity_search_with_score(query="neural networks", k=3)
for doc, score in scored:
    print(f"  sim={score:.3f} {doc.page_content[:60]}")

Similarity Search by Vector

query_vec = embeddings.embed_query("programming language safety")
results = vector_store.similarity_search_by_vector(embedding=query_vec, k=2)

Similarity Search by Vector with Score

scored_by_vec = vector_store.similarity_search_by_vector_with_score(
    embedding=query_vec,
    k=3,
    filter=[{"topic": {"$eq": "programming"}}],
)
for doc, score in scored_by_vec:
    print(f"  sim={score:.3f} {doc.page_content[:65]}")

Metadata Filters

All LangChain VectorStore search methods accept a filter parameter to narrow results by metadata. Filters are passed as a list of dicts (AND logic).

See Filtering for supported filter operators ($eq, $in, $range).

ai_docs = vector_store.similarity_search(
    query="learning from data",
    k=5,
    filter=[{"topic": {"$eq": "ai"}}],
)

lang_docs = vector_store.similarity_search(
    query="memory safety",
    k=5,
    filter=[{"language": {"$in": ["python", "rust"]}}],
)
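To make the AND semantics of the filter list concrete, here is a client-side sketch of how the $eq and $in clauses above evaluate against document metadata. This is purely illustrative; Endee evaluates filters server-side, and this helper is not part of the library.

```python
def matches(metadata: dict, filters: list[dict]) -> bool:
    """Illustrative evaluation of the filter shape used above:
    a list of clauses combined with AND, each clause mapping a
    metadata field to an operator dict."""
    for clause in filters:
        for field, condition in clause.items():
            value = metadata.get(field)
            for op, operand in condition.items():
                if op == "$eq" and value != operand:
                    return False
                if op == "$in" and value not in operand:
                    return False
    return True

docs = [
    {"topic": "ai", "language": "python"},
    {"topic": "programming", "language": "rust"},
]
# Both clauses must hold (AND logic):
f = [{"language": {"$in": ["python", "rust"]}}, {"topic": {"$eq": "ai"}}]
kept = [d for d in docs if matches(d, f)]
```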

Search Tuning Parameters

See Filtering for details on ef, prefilter_cardinality_threshold, and filter_boost_percentage.

advanced = vector_store.similarity_search_with_score(
    query="vector search algorithms",
    k=3,
    ef=256,
    filter=[{"topic": {"$eq": "database"}}],
    prefilter_cardinality_threshold=5_000,
    filter_boost_percentage=20,
    include_vectors=False,
)

CRUD Operations

LangChain’s VectorStore interface provides methods for managing documents after initial ingestion: add_texts() to insert, get_by_ids() to fetch, and delete() to remove. Endee also supports update_filters() to modify metadata without re-embedding.

Add Texts

new_ids = vector_store.add_texts(
    texts=[
        "Go is a statically typed language designed at Google for scalable services.",
        "TypeScript adds static typing to JavaScript for safer large codebases.",
    ],
    metadatas=[
        {"topic": "programming", "language": "go"},
        {"topic": "programming", "language": "typescript"},
    ],
    batch_size=1000,
    embedding_chunk_size=100,
)

Get by IDs

fetched = vector_store.get_by_ids(new_ids)
for doc in fetched:
    print(f"  [{doc.metadata.get('language')}] {doc.page_content[:60]}")

Update Filters

vector_store.update_filters([
    {
        "id": new_ids[0],
        "filter": {"topic": "programming", "language": "go", "difficulty": "intermediate"},
    },
])

Delete

# Delete by IDs
vector_store.delete(ids=[new_ids[1]])

# Delete by filter
vector_store.delete(filter=[{"language": {"$eq": "go"}}])

LangChain Retriever

as_retriever() wraps any VectorStore into a LangChain Retriever — the standard interface for plugging search into chains, agents, and RAG pipelines. It implements invoke(query) -> list[Document].

retriever = vector_store.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What are vector databases used for?")

With metadata filters:

retriever_filtered = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3, "filter": [{"topic": {"$eq": "ai"}}]},
)
docs_filtered = retriever_filtered.invoke("machine learning")

Hybrid Search

Hybrid search combines dense (semantic) and sparse (keyword) retrieval. Pass a sparse_embedding and retrieval_mode=RetrievalMode.HYBRID to enable it. All standard LangChain search methods then automatically fuse both signal types.

See Sparse Vectors (BM25) for sparse model options and Search for RRF tuning parameters.

# Option A: EndeeModelSparse (recommended)
sparse = EndeeModelSparse()

# Option B: FastEmbedSparse with SPLADE
# sparse = FastEmbedSparse()

# Option C: FastEmbedSparse with BM25
# sparse = FastEmbedSparse(model_name="Qdrant/bm25", batch_size=256)

hybrid_store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    index_name="rag_demo_hybrid",
    api_token=ENDEE_TOKEN,
    dimension=DIMENSION,
    space_type="cosine",
    retrieval_mode=RetrievalMode.HYBRID,
    sparse_embedding=sparse,
    force_recreate=True,
)

Compare dense-only vs hybrid:

query = "vector database semantic search"
dense_hits = vector_store.similarity_search_with_score(query, k=3)
hybrid_hits = hybrid_store.similarity_search_with_score(query, k=3)

print("Dense only:")
for doc, score in dense_hits:
    print(f"  [{score:.3f}] {doc.page_content[:65]}")

print("\nHybrid (dense + BM25):")
for doc, score in hybrid_hits:
    print(f"  [{score:.3f}] {doc.page_content[:65]}")

Tune RRF (Reciprocal Rank Fusion)

rrf_hits = hybrid_store.similarity_search_with_score(
    query,
    k=3,
    rrf_rank_constant=60,
    dense_rrf_weight=0.7,
)
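To see what these knobs control, here is a minimal sketch of weighted Reciprocal Rank Fusion: each list is document ids in rank order, and each contributes weight / (rank_constant + rank) to a document's fused score. The exact server-side formula may differ; this only illustrates the roles of rrf_rank_constant and dense_rrf_weight.

```python
def rrf_fuse(dense_ranked, sparse_ranked, rank_constant=60, dense_weight=0.7):
    """Fuse two ranked id lists with weighted RRF; returns ids
    sorted by fused score, best first."""
    scores: dict[str, float] = {}
    for rank, doc_id in enumerate(dense_ranked, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + dense_weight / (rank_constant + rank)
    for rank, doc_id in enumerate(sparse_ranked, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + (1 - dense_weight) / (rank_constant + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "a" tops the dense list and is second in the sparse list, so it
# accumulates the highest fused score.
fused = rrf_fuse(["a", "b", "c"], ["c", "a", "d"])
```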

Full RAG Chain

This uses LangChain’s LCEL (LangChain Expression Language) to compose a retrieval-augmented generation pipeline:

  • Retriever fetches relevant documents
  • ChatPromptTemplate formats the context + question into a prompt
  • HuggingFacePipeline runs a local LLM (no API key needed)
  • RunnablePassthrough passes the user question through unchanged
  • StrOutputParser extracts the text from the LLM response
from langchain_huggingface import HuggingFacePipeline
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

retriever = hybrid_store.as_retriever(search_kwargs={"k": 3})

# Runs locally — no API key needed
llm = HuggingFacePipeline.from_model_id(
    model_id="google/flan-t5-base",
    task="text2text-generation",
    pipeline_kwargs={"max_new_tokens": 256},
)

prompt = ChatPromptTemplate.from_template(
    "Answer the question based only on the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}"
)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What is deep learning and what does it power?")
print(answer)

Reconnect to an Existing Index

LangChain’s from_existing_index() connects to a previously created index without re-ingesting — ideal for production use.

existing = EndeeVectorStore.from_existing_index(
    index_name="rag_demo",
    embedding=embeddings,
    api_token=ENDEE_TOKEN,
)

docs = existing.similarity_search("Python", k=1)

Key Takeaways

  • EndeeVectorStore implements LangChain’s VectorStore interface — all standard methods work.
  • Embeddings — any LangChain embedding model plugs in directly.
  • as_retriever() — wraps the store into a standard LangChain Retriever for chains and agents.
  • LCEL — compose retriever + prompt + LLM into a RAG pipeline with the | operator.
  • Hybrid search — combines dense and sparse retrieval; see Sparse Vectors (BM25).
  • from_existing_index() — reconnect in production without re-ingesting.