Endee + CrewAI

Use Endee as the memory backend for CrewAI agents - semantic search, metadata filtering, and hybrid retrieval with no OpenAI key required.

CrewAI lets you build AI agents that work together. By default, those agents store their memory in ChromaDB. This guide shows how to swap ChromaDB for Endee using EndeeVectorStore: same interface, different backend.


Requirements


Installation

pip install crewai-endee sentence-transformers==3.0.0
pip install numpy==2.0.0

Pin sentence-transformers to 3.0.0 exactly. You’ll see a version conflict warning - that’s expected.
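Since the pins matter, it can help to confirm what actually got installed before going further. The following is a minimal sketch using only the standard library; the helper name installed_version is made up for this example, and the expected versions simply mirror the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Versions taken from the install commands in this guide.
expected = {"sentence-transformers": "3.0.0", "numpy": "2.0.0"}

for package, wanted in expected.items():
    found = installed_version(package)
    status = "ok" if found == wanted else f"expected {wanted}"
    print(f"{package}: {found} ({status})")
```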


Imports

import os
import time
from getpass import getpass

from crewai_endee import EndeeVectorStore

Authentication

Local server

API_TOKEN = "ndd-auth-token"  # same token set in NDD_AUTH_TOKEN on your server
BASE_URL = "http://127.0.0.1:8080/api/v1"

Endee Cloud

  1. Go to https://app.endee.io 
  2. Create a token
  3. Paste it below:
API_TOKEN = "your-serverless-token"
BASE_URL = ""  # leave empty - Endee figures out the cloud URL
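Hardcoding a token is fine for a quick demo, but the imports above already pull in os and getpass, which suggests reading it at runtime instead. One way to do that is sketched below. The environment variable name ENDEE_API_TOKEN and the helper load_token are assumptions made for this example, not something Endee defines.

```python
import os
from getpass import getpass


def load_token(env=os.environ, prompt=getpass):
    """Return the token from the environment, falling back to a hidden prompt."""
    token = env.get("ENDEE_API_TOKEN", "").strip()  # hypothetical variable name
    if not token:
        # Nothing in the environment: ask interactively without echoing input.
        token = prompt("Endee API token: ").strip()
    return token
```

With this in place, API_TOKEN = load_token() replaces the hardcoded string, and the token stays out of the notebook or script.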

Creating a memory store

Create a memory store that your agents will read and write to:

embedder_config = {
    "provider": "sentence-transformer",
    "config": {
        "model_name": "all-MiniLM-L6-v2",  # runs locally, no API key needed
        "device": "cpu",  # change to "cuda" or "mps" for GPU
    },
}

store = EndeeVectorStore(
    type="demo_dense",  # a name for this memory store
    embedder_config=embedder_config,
    api_token=API_TOKEN,
    space_type="cosine",  # how similarity is measured
    precision="int8",  # memory compression level
    base_url=BASE_URL,
)

store.reset()  # wipe any old data with the same name
store.ensure_index()  # create a fresh, empty store

store.reset() deletes the old store and clears an internal cache. Without clearing the cache, Endee would still think the old store exists and throw errors when you save new data.
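To get a feel for what precision="int8" trades away, here is a small, self-contained sketch of symmetric int8 quantization in plain Python. It illustrates the general idea only. Endee's actual compression scheme is not described in this guide and may differ.

```python
def quantize_int8(vector):
    """Map floats to integers in [-127, 127] using one scale per vector."""
    max_abs = max(abs(x) for x in vector) or 1.0  # avoid dividing by zero
    scale = max_abs / 127
    return [round(x / scale) for x in vector], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


vector = [0.12, -0.5, 0.33, 0.9]
codes, scale = quantize_int8(vector)
restored = dequantize(codes, scale)
worst_error = max(abs(a - b) for a, b in zip(vector, restored))
print(codes, f"max error: {worst_error:.4f}")
```

Each value now fits in one byte instead of four, and the rounding error per component is at most half of scale, which is why the quality loss tends to be small.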


Adding documents

Each document is a piece of text plus optional metadata tags. Agents search through these later.

documents = [
    (
        "Python is a high-level, interpreted language designed by Guido van Rossum in 1991. "
        "Typing: dynamic, strong. Uses: AI/ML, web, scripting.",
        {"lang": "Python", "year": 1991, "typing": "dynamic"},
    ),
    (
        "Java follows 'write once, run anywhere' on the JVM, designed by James Gosling in 1995. "
        "Typing: static, strong. Uses: enterprise, Android, backend.",
        {"lang": "Java", "year": 1995, "typing": "static"},
    ),
    # ... more documents
]

for text, meta in documents:
    store.save(value=text, metadata=meta)

The metadata fields (lang, year, typing) become filterable - you can search within a subset of your data.


Ask questions in plain English. Endee finds the most relevant documents by meaning.

time.sleep(2)  # wait a moment - Endee indexes in the background

queries = [
    "Who created the Go programming language?",
    "Which languages use dynamic typing?",
    "Languages suitable for cloud-native microservices",
]

for query in queries:
    results = store.search(query, limit=2)
    print(f"Query: '{query}'")
    for r in results:
        print(f"  [{r['score']:.3f}] {r['content'][:80]}")
    print()

Each result includes:

  • id - the document’s unique ID
  • content - the original text
  • metadata - the tags you attached
  • score - how similar it is to your query (0 to 1, higher is better)
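The fixed time.sleep(2) before searching works for a demo, but because indexing happens in the background, polling until results appear is a sturdier pattern. The sketch below assumes only that the search call returns an empty list while nothing matches yet. wait_for_results is a hypothetical helper, not part of crewai_endee.

```python
import time


def wait_for_results(search, query, timeout=10.0, interval=0.5, **kwargs):
    """Call `search` repeatedly until it returns results or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        results = search(query, **kwargs)
        # Stop as soon as something comes back, or when time runs out.
        if results or time.monotonic() >= deadline:
            return results
        time.sleep(interval)
```

A call such as wait_for_results(store.search, "Who created the Go programming language?", limit=2) would then stand in for the sleep-then-search pair.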

Hybrid search combines two approaches:

  • Dense search - finds results that mean the same thing
  • BM25 keyword search - finds results that use the same words

To enable it, add sparse_model_name="endee_bm25" when creating the store:

hybrid_store = EndeeVectorStore(
    type="demo_hybrid",
    embedder_config=embedder_config,
    api_token=API_TOKEN,
    space_type="cosine",
    sparse_model_name="endee_bm25",  # ← this enables hybrid mode
    base_url=BASE_URL,
)

hybrid_store.reset()
hybrid_store.ensure_index()

for text, meta in documents:
    hybrid_store.save(value=text, metadata=meta)

time.sleep(2)

Compare the two side by side:

query = "Go cloud-native microservices"

print("Dense only:")
for r in store.search(query, limit=3):
    print(f"  [{r['score']:.3f}] {r['metadata'].get('lang')}: {r['content'][:60]}")

print("\nHybrid (balanced):")
for r in hybrid_store.search(query, limit=3):
    print(f"  [{r['score']:.3f}] {r['metadata'].get('lang')}: {r['content'][:60]}")

print("\nHybrid (favour meaning over keywords):")
for r in hybrid_store.search(query, limit=3, dense_rrf_weight=0.8, rrf_rank_constant=60):
    print(f"  [{r['score']:.3f}] {r['metadata'].get('lang')}: {r['content'][:60]}")

Tune the balance with dense_rrf_weight:

| Value | Behaviour |
| --- | --- |
| 0.0 | Keywords only (BM25) |
| 0.5 | Balanced (default) |
| 1.0 | Meaning only (dense) |
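The parameter names dense_rrf_weight and rrf_rank_constant suggest reciprocal rank fusion (RRF). The sketch below shows the textbook weighted form of RRF, assuming the dense ranking gets weight w and the BM25 ranking gets 1 - w, which matches the table above. It is an illustration of the technique, not Endee's internal implementation.

```python
def weighted_rrf(dense_ranked, sparse_ranked, dense_weight=0.5, rank_constant=60):
    """Fuse two ranked lists of document IDs with weighted reciprocal rank fusion."""
    scores = {}
    weighted_lists = (
        (dense_weight, dense_ranked),
        (1.0 - dense_weight, sparse_ranked),
    )
    for weight, ranked in weighted_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes weight / (k + rank) for the documents it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank_constant + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


dense = ["go", "rust", "java"]      # ranked by meaning
sparse = ["java", "go", "python"]   # ranked by keyword overlap

for w in (0.0, 0.5, 1.0):
    print(w, weighted_rrf(dense, sparse, dense_weight=w))
```

A larger rank constant flattens the difference between adjacent ranks, so position in each list matters less and appearing in both lists matters more.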

A hybrid store and a dense-only store are separate. You can’t switch between modes after creation - create a new store if you need to change.


Search with metadata filters

Narrow search results to a specific subset using filters:

# Only search documents tagged as static-typed languages
results = store.search(
    query="high performance language",
    limit=3,
    filter=[{"typing": {"$eq": "static"}}],
)

# Only dynamic-typed languages, and only if they're a strong match
results = store.search(
    query="web development",
    limit=3,
    filter=[{"typing": {"$eq": "dynamic"}}],
    score_threshold=0.2,  # ignore results below this similarity score
)

Supported operators

| Operator | Description | Example |
| --- | --- | --- |
| $eq | Equals | {"typing": {"$eq": "static"}} |
| $in | One of a list | {"lang": {"$in": ["Python", "Go"]}} |
| $range | Between two values | {"year": {"$range": [1990, 2000]}} |

Filters run during the search itself (not after), so they’re fast even on large datasets.
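To make the operator semantics concrete, here is a plain-Python sketch of how a filter list could be evaluated against one document's metadata. Two points are assumptions rather than documented behaviour: that every clause in the list must match (logical AND), and that $range includes both endpoints. The function name matches is made up for this example.

```python
def matches(metadata, filters):
    """Return True if `metadata` satisfies every clause in `filters`."""
    for clause in filters:
        for field, condition in clause.items():
            value = metadata.get(field)
            for op, operand in condition.items():
                if op == "$eq":
                    ok = value == operand
                elif op == "$in":
                    ok = value in operand
                elif op == "$range":
                    low, high = operand
                    # Assumed inclusive on both ends.
                    ok = value is not None and low <= value <= high
                else:
                    raise ValueError(f"Unsupported operator: {op}")
                if not ok:
                    return False
    return True


python_meta = {"lang": "Python", "year": 1991, "typing": "dynamic"}
print(matches(python_meta, [{"typing": {"$eq": "dynamic"}}]))
print(matches(python_meta, [{"year": {"$range": [1995, 2000]}}]))
```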

Get the raw vector for a result:

results = store.search("interpreted language", limit=1, include_vectors=True)
vec = results[0]["vector"]  # the raw embedding numbers
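Raw vectors are useful when you want to compare documents yourself. The sketch below computes cosine similarity between two vectors in plain Python, assuming the returned vector is a flat list of floats. The function is written for this example and is not part of crewai_endee.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as no match
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Passing the vectors of two search results to cosine_similarity gives a direct document-to-document comparison, independent of any query.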

Managing documents

Fetch, update, and delete specific documents by ID:

# Add a document
store.save(
    value="Rust is a systems language focused on memory safety, designed by Mozilla.",
    metadata={"lang": "Rust", "typing": "static"},
)
time.sleep(2)

# Find it by searching
rust_results = store.search("Rust memory safety", limit=1)
if rust_results:
    vec_id = rust_results[0]["id"]

    # Look it up directly by ID
    vec_data = store.get_vector(vec_id)
    print(f"Fetched: {vec_data.get('meta', {})}")

    # Update its metadata tags (without re-embedding the text)
    store.update_filters([{"id": vec_id, "filter": {"reviewed": "true"}}])

    # Delete it
    store.delete_vector(vec_id)

# Delete many documents at once using a filter
store.save(value="Temp doc 1", metadata={"category": "throwaway"})
store.save(value="Temp doc 2", metadata={"category": "throwaway"})
time.sleep(2)
store.delete(filter=[{"category": {"$eq": "throwaway"}}])

Cleanup

Delete the demo stores when you’re done:

for s, name in [
    (store, "demo_dense"),
    (hybrid_store, "demo_hybrid"),
]:
    try:
        s.reset()
        print(f"Deleted: {name}")
    except Exception as e:
        print(f"Could not delete {name}: {e}")
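If you create throwaway stores often, wrapping the cleanup in a context manager keeps it from being forgotten when an experiment fails partway through. This is a sketch built only on the reset() method shown in this guide. temporary_store is a hypothetical helper name.

```python
from contextlib import contextmanager


@contextmanager
def temporary_store(store):
    """Yield `store` and call its reset() on exit, even if the body raises."""
    try:
        yield store
    finally:
        store.reset()
```

Used as with temporary_store(store) as s: ..., the store is reset when the block ends. Unlike the loop above, this version lets an error from reset() propagate instead of printing it.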

Configuration

| Parameter | Description |
| --- | --- |
| type | The name of your store. Must be unique per project. |
| embedder_config | Which model to use to turn text into numbers. |
| api_token | Your cloud token. Leave empty for local. |
| space_type | How to measure similarity. "cosine" works well for most use cases. |
| precision | How to compress stored vectors. "int8" saves space with minimal quality loss. |
| ef_con | Build quality vs. speed trade-off for the index. Higher = better quality, slower build. |
| sparse_model_name | Pass "endee_bm25" to enable hybrid search. |
| base_url | Override the server address. Only used in local mode. |

API reference

| Method | Description |
| --- | --- |
| EndeeVectorStore(...) | Create a memory store |
| store.ensure_index() | Create a fresh, empty store |
| store.save(value, metadata) | Add a document |
| store.search(query, limit) | Find the most relevant documents |
| store.get_vector(id) | Fetch a document by ID |
| store.update_filters([...]) | Update metadata tags without re-embedding |
| store.delete_vector(id) | Delete one document |
| store.delete(filter=[...]) | Delete documents matching a filter |
| store.describe() | View store stats |
| store.reset() | Delete everything and start fresh |