Endee + CrewAI
Use Endee as the memory backend for CrewAI agents - semantic search, metadata filtering, and hybrid retrieval with no OpenAI key required.
CrewAI lets you build AI agents that work together. By default, those agents store their memory using ChromaDB. This guide shows you how to swap ChromaDB for Endee using EndeeVectorStore - same interface, different backend.
Requirements
- Cloud: A free token from app.endee.io
- Local: Endee running from the GitHub repo
Installation
pip install crewai-endee sentence-transformers==3.0.0
pip install numpy==2.0.0
Pin sentence-transformers to 3.0.0 exactly. You’ll see a version conflict warning - that’s expected.
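Because the install relies on exact pins, it can help to confirm what actually got installed before going further. The sketch below is one way to do that with the standard library's importlib.metadata; the check_pins helper is a hypothetical name introduced here for illustration, not part of crewai-endee.

```python
from importlib import metadata

# Pins taken from the install commands above.
PINS = {"sentence-transformers": "3.0.0", "numpy": "2.0.0"}


def check_pins(pins, lookup=metadata.version):
    """Return {package: (expected, installed)} for every mismatch."""
    mismatches = {}
    for package, expected in pins.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    # Prints nothing when every pin matches the installed version.
    for package, (expected, installed) in check_pins(PINS).items():
        print(f"{package}: expected {expected}, found {installed}")
```

An empty result means both pins are in place; the version conflict warning from pip does not by itself indicate a mismatch.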
Imports
import os
import time
from getpass import getpass
from crewai_endee import EndeeVectorStore
Authentication
Local server
API_TOKEN = "ndd-auth-token" # same token set in NDD_AUTH_TOKEN on your server
BASE_URL = "http://127.0.0.1:8080/api/v1"
Endee Cloud
- Go to https://app.endee.io
- Create a token
- Paste it below:
API_TOKEN = "your-serverless-token"
BASE_URL = ""  # leave empty - Endee figures out the cloud URL
Creating a memory store
Create a memory store that your agents will read and write to:
embedder_config = {
    "provider": "sentence-transformer",
    "config": {
        "model_name": "all-MiniLM-L6-v2",  # runs locally, no API key needed
        "device": "cpu",  # change to "cuda" or "mps" for GPU
    },
}
store = EndeeVectorStore(
    type="demo_dense",  # a name for this memory store
    embedder_config=embedder_config,
    api_token=API_TOKEN,
    space_type="cosine",  # how similarity is measured
    precision="int8",  # memory compression level
    base_url=BASE_URL,
)
store.reset()  # wipe any old data with the same name
store.ensure_index()  # create a fresh, empty store
store.reset() deletes the old store and clears an internal cache. Without clearing the cache, Endee would still think the old store exists and throw errors when you save new data.
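To give a feel for what space_type="cosine" and precision="int8" mean, here is a minimal, library-free sketch. It assumes a simple symmetric scalar quantization scheme purely for illustration; Endee's actual compression may work differently, and the helper names are made up for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def quantize_int8(vec):
    """Symmetric scalar quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(x) for x in vec) / 127 or 1.0  # avoid a zero scale
    return [round(x / scale) for x in vec], scale


def dequantize(codes, scale):
    """Approximate the original floats from the int8 codes."""
    return [c * scale for c in codes]


# Two toy "embeddings" (real ones from all-MiniLM-L6-v2 have 384 dimensions).
a = [0.12, -0.48, 0.33, 0.80]
b = [0.10, -0.52, 0.29, 0.75]

exact = cosine(a, b)
approx = cosine(dequantize(*quantize_int8(a)), dequantize(*quantize_int8(b)))
print(f"float cosine: {exact:.4f}")
print(f"int8 cosine:  {approx:.4f}")
```

The two printed values should be very close, which is the intuition behind storing compressed vectors with little loss in ranking quality.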
Adding documents
Each document is a piece of text plus optional metadata tags. Agents search through these later.
documents = [
    (
        "Python is a high-level, interpreted language designed by Guido van Rossum in 1991. "
        "Typing: dynamic, strong. Uses: AI/ML, web, scripting.",
        {"lang": "Python", "year": 1991, "typing": "dynamic"},
    ),
    (
        "Java follows 'write once, run anywhere' on the JVM, designed by James Gosling in 1995. "
        "Typing: static, strong. Uses: enterprise, Android, backend.",
        {"lang": "Java", "year": 1995, "typing": "static"},
    ),
    # ... more documents
]
for text, meta in documents:
    store.save(value=text, metadata=meta)
The metadata fields (lang, year, typing) become filterable - you can search within a subset of your data.
Similarity search
Ask questions in plain English. Endee finds the most relevant documents by meaning.
time.sleep(2) # wait a moment - Endee indexes in the background
queries = [
"Who created the Go programming language?",
"Which languages use dynamic typing?",
"Languages suitable for cloud-native microservices",
]
for query in queries:
    results = store.search(query, limit=2)
    print(f"Query: '{query}'")
    for r in results:
        print(f" [{r['score']:.3f}] {r['content'][:80]}")
    print()
Each result includes:
- id - the document’s unique ID
- content - the original text
- metadata - the tags you attached
- score - how similar it is to your query (0 to 1, higher is better)
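The fixed time.sleep(2) before searching works for a demo, but because indexing happens in the background, polling can be more robust. The sketch below retries a search until it returns something or a timeout passes. wait_for_results is a hypothetical helper, and fake_search is only a stand-in for store.search so the example runs without a server.

```python
import time


def wait_for_results(search, query, timeout=10.0, interval=0.5, **kwargs):
    """Call search(query, **kwargs) until it returns results or timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        results = search(query, **kwargs)
        if results or time.monotonic() >= deadline:
            return results
        time.sleep(interval)


# Stand-in for store.search: returns nothing on the first two calls,
# then one result shaped like the fields listed above.
calls = {"n": 0}


def fake_search(query, limit=2):
    calls["n"] += 1
    if calls["n"] < 3:
        return []
    return [
        {
            "id": "doc-1",
            "content": "Python is a high-level, interpreted language",
            "metadata": {"lang": "Python"},
            "score": 0.71,
        }
    ]


results = wait_for_results(fake_search, "dynamic typing", interval=0.01, limit=2)
for r in results:
    print(f"[{r['score']:.3f}] {r['content'][:80]}")
```

With a real store you would pass store.search in place of fake_search. Note that this only detects that something matched; if the store already held older documents, it does not prove the newest ones are indexed.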
Hybrid search
Hybrid search combines two approaches:
- Dense search - finds results that mean the same thing
- BM25 keyword search - finds results that use the same words
To enable it, add sparse_model_name="endee_bm25" when creating the store:
hybrid_store = EndeeVectorStore(
    type="demo_hybrid",
    embedder_config=embedder_config,
    api_token=API_TOKEN,
    space_type="cosine",
    sparse_model_name="endee_bm25",  # ← this enables hybrid mode
    base_url=BASE_URL,
)
hybrid_store.reset()
hybrid_store.ensure_index()
for text, meta in documents:
    hybrid_store.save(value=text, metadata=meta)
time.sleep(2)
Compare the two side by side:
query = "Go cloud-native microservices"
print("Dense only:")
for r in store.search(query, limit=3):
    print(f" [{r['score']:.3f}] {r['metadata'].get('lang')}: {r['content'][:60]}")
print("\nHybrid (balanced):")
for r in hybrid_store.search(query, limit=3):
    print(f" [{r['score']:.3f}] {r['metadata'].get('lang')}: {r['content'][:60]}")
print("\nHybrid (favour meaning over keywords):")
for r in hybrid_store.search(query, limit=3, dense_rrf_weight=0.8, rrf_rank_constant=60):
    print(f" [{r['score']:.3f}] {r['metadata'].get('lang')}: {r['content'][:60]}")
Tune the balance with dense_rrf_weight:
| Value | Behaviour |
|---|---|
| 0.0 | Keywords only (BM25) |
| 0.5 | Balanced (default) |
| 1.0 | Meaning only (dense) |
A hybrid store and a dense-only store are separate. You can’t switch between modes after creation - create a new store if you need to change.
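The parameter names dense_rrf_weight and rrf_rank_constant suggest Reciprocal Rank Fusion (RRF). As a rough mental model, the sketch below fuses a dense ranking and a keyword ranking using the common weighted RRF formula, weight / (rank_constant + rank). This is an assumption about the general technique, not a description of Endee's exact implementation, and weighted_rrf is a name invented for this example.

```python
def weighted_rrf(dense_ranking, sparse_ranking, dense_weight=0.5, rank_constant=60):
    """Fuse two ranked lists of document IDs with weighted Reciprocal Rank Fusion."""
    scores = {}
    weighted_lists = (
        (dense_weight, dense_ranking),
        (1.0 - dense_weight, sparse_ranking),
    )
    for weight, ranking in weighted_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks near the top.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank_constant + rank)
    return sorted(scores, key=scores.get, reverse=True)


dense = ["go", "rust", "java"]     # toy ranking by embedding similarity
sparse = ["java", "go", "python"]  # toy ranking by BM25 keyword score

balanced = weighted_rrf(dense, sparse, dense_weight=0.5)
dense_only = weighted_rrf(dense, sparse, dense_weight=1.0)
keywords_only = weighted_rrf(dense, sparse, dense_weight=0.0)

print("balanced:     ", balanced)
print("dense only:   ", dense_only)
print("keywords only:", keywords_only)
```

At the extremes the fused order collapses to one of the input rankings, which matches the 0.0 and 1.0 rows in the table above. A larger rank constant flattens the difference between adjacent ranks.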
Search with metadata filters
Narrow search results to a specific subset using filters:
# Only search documents tagged as static-typed languages
results = store.search(
    query="high performance language",
    limit=3,
    filter=[{"typing": {"$eq": "static"}}],
)
# Only dynamic-typed languages, and only if they're a strong match
results = store.search(
    query="web development",
    limit=3,
    filter=[{"typing": {"$eq": "dynamic"}}],
    score_threshold=0.2,  # ignore results below this similarity score
)
Supported operators
| Operator | Description | Example |
|---|---|---|
| $eq | Equals | {"typing": {"$eq": "static"}} |
| $in | One of a list | {"lang": {"$in": ["Python", "Go"]}} |
| $range | Between two values | {"year": {"$range": [1990, 2000]}} |
Filters run during the search itself (not after), so they’re fast even on large datasets.
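To make the operator semantics concrete, here is a small pure-Python sketch that evaluates the same filter shape against a metadata dict. It rests on two assumptions that the text above does not confirm: that multiple clauses in the list combine with AND, and that $range includes both endpoints. The matches function is illustrative only; Endee evaluates filters server-side.

```python
def matches(metadata, filters):
    """Return True if metadata satisfies every clause in filters (AND assumed)."""
    for clause in filters:
        for field, condition in clause.items():
            value = metadata.get(field)
            for op, operand in condition.items():
                if op == "$eq":
                    ok = value == operand
                elif op == "$in":
                    ok = value in operand
                elif op == "$range":
                    low, high = operand
                    # Inclusive bounds are an assumption of this sketch.
                    ok = value is not None and low <= value <= high
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
    return True


python_meta = {"lang": "Python", "year": 1991, "typing": "dynamic"}
java_meta = {"lang": "Java", "year": 1995, "typing": "static"}
go_meta = {"lang": "Go", "year": 2009, "typing": "static"}  # extra sample entry

nineties = [{"year": {"$range": [1990, 2000]}}]
for meta in (python_meta, java_meta, go_meta):
    print(meta["lang"], matches(meta, nineties))
```

Running a filter locally like this can be a quick way to sanity-check which documents you expect a filtered search to consider.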
Get the raw vector for a result:
results = store.search("interpreted language", limit=1, include_vectors=True)
vec = results[0]["vector"]  # the raw embedding numbers
Managing documents
Fetch, update, and delete specific documents by ID:
# Add a document
store.save(
    value="Rust is a systems language focused on memory safety, designed by Mozilla.",
    metadata={"lang": "Rust", "typing": "static"},
)
time.sleep(2)
# Find it by searching
rust_results = store.search("Rust memory safety", limit=1)
if rust_results:
    vec_id = rust_results[0]["id"]
    # Look it up directly by ID
    vec_data = store.get_vector(vec_id)
    print(f"Fetched: {vec_data.get('meta', {})}")
    # Update its metadata tags (without re-embedding the text)
    store.update_filters([{"id": vec_id, "filter": {"reviewed": "true"}}])
    # Delete it
    store.delete_vector(vec_id)
# Delete many documents at once using a filter
store.save(value="Temp doc 1", metadata={"category": "throwaway"})
store.save(value="Temp doc 2", metadata={"category": "throwaway"})
time.sleep(2)
store.delete(filter=[{"category": {"$eq": "throwaway"}}])
Cleanup
Delete the demo stores when you’re done:
for s, name in [
    (store, "demo_dense"),
    (hybrid_store, "demo_hybrid"),
]:
    try:
        s.reset()
        print(f"Deleted: {name}")
    except Exception as e:
        print(f"Could not delete {name}: {e}")
Configuration
| Parameter | Description |
|---|---|
| type | The name of your store. Must be unique per project. |
| embedder_config | Which model to use to turn text into numbers. |
| api_token | Your cloud token. Leave empty for local. |
| space_type | How to measure similarity. "cosine" works well for most use cases. |
| precision | How to compress stored vectors. "int8" saves space with minimal quality loss. |
| ef_con | Build quality vs. speed trade-off for the index. Higher = better quality, slower build. |
| sparse_model_name | Pass "endee_bm25" to enable hybrid search. |
| base_url | Override the server address. Only used in local mode. |
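Rather than hard-coding api_token and base_url, you may prefer to read them from the environment so the same script runs against a local server or Endee Cloud. The sketch below shows one possible approach. The variable names ENDEE_API_TOKEN and ENDEE_BASE_URL and the endee_settings helper are hypothetical, chosen for this example and not defined by crewai-endee.

```python
import os


def endee_settings(env=os.environ):
    """Read connection settings from environment variables (names are hypothetical)."""
    return {
        "api_token": env.get("ENDEE_API_TOKEN", ""),
        # An empty base_url mirrors the cloud setup shown earlier.
        "base_url": env.get("ENDEE_BASE_URL", ""),
    }


# Plain dicts stand in for os.environ so the example is reproducible.
local = endee_settings(
    {
        "ENDEE_API_TOKEN": "ndd-auth-token",
        "ENDEE_BASE_URL": "http://127.0.0.1:8080/api/v1",
    }
)
cloud = endee_settings({"ENDEE_API_TOKEN": "your-serverless-token"})

print(local)
print(cloud)
```

The returned dict uses the same keyword names as the constructor calls in this guide, so it could be unpacked into EndeeVectorStore alongside the other arguments.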
API reference
| Method | Description |
|---|---|
| EndeeVectorStore(...) | Create a memory store |
| store.ensure_index() | Create a fresh, empty store |
| store.save(value, metadata) | Add a document |
| store.search(query, limit) | Find the most relevant documents |
| store.get_vector(id) | Fetch a document by ID |
| store.update_filters([...]) | Update metadata tags without re-embedding |
| store.delete_vector(id) | Delete one document |
| store.delete(filter=[...]) | Delete documents matching a filter |
| store.describe() | View store stats |
| store.reset() | Delete everything and start fresh |