
Advanced Usage

This guide covers advanced features like custom retrievers, direct vector store querying, and fine-grained control over the retrieval process.

Custom Retriever Setup

Create a custom retriever for fine-grained control:

```python
from llama_index.core.retrievers import VectorIndexRetriever

# Create a retriever with custom parameters
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=3,  # Return the top 3 most similar results
    filters=ai_filters,  # Use metadata filters
)

# Retrieve nodes for a query
nodes = retriever.retrieve("What is deep learning?")

print(f"Retrieved {len(nodes)} nodes")
for i, node in enumerate(nodes):
    print(f"\nNode {i+1}:")
    print(f"Text: {node.node.text}")
    print(f"Metadata: {node.node.metadata}")
    print(f"Score: {node.score:.4f}")
```

Example Output:

```
Retrieved 2 nodes

Node 1:
Text: Deep learning is part of a broader family of machine learning methods...
Metadata: {'category': 'ai', 'field': 'deep_learning', 'difficulty': 'advanced'}
Score: 0.8934

Node 2:
Text: Machine learning is a subset of artificial intelligence...
Metadata: {'category': 'ai', 'field': 'machine_learning', 'difficulty': 'advanced'}
Score: 0.7821
```

Using a Custom Retriever with Query Engine

Combine your custom retriever with a query engine for enhanced control:

```python
from llama_index.core.query_engine import RetrieverQueryEngine

# Create a query engine with our custom retriever
custom_query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    verbose=True,  # Enable verbose mode to see the retrieved nodes
)

# Query using the custom retriever query engine
response = custom_query_engine.query(
    "Explain the difference between machine learning and deep learning"
)

print("\nFinal Response:")
print(response)
```

Direct VectorStore Querying

Query the Endee vector store directly, bypassing the LlamaIndex query engine:

```python
from llama_index.core.vector_stores.types import (
    VectorStoreQuery,
    MetadataFilters,
    MetadataFilter,
    FilterOperator,
)

# Generate an embedding for our query
query_text = "What are vector databases?"
query_embedding = embed_model.get_text_embedding(query_text)

# Create a VectorStoreQuery
vector_store_query = VectorStoreQuery(
    query_embedding=query_embedding,
    similarity_top_k=2,
    filters=MetadataFilters(
        filters=[
            MetadataFilter(
                key="category",
                value="database",
                operator=FilterOperator.EQ,
            )
        ]
    ),
)

# Execute the query directly on the vector store
query_result = vector_store.query(vector_store_query)

print(f"Direct VectorStore query: '{query_text}'")
print(f"Retrieved {len(query_result.nodes)} results with database category filter:")
for i, (node, score) in enumerate(zip(query_result.nodes, query_result.similarities)):
    print(f"\nResult {i+1}:")
    print(f"Text: {node.text}")
    print(f"Metadata: {node.metadata}")
    print(f"Similarity score: {score:.4f}")
```

Tip: Direct querying is useful when you need raw results without LLM processing.

Custom Retriever Parameters

| Parameter | Description |
| --- | --- |
| `index` | The `VectorStoreIndex` to retrieve from |
| `similarity_top_k` | Number of top results to return |
| `filters` | `MetadataFilters` for filtering results |
| `node_ids` | Optional list of specific node IDs to retrieve |
| `doc_ids` | Optional list of specific document IDs to retrieve |

VectorStoreQuery Parameters

| Parameter | Description |
| --- | --- |
| `query_embedding` | The query vector (list of floats) |
| `similarity_top_k` | Number of results to return |
| `filters` | `MetadataFilters` for filtering |
| `node_ids` | Optional list of specific node IDs |
| `alpha` | Hybrid search alpha parameter |

Complete Example

Here’s a complete example combining all the advanced features:

```python
import os

from endee_llamaindex import EndeeVectorStore
from llama_index.core import VectorStoreIndex, StorageContext, Document
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.vector_stores.types import (
    VectorStoreQuery,
    MetadataFilters,
    MetadataFilter,
    FilterOperator,
)
from llama_index.embeddings.openai import OpenAIEmbedding

# Setup
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
endee_api_token = "your-endee-api-token"
embed_model = OpenAIEmbedding()

# Initialize Endee
vector_store = EndeeVectorStore.from_params(
    api_token=endee_api_token,
    index_name="advanced_demo",
    dimension=1536,
    space_type="cosine",
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create documents
documents = [
    Document(text="AI is transforming industries.", metadata={"category": "ai"}),
    Document(text="Databases store structured data.", metadata={"category": "database"}),
]

# Build index
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    embed_model=embed_model,
)

# Create custom retriever with filters
ai_filter = MetadataFilters(
    filters=[
        MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ)
    ]
)
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=5,
    filters=ai_filter,
)

# Create custom query engine
query_engine = RetrieverQueryEngine.from_args(retriever=retriever)

# Query
response = query_engine.query("Tell me about AI")
print(response)

# Direct vector store query
query_embedding = embed_model.get_text_embedding("database systems")
direct_query = VectorStoreQuery(
    query_embedding=query_embedding,
    similarity_top_k=3,
)
results = vector_store.query(direct_query)
for node, score in zip(results.nodes, results.similarities):
    print(f"Score: {score:.4f} - {node.text}")
```

API Reference

EndeeVectorStore Methods

| Method | Description |
| --- | --- |
| `from_params(...)` | Create a new `EndeeVectorStore` instance |
| `add(nodes)` | Add nodes to the vector store |
| `delete(ref_doc_id)` | Delete nodes by document ID |
| `query(query)` | Execute a `VectorStoreQuery` |

VectorIndexRetriever Methods

| Method | Description |
| --- | --- |
| `retrieve(query_str)` | Retrieve nodes matching the query |
| `aretrieve(query_str)` | Async version of `retrieve` |