Babysitter rag-hybrid-search
Hybrid search combining semantic and keyword retrieval for RAG pipelines. Implement BM25 + dense vector search with fusion strategies.
install
source · Clone the upstream repo
git clone https://github.com/a5c-ai/babysitter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/ai-agents-conversational/skills/rag-hybrid-search" ~/.claude/skills/a5c-ai-babysitter-rag-hybrid-search && rm -rf "$T"
manifest:
library/specializations/ai-agents-conversational/skills/rag-hybrid-search/SKILL.md
rag-hybrid-search
Implement hybrid search combining semantic vector retrieval with keyword-based BM25 search for improved RAG pipeline accuracy and recall.
Overview
Hybrid search addresses the limitations of pure semantic or pure keyword search:
- Semantic search excels at conceptual similarity but may miss exact matches
- Keyword search finds exact terms but lacks semantic understanding
- Hybrid combines both for superior retrieval performance
Capabilities
Search Strategies
- Dense vector semantic search (embeddings)
- Sparse vector keyword search (BM25, TF-IDF)
- Hybrid fusion with configurable weighting
- Reciprocal Rank Fusion (RRF) combination
Retrieval Configuration
- Configure embedding models for dense search
- Tune BM25 parameters (k1, b values)
- Set retrieval limits and thresholds
- Apply metadata filtering
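To make the k1 and b parameters concrete, here is a minimal, self-contained sketch of the Okapi BM25 scoring formula (a toy implementation for illustration, not a production encoder): k1 controls term-frequency saturation, and b controls how strongly document length is normalized.

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with the BM25 formula.

    k1 controls term-frequency saturation; b controls length normalization
    (b=0 ignores document length, b=1 normalizes fully).
    """
    n = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n  # average document length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)
        norm = k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf * (k1 + 1) / (tf + norm)
    return score

# Tokenized toy corpus
corpus = [
    ["configure", "the", "system"],
    ["system", "settings", "and", "system", "options", "for", "the", "admin"],
    ["unrelated", "text"],
]
scores = [bm25_score(["system"], d, corpus) for d in corpus]
```

With the default b=0.75, the short document that mentions "system" once outranks the long one that mentions it twice; setting b=0 removes the length penalty and flips that order, which is the kind of behavior to check when tuning these values.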
Ranking & Reranking
- Score normalization across search types
- Weighted score fusion
- Cross-encoder reranking
- MMR (Maximum Marginal Relevance) diversity
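Score normalization matters because dense similarities (e.g. cosine in [0, 1]) and BM25 scores (unbounded) live on different scales. A minimal sketch of min-max normalization followed by weighted fusion, assuming each backend returns a `{doc_id: raw_score}` mapping (the data shape here is illustrative, not any particular client's API):

```python
def min_max_normalize(scores):
    """Scale raw scores into [0, 1] so results from different
    backends (cosine similarity vs. BM25) become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def weighted_fusion(dense_hits, sparse_hits, dense_weight=0.6):
    """Fuse two {doc_id: raw_score} result sets by normalized weighted sum."""
    fused = {}
    for hits, w in ((dense_hits, dense_weight), (sparse_hits, 1 - dense_weight)):
        ids = list(hits)
        norms = min_max_normalize([hits[i] for i in ids])
        for doc_id, norm in zip(ids, norms):
            fused[doc_id] = fused.get(doc_id, 0.0) + w * norm
    # Highest fused score first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Unlike RRF (shown under Usage), weighted fusion uses the magnitude of the scores rather than just the ranks, so it is more sensitive to the normalization choice.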
Index Management
- Create and update hybrid indexes
- Batch indexing with progress tracking
- Index optimization and maintenance
- Multi-index federation
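Batch indexing with progress tracking can be sketched store-agnostically; `upsert_fn` and `on_progress` below are placeholder hooks standing in for whatever bulk-write call and progress callback your vector store and pipeline actually provide:

```python
def batch_upsert(records, upsert_fn, batch_size=100, on_progress=None):
    """Index records in fixed-size batches, reporting progress after each.

    upsert_fn is whatever bulk write your vector store exposes
    (e.g. an index upsert); on_progress receives (done, total).
    """
    total = len(records)
    done = 0
    for start in range(0, total, batch_size):
        batch = records[start:start + batch_size]
        upsert_fn(batch)
        done += len(batch)
        if on_progress:
            on_progress(done, total)
    return done
```

Batching keeps individual requests small enough to retry cheaply on failure, and the progress callback makes long index builds observable.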
Usage
Basic Hybrid Search with LangChain
```python
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma
from langchain.retrievers import EnsembleRetriever
from langchain_openai import OpenAIEmbeddings

# Create documents
docs = [...]  # Your document chunks

# Dense retriever (semantic)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)
dense_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(docs)
bm25_retriever.k = 5

# Hybrid ensemble
hybrid_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, dense_retriever],
    weights=[0.4, 0.6]  # Adjust based on use case
)

# Query
results = hybrid_retriever.invoke("How do I configure the system?")
```
Reciprocal Rank Fusion
```python
def reciprocal_rank_fusion(results_lists: list, k: int = 60) -> list:
    """
    Combine multiple ranked lists using RRF.
    k is a constant (typically 60) for smoothing.
    """
    fused_scores = {}
    for results in results_lists:
        for rank, doc in enumerate(results):
            doc_id = doc.metadata.get("id", str(doc.page_content[:50]))
            if doc_id not in fused_scores:
                fused_scores[doc_id] = {"doc": doc, "score": 0}
            fused_scores[doc_id]["score"] += 1 / (k + rank + 1)

    # Sort by fused score
    sorted_docs = sorted(
        fused_scores.values(),
        key=lambda x: x["score"],
        reverse=True
    )
    return [item["doc"] for item in sorted_docs]

# Use with multiple retrievers
semantic_results = dense_retriever.invoke(query)
keyword_results = bm25_retriever.invoke(query)
hybrid_results = reciprocal_rank_fusion([semantic_results, keyword_results])
```
Pinecone Hybrid Search
```python
from pinecone import Pinecone
from pinecone_text.sparse import BM25Encoder
from langchain_openai import OpenAIEmbeddings

# Initialize Pinecone
pc = Pinecone(api_key="your-api-key")
index = pc.Index("hybrid-index")

# Dense embedding model
embeddings = OpenAIEmbeddings()

# Prepare sparse encoder
bm25 = BM25Encoder()
bm25.fit(corpus)  # Fit on your document corpus

def hybrid_query(query: str, alpha: float = 0.5, top_k: int = 10):
    """
    Query with hybrid search.
    alpha: weight for dense vectors (1 - alpha for sparse)
    """
    # Get dense embedding, scaled by alpha
    dense_embedding = [v * alpha for v in embeddings.embed_query(query)]

    # Get sparse embedding, scaled by (1 - alpha)
    sparse_embedding = bm25.encode_queries([query])[0]
    sparse_embedding["values"] = [
        v * (1 - alpha) for v in sparse_embedding["values"]
    ]

    # Hybrid query with weighted dense and sparse vectors
    results = index.query(
        vector=dense_embedding,
        sparse_vector=sparse_embedding,
        top_k=top_k,
        include_metadata=True
    )
    return results
```
Weaviate Hybrid Search
```python
import weaviate

client = weaviate.Client("http://localhost:8080")

def weaviate_hybrid_search(query: str, alpha: float = 0.5, limit: int = 10):
    """
    Weaviate native hybrid search.
    alpha: 0 = pure BM25, 1 = pure vector
    """
    result = (
        client.query
        .get("Document", ["content", "title", "metadata"])
        .with_hybrid(
            query=query,
            alpha=alpha,
            properties=["content", "title"]
        )
        .with_limit(limit)
        .do()
    )
    return result["data"]["Get"]["Document"]
```
Task Definition
```javascript
const ragHybridSearchTask = defineTask({
  name: 'rag-hybrid-search-setup',
  description: 'Configure hybrid search for RAG pipeline',
  inputs: {
    vectorStore: { type: 'string', required: true },  // 'pinecone', 'weaviate', 'chroma', etc.
    embeddingModel: { type: 'string', default: 'text-embedding-3-small' },
    bm25Params: { type: 'object', default: { k1: 1.5, b: 0.75 } },
    fusionStrategy: { type: 'string', default: 'rrf' },  // 'rrf', 'weighted', 'custom'
    denseWeight: { type: 'number', default: 0.6 },
    topK: { type: 'number', default: 10 }
  },
  outputs: {
    retrieverConfigured: { type: 'boolean' },
    indexStats: { type: 'object' },
    artifacts: { type: 'array' }
  },
  async run(inputs, taskCtx) {
    return {
      kind: 'skill',
      title: `Configure hybrid search with ${inputs.vectorStore}`,
      skill: {
        name: 'rag-hybrid-search',
        context: {
          vectorStore: inputs.vectorStore,
          embeddingModel: inputs.embeddingModel,
          bm25Params: inputs.bm25Params,
          fusionStrategy: inputs.fusionStrategy,
          denseWeight: inputs.denseWeight,
          topK: inputs.topK,
          instructions: [
            'Validate vector store connection and configuration',
            'Set up dense embedding pipeline',
            'Configure BM25/sparse encoding',
            'Implement fusion strategy',
            'Test retrieval quality with sample queries',
            'Document configuration and tuning parameters'
          ]
        }
      },
      io: {
        inputJsonPath: `tasks/${taskCtx.effectId}/input.json`,
        outputJsonPath: `tasks/${taskCtx.effectId}/result.json`
      }
    };
  }
});
```
Applicable Processes
- rag-pipeline-implementation
- advanced-rag-patterns
- knowledge-base-qa
- vector-database-setup
External Dependencies
- Vector database (Pinecone, Weaviate, Chroma, Milvus, Qdrant)
- Embedding provider (OpenAI, Cohere, Hugging Face)
- BM25 encoder (rank_bm25, pinecone-text)
References
- Claude Context (Zilliz) - Hybrid search MCP
- MCP Local RAG - Local-first RAG with hybrid search
- LangChain Anthropic MCP Server
- Pinecone Hybrid Search
- Weaviate Hybrid Search
Related Skills
- SK-RAG-001 rag-chunking-strategy
- SK-RAG-004 rag-reranking
- SK-RAG-005 rag-query-transformation
- SK-VDB-001 through SK-VDB-005 (vector database integrations)
Related Agents
- AG-RAG-001 rag-pipeline-architect
- AG-RAG-003 vector-db-specialist
- AG-RAG-004 retrieval-optimizer