RAGAdapter
gepa.adapters.generic_rag_adapter.generic_rag_adapter.GenericRAGAdapter(vector_store: VectorStoreInterface, llm_model, embedding_model: str = 'text-embedding-3-small', embedding_function=None, rag_config: dict[str, Any] | None = None, failure_score: float = 0.0)
Bases: GEPAAdapter[RAGDataInst, RAGTrajectory, RAGOutput]
Generic GEPA adapter for RAG system optimization with pluggable vector stores.
This adapter enables GEPA's evolutionary prompt optimization to work with any vector store implementation through the VectorStoreInterface. It provides comprehensive evaluation of both retrieval and generation quality.
Optimizable Components
- Query reformulation prompts: Improve query understanding and reformulation
- Context synthesis prompts: Optimize document combination and summarization
- Answer generation prompts: Enhance final answer quality and formatting
- Reranking criteria: Improve document relevance ordering
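As an illustration, a seed candidate can provide starting text for any subset of these components (the prompt strings below are placeholders, not defaults shipped with the adapter):

.. code-block:: python
# Hypothetical seed candidate: keys are the optimizable component names listed above.
seed_candidate = {
    "query_reformulation": "Rewrite the user question so it is specific and self-contained.",
    "context_synthesis": "Summarize the retrieved documents, keeping only details relevant to the question.",
    "answer_generation": "Answer the question using only the provided context.",
    "reranking_criteria": "Prefer documents that directly address the entities and constraints in the question.",
}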
Evaluation Metrics
- Retrieval Quality: Precision, recall, F1, mean reciprocal rank (MRR)
- Generation Quality: Token F1, BLEU score, faithfulness, answer relevance
- Combined Score: Weighted combination for overall system performance
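As a rough sketch, the combined score can be thought of as a weighted sum of the retrieval and generation scores, with weights taken from rag_config ("retrieval_weight", default 0.3; "generation_weight", default 0.7); the exact aggregation in the implementation may differ:

.. code-block:: python
# Illustrative only: assumes the combined score is a simple weighted sum of the
# retrieval and generation scores, using the weights documented for rag_config.
def combined_score(retrieval: float, generation: float,
                   retrieval_weight: float = 0.3, generation_weight: float = 0.7) -> float:
    return retrieval_weight * retrieval + generation_weight * generation

print(combined_score(0.80, 0.60))  # 0.66 with the default weights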
Vector Store Support
Works with any vector store implementing VectorStoreInterface, including: ChromaDB, Weaviate, Qdrant, Pinecone, Milvus, and custom implementations.
Example
.. code-block:: python
from gepa.adapters.generic_rag_adapter import GenericRAGAdapter, ChromaVectorStore
import gepa

vector_store = ChromaVectorStore.create_local("./kb", "docs")
adapter = GenericRAGAdapter(vector_store=vector_store, llm_model="gpt-4")

result = gepa.optimize(
    seed_candidate={"answer_generation": "Answer based on context:"},
    trainset=train_data,
    valset=val_data,
    adapter=adapter,
    max_metric_calls=50,
)
print(result.best_candidate)  # Optimized prompts
Initialize the GenericRAGAdapter for RAG system optimization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `vector_store` | `VectorStoreInterface` | Vector store implementation (ChromaDB, Weaviate, etc.). Must implement `VectorStoreInterface` for similarity search operations. | required |
| `llm_model` | | LLM client for text generation. Can be a string model name (uses litellm for inference), a callable that takes messages and returns response text, or any object with a callable interface for LLM inference. | required |
| `embedding_model` | `str` | Model name for text embeddings. Used when `embedding_function` is not provided. | `'text-embedding-3-small'` |
| `embedding_function` | | Optional custom embedding function that takes text and returns `List[float]`. If `None`, uses default litellm embeddings. | `None` |
| `rag_config` | `dict[str, Any] \| None` | RAG pipeline configuration dictionary. Keys include: `"retrieval_strategy"` ("similarity", "hybrid", or "vector"); `"top_k"`, number of documents to retrieve (default: 5); `"retrieval_weight"`, weight for retrieval in the combined score (default: 0.3); `"generation_weight"`, weight for generation in the combined score (default: 0.7); `"hybrid_alpha"`, semantic vs. keyword balance for hybrid search (default: 0.5); `"filters"`, default metadata filters for retrieval. | `None` |
| `failure_score` | `float` | Score assigned when evaluation fails. | `0.0` |
Example
.. code-block:: python
vector_store = WeaviateVectorStore.create_local(collection_name="docs")
adapter = GenericRAGAdapter(
vector_store=vector_store,
llm_model="gpt-4",
rag_config={
"retrieval_strategy": "hybrid",
"top_k": 5,
"hybrid_alpha": 0.7
}
)
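As noted in the parameter table, llm_model and embedding_function may also be callables. A minimal sketch with stub callables (replace the stub bodies with calls to your own inference and embedding clients):

.. code-block:: python
# Illustrative stubs: llm_model is any callable taking chat-style messages and
# returning response text; embedding_function takes a text and returns List[float].
def my_llm(messages) -> str:
    return "stub answer for: " + messages[-1]["content"]  # replace with a real LLM call

def my_embedder(text: str) -> list[float]:
    return [float(len(text) % 7)] * 384  # replace with a real embedding call

adapter = GenericRAGAdapter(
    vector_store=vector_store,  # any VectorStoreInterface implementation
    llm_model=my_llm,
    embedding_function=my_embedder,
)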
Source code in gepa/adapters/generic_rag_adapter/generic_rag_adapter.py
Attributes
- vector_store = vector_store (instance attribute)
- rag_pipeline = RAGPipeline(vector_store=vector_store, llm_client=llm_model, embedding_model=embedding_model, embedding_function=embedding_function) (instance attribute)
- evaluator = RAGEvaluationMetrics() (instance attribute)
- config = rag_config or self._default_config() (instance attribute)
- failure_score = failure_score (instance attribute)
- propose_new_texts: ProposalFn | None = None (class attribute, instance attribute)
Functions
evaluate(batch: list[RAGDataInst], candidate: dict[str, str], capture_traces: bool = False) -> EvaluationBatch[RAGTrajectory, RAGOutput]
Evaluate RAG system performance on a batch of query-answer examples.
This method runs the complete RAG pipeline on each example in the batch, evaluating both retrieval and generation quality using the provided prompt components.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `batch` | `list[RAGDataInst]` | List of RAG evaluation examples, each containing: `query`, the question to answer; `ground_truth_answer`, the expected correct answer; `relevant_doc_ids`, the documents that should be retrieved; and `metadata`, additional context for evaluation. | required |
| `candidate` | `dict[str, str]` | Dictionary mapping prompt component names to their text. Supported components: `"query_reformulation"` (prompt for improving user queries), `"context_synthesis"` (prompt for combining retrieved documents), `"answer_generation"` (prompt for generating final answers), and `"reranking_criteria"` (criteria for reordering retrieved documents). | required |
| `capture_traces` | `bool` | If `True`, capture detailed execution trajectories for each example. Required for reflective dataset generation but increases memory usage. | `False` |
Returns:
| Type | Description |
|---|---|
| `EvaluationBatch[RAGTrajectory, RAGOutput]` | `EvaluationBatch` containing per-example scores and outputs, plus execution trajectories when `capture_traces=True` (otherwise trajectories are `None`). |
Raises:
| Type | Description |
|---|---|
| `Exception` | Individual example failures are caught and assigned `failure_score`. Only systemic failures (e.g., vector store unavailable) raise exceptions. |
Example
.. code-block:: python
prompts = {
"answer_generation": "Answer the question based on this context:"
}
result = adapter.evaluate(
batch=validation_data,
candidate=prompts,
capture_traces=True
)
avg_score = sum(result.scores) / len(result.scores)
print(f"Average RAG performance: {avg_score:.3f}")
Source code in gepa/adapters/generic_rag_adapter/generic_rag_adapter.py
make_reflective_dataset(candidate: dict[str, str], eval_batch: EvaluationBatch[RAGTrajectory, RAGOutput], components_to_update: list[str]) -> dict[str, list[dict[str, Any]]]
Generate reflective dataset for evolutionary prompt optimization.
This method analyzes the evaluation results and creates training examples that GEPA's proposer can use to improve the specified prompt components. Each component gets a tailored dataset with input-output pairs and feedback.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `candidate` | `dict[str, str]` | Current prompt components that were evaluated. | required |
| `eval_batch` | `EvaluationBatch[RAGTrajectory, RAGOutput]` | Evaluation results from `evaluate()` with `capture_traces=True`. Must contain trajectories for analysis. | required |
| `components_to_update` | `list[str]` | List of component names to generate improvement suggestions for. Must be a subset of `candidate.keys()`. | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, list[dict[str, Any]]]` | Dictionary mapping component names to their reflective datasets. Each dataset is a list of per-example records carrying structured feedback for improving that component (e.g., the `"Feedback"` field used in the example below). |
Example
.. code-block:: python
reflective_data = adapter.make_reflective_dataset(
candidate=current_prompts,
eval_batch=evaluation_results, # with trajectories
components_to_update=["answer_generation", "context_synthesis"]
)
print(reflective_data["answer_generation"][0]["Feedback"])
# Output: "The generated answer lacks specific details from the context..."
Note
This method requires eval_batch to have been created with capture_traces=True, otherwise trajectories will be None.
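For orientation, a hedged sketch of how the two methods fit together when driven manually (gepa.optimize performs this cycle internally and additionally proposes improved prompts):

.. code-block:: python
# Sketch of the evaluate -> reflect cycle that gepa.optimize drives internally.
candidate = {"answer_generation": "Answer the question based on this context:"}

# 1. Evaluate with traces so trajectories are available for reflection.
eval_batch = adapter.evaluate(batch=train_data, candidate=candidate, capture_traces=True)
print("mean score:", sum(eval_batch.scores) / len(eval_batch.scores))

# 2. Build per-component feedback that a proposer can use to rewrite the prompts.
reflective = adapter.make_reflective_dataset(
    candidate=candidate,
    eval_batch=eval_batch,
    components_to_update=["answer_generation"],
)
for example in reflective["answer_generation"][:3]:
    print(example["Feedback"])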