# dspy-skills: dspy-optimize-anything

Universal text-artifact optimizer using GEPA's optimize_anything API for code, prompts, agent architectures, configs, and more.
```shell
git clone https://github.com/OmidZamani/dspy-skills

# Or install just this skill into ~/.claude/skills:
T=$(mktemp -d) && git clone --depth=1 https://github.com/OmidZamani/dspy-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/dspy-optimize-anything" ~/.claude/skills/omidzamani-dspy-skills-dspy-optimize-anything && rm -rf "$T"
```
`skills/dspy-optimize-anything/SKILL.md`

## GEPA optimize_anything

## Goal
Optimize any artifact representable as text — code, prompts, agent architectures, vector graphics, configurations — using a single declarative API powered by GEPA's reflective evolutionary search.
## When to Use
- Beyond prompt optimization — optimizing code, configs, SVGs, scheduling policies, etc.
- Single hard problems — circle packing, kernel generation, algorithm discovery
- Batch related problems — CUDA kernels, code generation tasks with cross-transfer
- Generalization — agent skills, policies, or prompts that must transfer to unseen inputs
- When you can express quality as a score and provide diagnostic feedback (ASI)
## Inputs

| Input | Type | Description |
|---|---|---|
| `seed_candidate` | `str` or `dict` | Starting artifact text; omit for seedless mode |
| `evaluator` | callable | Returns a score (higher = better), optionally with an ASI dict |
| `dataset` | list | Training examples (for multi-task and generalization modes) |
| `valset` | list | Validation set (for generalization mode) |
| `objective` | `str` | Natural language description of what to optimize for |
| `background` | `str` | Domain knowledge and constraints |
| *(additional config)* | varies | Engine, reflection, and tracking settings |
## Outputs

| Output | Type | Description |
|---|---|---|
| `result.best_candidate` | `str` or `dict` | Best optimized artifact |
## Workflow
### Phase 1: Install

```shell
pip install gepa
```
### Phase 2: Define Evaluator with ASI
The evaluator scores a candidate and returns Actionable Side Information (ASI) — diagnostic feedback that guides the LLM proposer during reflection.
Simple evaluator (score only):

```python
import gepa.optimize_anything as oa

def evaluate(candidate: str) -> float:
    score, diagnostic = run_my_system(candidate)
    oa.log(f"Error: {diagnostic}")  # captured as ASI
    return score
```
Rich evaluator (score + structured ASI):

```python
def evaluate(candidate: str) -> tuple[float, dict]:
    result = execute_code(candidate)
    return result.score, {
        "Error": result.stderr,
        "Output": result.stdout,
        "Runtime": f"{result.time_ms:.1f}ms",
    }
```
ASI can include open-ended text, structured data, multiple objectives (via `scores`), or images (via `gepa.Image`) for vision-capable LLMs.
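As a hedged sketch of the multi-objective case: an evaluator can return several named sub-scores alongside the aggregate. The `"scores"` key convention follows the sentence above, but whether the backend treats it specially is an assumption; the toy scoring checks here are purely illustrative.

```python
# Hypothetical multi-objective evaluator: aggregate score plus named
# sub-scores as ASI. The "scores" key convention is an assumption.
def evaluate(candidate: str) -> tuple[float, dict]:
    correctness = 1.0 if "return" in candidate else 0.0  # toy check
    brevity = 1.0 / (1 + len(candidate) / 100)           # shorter is better
    aggregate = 0.7 * correctness + 0.3 * brevity
    return aggregate, {
        "scores": {"correctness": correctness, "brevity": brevity},
        "Feedback": "Weight correctness over brevity when revising.",
    }
```

The named sub-scores give the reflective proposer a per-objective breakdown instead of a single opaque number.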
### Phase 3: Choose Optimization Mode
Mode 1 — Single-Task Search: Solve one hard problem. No dataset needed.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
)
```
Mode 2 — Multi-Task Search: Solve a batch of related problems with cross-transfer.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=tasks,
)
```
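In the dataset-driven modes, the evaluator is called once per example and receives it as a second argument, as the production examples later in this document show. A minimal sketch, with a hypothetical string-reversal task set (the task format and scoring are illustrative, not part of the API):

```python
# Illustrative task set: each example pairs an input with its expected output.
tasks = [
    {"input": "abc", "expected": "cba"},
    {"input": "hello", "expected": "olleh"},
]

def evaluate(candidate: str, example: dict) -> tuple[float, dict]:
    # The candidate is Python source that should define reverse().
    exec_globals = {}
    exec(candidate, exec_globals)
    prediction = exec_globals["reverse"](example["input"])
    correct = prediction == example["expected"]
    return (1.0 if correct else 0.0), {"Prediction": prediction}
```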
Mode 3 — Generalization: Build a skill/prompt/policy that transfers to unseen problems.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=train,
    valset=val,
)
```
Seedless mode: Describe what you need instead of providing a seed.

```python
result = oa.optimize_anything(
    evaluator=evaluate,
    objective="Generate a Python function `reverse()` that reverses a string.",
)
```
### Phase 4: Use Results

```python
print(result.best_candidate)
```
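A common follow-up is persisting the winning artifact to disk. A sketch, assuming the candidate is a string (dict-seeded runs return a dict, so index the relevant key first); the stand-in value below is hypothetical:

```python
import tempfile
from pathlib import Path

# Stand-in for result.best_candidate; assumes a string-seeded run.
best = "def reverse(s):\n    return s[::-1]"

out = Path(tempfile.mkdtemp()) / "best_candidate.py"
out.write_text(best)
```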
## Production Example

```python
import gepa.optimize_anything as oa
from gepa import Image
import logging

logger = logging.getLogger(__name__)

# ---------- SVG optimization with VLM feedback ----------
GOAL = "a pelican riding a bicycle"
VLM = "vertex_ai/gemini-3-flash-preview"
VISUAL_ASPECTS = [
    {"id": "overall", "criteria": f"Rate overall quality of this SVG ({GOAL}). SCORE: X/10"},
    {"id": "anatomy", "criteria": "Rate pelican accuracy: beak, pouch, plumage. SCORE: X/10"},
    {"id": "bicycle", "criteria": "Rate bicycle: wheels, frame, handlebars, pedals. SCORE: X/10"},
    {"id": "composition", "criteria": "Rate how convincingly the pelican rides the bicycle. SCORE: X/10"},
]

def evaluate(candidate, example):
    """Render SVG, score with a VLM, return (score, ASI)."""
    image = render_image(candidate["svg_code"])  # via cairosvg
    score, feedback = get_vlm_score_feedback(VLM, image, example["criteria"])
    return score, {
        "RenderedSVG": Image(base64_data=image, media_type="image/png"),
        "Feedback": feedback,
    }

result = oa.optimize_anything(
    seed_candidate={"svg_code": "<svg>...</svg>"},
    evaluator=evaluate,
    dataset=VISUAL_ASPECTS,
    background=f"Optimize SVG source code depicting '{GOAL}'. "
    "Improve anatomy, composition, and visual quality.",
)
logger.info(f"Best SVG:\n{result.best_candidate['svg_code']}")

# ---------- Code optimization (single-task) ----------
def evaluate_solver(candidate: str) -> tuple[float, dict]:
    """Evaluate a Python solver for a mathematical optimization problem."""
    import subprocess, json

    proc = subprocess.run(
        ["python", "-c", candidate],
        capture_output=True, text=True, timeout=30,
    )
    if proc.returncode != 0:
        oa.log(f"Runtime error: {proc.stderr}")
        return 0.0, {"Error": proc.stderr}
    try:
        output = json.loads(proc.stdout)
        return output["score"], {
            "Output": output.get("solution"),
            "Runtime": f"{output.get('time_ms', 0):.1f}ms",
        }
    except (json.JSONDecodeError, KeyError) as e:
        oa.log(f"Parse error: {e}")
        return 0.0, {"Error": str(e), "Stdout": proc.stdout}

result = oa.optimize_anything(
    evaluator=evaluate_solver,
    objective="Write a Python solver for the bin packing problem that "
    "minimizes the number of bins. Output JSON with 'score' and 'solution'.",
    background="Use first-fit-decreasing as a starting heuristic. "
    "Higher score = fewer bins used.",
)
print(result.best_candidate)

# ---------- Agent architecture generalization ----------
def evaluate_agent(candidate: str, example: dict) -> tuple[float, dict]:
    """Run an agent architecture on a task and score it."""
    exec_globals = {}
    exec(candidate, exec_globals)
    agent_fn = exec_globals.get("solve")
    if agent_fn is None:
        return 0.0, {"Error": "No `solve` function defined"}
    try:
        prediction = agent_fn(example["input"])
        correct = prediction == example["expected"]
        score = 1.0 if correct else 0.0
        feedback = "Correct" if correct else (
            f"Expected '{example['expected']}', got '{prediction}'"
        )
        return score, {"Prediction": prediction, "Feedback": feedback}
    except Exception as e:
        return 0.0, {"Error": str(e)}

result = oa.optimize_anything(
    seed_candidate="def solve(input):\n    return input",
    evaluator=evaluate_agent,
    dataset=train_tasks,
    valset=val_tasks,
    background="Discover a Python agent function `solve(input)` that "
    "generalizes across unseen reasoning tasks.",
)
print(result.best_candidate)
```
## Integration with DSPy
`optimize_anything` complements DSPy's built-in optimizers. Use DSPy optimizers (GEPA, MIPROv2, BootstrapFewShot) for DSPy programs, and `optimize_anything` for arbitrary text artifacts outside DSPy:

```python
import dspy
import gepa.optimize_anything as oa

# DSPy program optimization (use dspy.GEPA)
optimizer = dspy.GEPA(
    metric=gepa_metric,
    reflection_lm=dspy.LM("openai/gpt-4o"),
    auto="medium",
)
compiled = optimizer.compile(agent, trainset=trainset)

# Non-DSPy artifact optimization (use optimize_anything)
result = oa.optimize_anything(
    seed_candidate=my_config_yaml,
    evaluator=eval_config,
    background="Optimize Kubernetes scheduling policy for cost.",
)
```
## Best Practices

- Rich ASI — The more diagnostic feedback you provide, the better the proposer can reason about improvements
- Use `oa.log()` — Route prints to the proposer as ASI instead of stdout
- Structured returns — Return `(score, dict)` tuples for multi-faceted diagnostics
- Seedless for exploration — Use `objective=` when the solution space is large and unfamiliar
- Background context — Provide domain knowledge via `background=` to constrain the search
- Generalization mode — Always provide `valset` when the artifact must transfer to unseen inputs
- Images as ASI — Use `gepa.Image` to pass rendered outputs to vision-capable LLMs
## Limitations

- Requires the `gepa` package (`pip install gepa`)
- Evaluator must be deterministic or low-variance for stable optimization
- Compute cost scales with the number of candidates explored
- Single-task mode does not generalize; use mode 3 with `valset` for transfer
- Currently powered by the GEPA backend; the API is backend-agnostic for future strategies