Agent-almanac chrysopoeia
git clone https://github.com/pjt222/agent-almanac
T=$(mktemp -d) && git clone --depth=1 https://github.com/pjt222/agent-almanac "$T" && mkdir -p ~/.claude/skills && cp -r "$T/i18n/caveman-ultra/skills/chrysopoeia" ~/.claude/skills/pjt222-agent-almanac-chrysopoeia-1e65d4 && rm -rf "$T"
i18n/caveman-ultra/skills/chrysopoeia/SKILL.md

Chrysopoeia
Pull max value from code → find gold (high-val), lead (heavy), dross (dead). Amplify gold, transmute lead, purge dross.
Use When
- Working code sluggish → optimize perf
- API surface crufty → refine
- Bundle/mem/startup too big → shrink
- Prep open-source release → extract core
- Code works but dull → polish, not rewrite
In
- Required: Codebase/module (paths)
- Required: Value metric (perf, API clarity, bundle, readability)
- Optional: Profiling data/benchmarks
- Optional: Target (e.g., "-40% bundle", "sub-100ms res")
- Optional: Constraints (public API frozen, back-compat req)
Do
Step 1: Assay — Classify
Classify every element by value.
- Define value metric from In
- Inventory elements (fns, modules, exports, deps)
- Classify each:
Value Classification:
+--------+---------------------------------------------------------+
| Gold   | High value, well-designed. Amplify and protect.         |
| Silver | Good value, minor imperfections. Polish.                |
| Lead   | Functional but heavy — poor performance, complex API.   |
|        | Transmute into something lighter.                       |
| Dross  | Dead code, unused exports, vestigial features.          |
|        | Remove entirely.                                        |
+--------+---------------------------------------------------------+
- Perf work → profile first:
- Hot paths (time sink)
- Cold paths (rare → maybe dross)
- Mem alloc patterns
- Produce Assay Report: element-by-element w/ evidence
→ Every element classified w/ evidence. Gold marked protect. Lead ranked by impact.
If err: No profiler → static analysis: cyclomatic complexity, dep count, size as proxies. Huge codebase → critical path first.
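Assay sketch (Python assumed; `workload` is an illustrative stand-in for the code under assay):

```python
import cProfile
import io
import pstats

def assay(workload, top=10):
    """Profile a representative workload; hottest fns = Gold/Lead candidates."""
    prof = cProfile.Profile()
    prof.enable()
    workload()                       # run the code under assay
    prof.disable()
    buf = io.StringIO()
    # Sort by cumulative time so hot paths float to the top of the report
    pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(top)
    return buf.getvalue()

# Illustrative workload: a deliberately heavy O(n^2) scan
def workload():
    data = list(range(2000))
    total = 0
    for x in data:
        for y in data:
            total += x * y
    return total

print(assay(workload))
```

Functions near the top of the report are hot paths; functions absent from a representative workload are cold-path or Dross candidates.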
Step 2: Refine — Amplify Gold
Protect + enhance highest-value elements.
- Each Gold:
- Full tests (most valuable asset)
- Clear interface docs
- Extractable as reusable module?
- Each Silver:
- Targeted improvements (naming, types, minor opt)
- Tests → Gold-level
- Resolve minor smells, no restructure
- Do NOT modify Gold/Silver behavior → polish only
→ Gold + Silver better tested, documented, protected. No behavior change, quality up.
If err: "Gold" reveals hidden problems → reclassify. Honest > protect flawed.
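Refine sketch (Python assumed; `slugify` is a hypothetical Gold element): characterization tests pin current behavior before any polish.

```python
def slugify(title: str) -> str:
    """Hypothetical Gold element: URL-safe slug from a title."""
    return "-".join(title.lower().split())

# Characterization tests: lock in behavior first, polish second
def test_slugify_basic():
    assert slugify("Hello World") == "hello-world"

def test_slugify_idempotent():
    # Polish must not change behavior: slug of a slug is stable
    assert slugify(slugify("Hello World")) == "hello-world"

test_slugify_basic()
test_slugify_idempotent()
```

Any later polish of a Gold element runs against these tests; a failure means behavior changed, which Step 2 forbids.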
Step 3: Transmute — Lead → Gold
Convert heavy elements to optimized equivalents.
- Rank Lead by impact (highest resource first)
- Each Lead → pick strategy:
- Algo opt: O(n^2) → O(n log n), kill redundant compute
- Cache/memoize: Store expensive results req'd repeatedly
- Lazy eval: Defer compute until needed
- Batch proc: Many small ops → fewer big ones
- Simplify: Lower cyclomatic, flatten nesting
- Apply + measure:
- Before/after benchmarks (perf)
- Before/after line counts (complexity)
- Before/after dep counts (coupling)
- Valid. behavior identical post-transmute
→ Measurable metric improvement. Each transmuted > Lead predecessor, same behavior.
If err: Lead resists opt in current interface → interface itself = problem. Sometimes transmute = change caller, not impl.
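Transmute sketch (Python assumed; toy fns illustrative): O(n^2) pairwise scan → O(n) seen-set, behavior validated against the Lead version.

```python
from itertools import combinations

def has_duplicate_lead(xs):
    """Lead: O(n^2) pairwise comparison."""
    return any(a == b for a, b in combinations(xs, 2))

def has_duplicate_gold(xs):
    """Transmuted: O(n) single pass with a seen-set."""
    seen = set()
    for x in xs:
        if x in seen:
            return True
        seen.add(x)
    return False

# Validate behavior identical post-transmute
for case in ([], [1], [1, 2, 3], [1, 2, 1], list(range(100)) + [0]):
    assert has_duplicate_lead(case) == has_duplicate_gold(case)
```

The equivalence loop is the point: every transmute ships with a before/after behavior check, not just a benchmark.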
Step 4: Purge — Remove Dross
Kill dead weight systematically.
- Each Dross → valid. truly unused:
- Grep all refs (IDE find-usages)
- Dynamic refs (string dispatch, reflection)?
- External consumers (library)?
- Remove confirmed:
- Delete dead code, unused exports, vestigial features
- Drop unused deps from manifests
- Clean config for removed features
- Valid. nothing breaks post-removal (tests)
- Doc what + why (commit msgs, not code)
→ Codebase lighter. Bundle/dep count/volume measurably down. Tests pass.
If err: Removal breaks → wasn't dross → reclassify. Dynamic refs hide usage → temp logging before delete to confirm no runtime access.
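Purge pre-check sketch (Python assumed; a crude grep proxy, hypothetical helper): count references before delete.

```python
from pathlib import Path

def reference_count(root: str, symbol: str) -> int:
    """Count textual references to a symbol across a source tree."""
    count = 0
    for path in Path(root).rglob("*.py"):
        count += path.read_text(errors="ignore").count(symbol)
    return count

# Dross candidate is safe to purge only if the sole reference is its
# own definition. Caveat: string dispatch / reflection hide runtime
# usage -- grep alone never proves dead; confirm w/ temp logging.
```

IDE find-usages beats raw text search (catches renames, scoping); this is the fallback when no IDE index exists.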
Step 5: Verify — Weigh Gold
Measure overall improvement.
- Run same benchmarks as Step 1
- Before/after on metric
- Doc results:
- Refined elements (Gold/Silver wins)
- Transmuted (Lead → Gold w/ measurements)
- Purged (Dross removed w/ size/count impact)
- Overall metric gain (e.g., "47% faster", "32% smaller bundle")
→ Measurable, documented metric improvement. Codebase demonstrably more valuable.
If err: Marginal improvement → orig code better than assumed. Doc learning → knowing code near-optimal = valuable.
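Weigh sketch (Python assumed; `lead`/`gold` are toy before/after fns): same benchmark, before vs after, behavior checked first.

```python
import timeit

def benchmark(fn, *args, number=5):
    """Best-of wall-clock time; small and noisy -- use real benches for claims."""
    return min(timeit.repeat(lambda: fn(*args), number=number, repeat=3))

def lead(n):
    return sum(i for i in range(n) if i % 2 == 0)   # before

def gold(n):
    return sum(range(0, n, 2))                      # after

assert lead(10**5) == gold(10**5)   # behavior identical pre-measurement
before = benchmark(lead, 10**5)
after = benchmark(gold, 10**5)
print(f"{(1 - after / before) * 100:.0f}% faster")
```

The printed percentage is the documented gain; if it rounds to ~0%, report that honestly per the If err note above.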
Check
- Assay report classifies all w/ evidence
- Gold has full tests + docs
- Lead transmutes show before/after metric gain
- Dross removal valid'd w/ ref checks pre-delete
- Tests pass each stage
- Overall improvement measured + documented
- No behavior regressions
- In constraints met
Traps
- Premature opt: Opt w/o profile → always measure first, opt hot paths
- Polish dross: Effort on code should-be-deleted → classify before refine
- Break Gold: Opt degrades best code → Gold only improves, never worse
- Unmeasured: "Feels faster" ≠ chrysopoeia → quantify every gain
- Opt cold paths: Effort on startup-once code when req loop = bottleneck
→
— Full four-stage when restructure needed, not just opt → athanor
— Targeted conversion when Lead needs paradigm shift → transmute
— Architecture-level eval → review-software-architecture
— Data pipeline opt parallels code opt → review-data-analysis