Agent-almanac chrysopoeia
git clone https://github.com/pjt222/agent-almanac
T=$(mktemp -d) && git clone --depth=1 https://github.com/pjt222/agent-almanac "$T" && mkdir -p ~/.claude/skills && cp -r "$T/i18n/caveman-ultra/skills/chrysopoeia" ~/.claude/skills/pjt222-agent-almanac-chrysopoeia-1e65d4 && rm -rf "$T"
i18n/caveman-ultra/skills/chrysopoeia/SKILL.md

Chrysopoeia
Pull max value from code → find gold (high-val), lead (heavy), dross (dead). Amplify gold, transmute lead, purge dross.
Use When
- Working code sluggish → optimize perf
- API surface crufty → refine
- Bundle/mem/startup too big → shrink
- Prep open-source release → extract core
- Code works but dull → polish, not rewrite
In
- Required: Codebase/module (paths)
- Required: Value metric (perf, API clarity, bundle, readability)
- Optional: Profiling data/benchmarks
- Optional: Target (e.g., "-40% bundle", "sub-100ms res")
- Optional: Constraints (public API frozen, back-compat req)
Do
Step 1: Assay — Classify
Classify every element by value.
- Define value metric from In
- Inventory elements (fns, modules, exports, deps)
- Classify each:
Value Classification:
+--------+---------------------------------------------------------+
| Gold   | High value, well-designed. Amplify and protect.         |
| Silver | Good value, minor imperfections. Polish.                |
| Lead   | Functional but heavy — poor performance, complex API.   |
|        | Transmute into something lighter.                       |
| Dross  | Dead code, unused exports, vestigial features.          |
|        | Remove entirely.                                        |
+--------+---------------------------------------------------------+
- Perf work → profile first:
- Hot paths (time sink)
- Cold paths (rare → maybe dross)
- Mem alloc patterns
- Produce Assay Report: element-by-element w/ evidence
→ Every element classified w/ evidence. Gold marked protect. Lead ranked by impact.
If err: No profiler → static analysis: cyclomatic complexity, dep count, size as proxies. Huge codebase → critical path first.
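Assay sketch (Python assumed; `workload` is an illustrative stand-in for the code under assay):

```python
import cProfile
import io
import pstats

def assay(workload, top=10):
    """Profile a representative workload; hottest fns = Gold/Lead candidates."""
    prof = cProfile.Profile()
    prof.enable()
    workload()                       # run the code under assay
    prof.disable()
    buf = io.StringIO()
    # Sort by cumulative time so hot paths float to the top of the report
    pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(top)
    return buf.getvalue()

# Illustrative workload: a deliberately heavy O(n^2) scan
def workload():
    data = list(range(2000))
    total = 0
    for x in data:
        for y in data:
            total += x * y
    return total

print(assay(workload))
```

Functions near the top of the report are hot paths; functions absent from a representative workload are cold-path or Dross candidates.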
Step 2: Refine — Amplify Gold
Protect + enhance highest-value elements.
- Each Gold:
- Full tests (most valuable asset)
- Clear interface docs
- Extractable as reusable module?
- Each Silver:
- Targeted improvements (naming, types, minor opt)
- Tests → Gold-level
- Resolve minor smells, no restructure
- Do NOT modify Gold/Silver behavior → polish only
→ Gold + Silver better tested, documented, protected. No behavior change, quality up.
If err: "Gold" reveals hidden problems → reclassify. Honest > protect flawed.
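Refine sketch (Python assumed; `slugify` is a hypothetical Gold element): characterization tests pin current behavior before any polish.

```python
def slugify(title: str) -> str:
    """Hypothetical Gold element: URL-safe slug from a title."""
    return "-".join(title.lower().split())

# Characterization tests: lock in behavior first, polish second
def test_slugify_basic():
    assert slugify("Hello World") == "hello-world"

def test_slugify_idempotent():
    # Polish must not change behavior: slug of a slug is stable
    assert slugify(slugify("Hello World")) == "hello-world"

test_slugify_basic()
test_slugify_idempotent()
```

Any later polish of a Gold element runs against these tests; a failure means behavior changed, which Step 2 forbids.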
Step 3: Transmute — Lead → Gold
Convert heavy elements to optimized equivalents.
- Rank Lead by impact (highest resource first)
- Each Lead → pick strategy:
- Algo opt: O(n^2) → O(n log n), kill redundant compute
- Cache/memoize: Store expensive results req'd repeatedly
- Lazy eval: Defer compute until needed
- Batch proc: Many small ops → fewer big ones
- Simplify: Lower cyclomatic, flatten nesting
- Apply + measure:
- Before/after benchmarks (perf)
- Before/after line counts (complexity)
- Before/after dep counts (coupling)
- Valid. behavior identical post-transmute
→ Measurable metric improvement. Each transmuted > Lead predecessor, same behavior.
If err: Lead resists opt in current interface → interface itself = problem. Sometimes transmute = change caller, not impl.
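Transmute sketch (Python assumed; toy fns illustrative): O(n^2) pairwise scan → O(n) seen-set, behavior validated against the Lead version.

```python
from itertools import combinations

def has_duplicate_lead(xs):
    """Lead: O(n^2) pairwise comparison."""
    return any(a == b for a, b in combinations(xs, 2))

def has_duplicate_gold(xs):
    """Transmuted: O(n) single pass with a seen-set."""
    seen = set()
    for x in xs:
        if x in seen:
            return True
        seen.add(x)
    return False

# Validate behavior identical post-transmute
for case in ([], [1], [1, 2, 3], [1, 2, 1], list(range(100)) + [0]):
    assert has_duplicate_lead(case) == has_duplicate_gold(case)
```

The equivalence loop is the point: every transmute ships with a before/after behavior check, not just a benchmark.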
Step 4: Purge — Remove Dross
Kill dead weight systematically.
- Each Dross → valid. truly unused:
- Grep all refs (IDE find-usages)
- Dynamic refs (string dispatch, reflection)?
- External consumers (library)?
- Remove confirmed:
- Delete dead code, unused exports, vestigial features
- Drop unused deps from manifests
- Clean config for removed features
- Valid. nothing breaks post-removal (tests)
- Doc what + why (commit msgs, not code)
→ Codebase lighter. Bundle/dep count/volume measurably down. Tests pass.
If err: Removal breaks → wasn't dross → reclassify. Dynamic refs hide usage → temp logging before delete to confirm no runtime access.
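Purge pre-check sketch (Python assumed; a crude grep proxy, hypothetical helper): count references before delete.

```python
from pathlib import Path

def reference_count(root: str, symbol: str) -> int:
    """Count textual references to a symbol across a source tree."""
    count = 0
    for path in Path(root).rglob("*.py"):
        count += path.read_text(errors="ignore").count(symbol)
    return count

# Dross candidate is safe to purge only if the sole reference is its
# own definition. Caveat: string dispatch / reflection hide runtime
# usage -- grep alone never proves dead; confirm w/ temp logging.
```

IDE find-usages beats raw text search (catches renames, scoping); this is the fallback when no IDE index exists.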
Step 5: Verify — Weigh Gold
Measure overall improvement.
- Run same benchmarks as Step 1
- Before/after on metric
- Doc results:
- Refined elements (Gold/Silver wins)
- Transmuted (Lead → Gold w/ measurements)
- Purged (Dross removed w/ size/count impact)
- Overall metric gain (e.g., "47% faster", "32% smaller bundle")
→ Measurable, documented metric improvement. Codebase demonstrably more valuable.
If err: Marginal improvement → orig code better than assumed. Doc learning → knowing code near-optimal = valuable.
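Weigh sketch (Python assumed; `lead`/`gold` are toy before/after fns): same benchmark, before vs after, behavior checked first.

```python
import timeit

def benchmark(fn, *args, number=5):
    """Best-of wall-clock time; small and noisy -- use real benches for claims."""
    return min(timeit.repeat(lambda: fn(*args), number=number, repeat=3))

def lead(n):
    return sum(i for i in range(n) if i % 2 == 0)   # before

def gold(n):
    return sum(range(0, n, 2))                      # after

assert lead(10**5) == gold(10**5)   # behavior identical pre-measurement
before = benchmark(lead, 10**5)
after = benchmark(gold, 10**5)
print(f"{(1 - after / before) * 100:.0f}% faster")
```

The printed percentage is the documented gain; if it rounds to ~0%, report that honestly per the If err note above.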
Check
- Assay report classifies all w/ evidence
- Gold has full tests + docs
- Lead transmutes show before/after metric gain
- Dross removal valid'd w/ ref checks pre-delete
- Tests pass each stage
- Overall improvement measured + documented
- No behavior regressions
- In constraints met
Traps
- Premature opt: Opt w/o profile → always measure first, opt hot paths
- Polish dross: Effort on code should-be-deleted → classify before refine
- Break Gold: Opt degrades best code → Gold only improves, never worse
- Unmeasured: "Feels faster" ≠ chrysopoeia → quantify every gain
- Opt cold paths: Effort on startup-once code when req loop = bottleneck
→
— Full four-stage when restructure needed, not just opt → athanor
— Targeted conversion when Lead needs paradigm shift → transmute
— Architecture-level eval → review-software-architecture
— Data pipeline opt parallels code opt → review-data-analysis