Gbrain skillify
```shell
# Clone the full repo
git clone https://github.com/garrytan/gbrain

# Or copy just the skillify skill into your Claude skills directory
T=$(mktemp -d) && git clone --depth=1 https://github.com/garrytan/gbrain "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/skillify" ~/.claude/skills/garrytan-gbrain-skillify && rm -rf "$T"
```
skills/skillify/SKILL.md

Skillify — The Meta Skill
Contract
A feature is "properly skilled" when all ten checklist items are present:
- SKILL.md — skill file with YAML frontmatter, triggers, contract, phases.
- Code — deterministic script if applicable.
- Unit tests — cover every branch of deterministic logic.
- Integration tests — exercise live endpoints, not just in-memory shape.
- LLM evals — quality/correctness cases if the feature includes any LLM call.
- Resolver trigger — skills/RESOLVER.md entry with the trigger patterns the user actually types.
- Resolver trigger eval — test that feeds trigger phrases to the resolver and asserts they route to this skill, not the old pre-skillify path.
- Check-resolvable — gbrain check-resolvable passes (skill is reachable, MECE against its siblings, no DRY violations).
- E2E test — exercises the full pipeline from user turn to side effect.
- Brain filing — if the feature writes brain pages, brain/RESOLVER.md has an entry for the directory so the pages aren't orphaned.
Trigger
- "skillify this" / "skillify" / "is this a skill?" / "make this proper"
- "add tests and evals for this"
- After building any new feature that touches user-facing behavior
- When you grep the repo and notice a script with no SKILL.md next to it
Phases
Phase 1: Audit what exists
For the feature being skillified, answer:
- Feature name: what does it do in one line?
- Code path: where does the implementation live (file path)?
- Checklist status: run scripts/skillify-check.ts <path> (or write the 10-item checklist manually) and note which items are missing.
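The audit step can be sketched as a presence check per item. This is an illustrative reduction, not the real scripts/skillify-check.ts API — the item keys and the `found` evidence set are assumptions standing in for whatever filesystem and grep checks the real script performs:

```typescript
// Illustrative 10-item audit; the real scripts/skillify-check.ts may differ.
type ChecklistItem = { name: string; present: boolean };

const CHECKLIST = [
  "SKILL.md", "code", "unit-tests", "integration-tests", "llm-evals",
  "resolver-trigger", "resolver-trigger-eval", "check-resolvable",
  "e2e-test", "brain-filing",
];

// `found` stands in for whatever evidence-gathering the real script does.
function auditFeature(found: Set<string>): ChecklistItem[] {
  return CHECKLIST.map((name) => ({ name, present: found.has(name) }));
}

// Produce the N/10 summary that a skillify run reports at the end.
function summarize(items: ChecklistItem[]): string {
  const done = items.filter((i) => i.present).length;
  const missing = items.filter((i) => !i.present).map((i) => i.name);
  return missing.length
    ? `${done}/10 complete; missing: ${missing.join(", ")}`
    : "10/10 complete";
}
```

The audit output then drives Phase 2: each missing item becomes a work item, in checklist order.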
Phase 2: Create missing pieces in order
Work the list top-down. Each earlier item constrains what later items look like (the SKILL.md contract determines what tests assert; tests determine what evals gate; the resolver entry determines what trigger-eval checks).
- Write SKILL.md first. Frontmatter must include name, version, description, triggers[], tools[], mutating. Body has at minimum Contract, Phases, and Output Format sections.
- Extract deterministic code into a script if applicable (scripts/*.ts for gbrain; host projects may use .mjs / .py / whatever their runtime uses).
- Write unit tests for every branch of the script. Mock external calls (LLM, DB, network) so tests run fast and deterministically.
- Add integration tests that hit real endpoints. These catch bugs the unit tests' mocks hide (see the files-test-reimplements-production learning: reimplementation in tests lets production vulnerabilities slip through).
- Add LLM evals if the feature includes any LLM call. Even a three-case eval (happy / edge / adversarial) is cheap insurance against prompt regressions.
- Add the resolver trigger to skills/RESOLVER.md. Use the trigger patterns the user ACTUALLY types, not what you think they should type.
- Add a resolver trigger eval that feeds those patterns in and asserts they route to the new skill.
- Run gbrain check-resolvable. It validates reachability (is the skill mentioned from RESOLVER.md?), MECE overlap (does it duplicate an existing skill's trigger?), gap detection (are there user intents that fall through the resolver with no match?), and DRY. If it fails, fix the skill (or extend an existing one instead of creating a duplicate).
- Add an E2E smoke test. For gbrain: submit a Minion job or run a CLI invocation end-to-end against a fixture brain; assert side effects.
- Update brain/RESOLVER.md if the skill writes brain pages. Orphaned brain pages are worse than no brain pages.
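The mock-the-externals step above can be sketched like this. summarizeForBrain and its injected LLM dependency are hypothetical names, not gbrain APIs — the point is that injecting the LLM call makes every deterministic branch testable without a network:

```typescript
// Hypothetical feature function; the LLM call is injected as a dependency.
type LlmCall = (prompt: string) => string;

function summarizeForBrain(text: string, llm: LlmCall): string {
  const trimmed = text.trim();
  if (trimmed === "") return "";            // deterministic branch: nothing to do
  if (trimmed.length < 40) return trimmed;  // deterministic branch: too short to summarize
  return llm(`Summarize: ${trimmed}`);      // LLM branch: mocked in unit tests
}

// Unit tests swap in a canned response instead of a real model call.
const mockLlm: LlmCall = () => "canned summary";
```

Integration tests then pass a real LlmCall, which is exactly where the mock-hidden bugs surface.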
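A resolver-trigger eval might look like this sketch. The resolve function here is a stand-in for however the real resolver routes a user turn (the actual routing may be LLM-driven); the trigger patterns come from this skill's own Trigger section:

```typescript
// Stand-in resolver: first skill whose pattern appears in the user turn wins.
function resolve(userTurn: string, triggers: Record<string, string[]>): string | null {
  const turn = userTurn.toLowerCase();
  for (const [skill, patterns] of Object.entries(triggers)) {
    if (patterns.some((p) => turn.includes(p))) return skill;
  }
  return null;
}

// Patterns the user actually types, per the Trigger section above.
const TRIGGERS = {
  skillify: ["skillify", "is this a skill", "make this proper"],
};
```

The eval feeds real phrasings in and asserts they land on the new skill, and that unrelated turns do not.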
Phase 3: Verify
Run each of these and confirm green:
```shell
# Unit tests
bun test test/<skill-name>.test.ts

# Integration tests (when applicable)
bun run test:e2e

# Resolver reachability + MECE + DRY
gbrain check-resolvable

# Conformance tests (skill YAML + required sections)
bun test test/skills-conformance.test.ts
```
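The conformance check reduces to a required-fields validation. The field names below are the ones this document's contract lists; parsing the YAML header into an object is assumed to happen elsewhere:

```typescript
// Required SKILL.md frontmatter fields, per this skill's contract.
const REQUIRED_FIELDS = ["name", "version", "description", "triggers", "tools", "mutating"];

// Given an already-parsed frontmatter object, report which required fields are missing.
function missingFrontmatterFields(frontmatter: Record<string, unknown>): string[] {
  return REQUIRED_FIELDS.filter((field) => !(field in frontmatter));
}
```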
Quality gates
A feature is NOT properly skilled until:
- All tests pass (unit + integration + evals).
- It appears in skills/RESOLVER.md with accurate trigger patterns.
- The resolver trigger eval confirms patterns route to the new skill.
- gbrain check-resolvable shows no orphaned skills, no MECE overlaps, no DRY violations.
- If it writes brain pages, brain/RESOLVER.md has the directory.
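The MECE gate can be sketched as pairwise trigger-overlap detection — a simplification of whatever gbrain check-resolvable actually runs, shown only to make the failure mode concrete:

```typescript
// Report any trigger pattern claimed by two different skills: a routing ambiguity.
function findOverlaps(triggers: Record<string, string[]>): string[] {
  const owner = new Map<string, string>(); // pattern -> first skill that claimed it
  const overlaps: string[] = [];
  for (const [skill, patterns] of Object.entries(triggers)) {
    for (const p of patterns) {
      const prev = owner.get(p);
      if (prev && prev !== skill) overlaps.push(`"${p}" claimed by ${prev} and ${skill}`);
      else owner.set(p, skill);
    }
  }
  return overlaps;
}
```

An empty result means the trigger space is mutually exclusive; anything else is a skill to fix or merge.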
Anti-Patterns
- ❌ Code with no SKILL.md — invisible to the resolver; the agent will never run it.
- ❌ SKILL.md with no tests — untested contract; one prompt change regresses silently.
- ❌ Tests that reimplement production code — the reimplementation's bugs don't catch production's bugs (the files-test-reimplements-production lesson).
- ❌ Resolver entry that uses internal jargon the user never types — trigger patterns must mirror real user language.
- ❌ Feature that writes to brain without a brain/RESOLVER.md entry — orphaned pages the agent will never find.
- ❌ Deterministic logic in LLM space — should be a script.
- ❌ LLM judgment in deterministic space — should be an eval.
Why skillify + check-resolvable is the right pair
Hermes and similar agent frameworks auto-create skills as a background behavior. That's fine until you don't know what the agent shipped — checklists decay, tests drift, resolver entries get stale.
Gbrain ships the same capability as two user-controlled tools:
- /skillify builds the checklist and helps you fill in the gaps.
- gbrain check-resolvable validates the whole skill tree: reachability, MECE, DRY, gap detection, orphaned skills.
You decide when and what. The human keeps judgment; the tooling keeps the checklist honest. In practice this combo produces zero orphaned skills and ensures every feature ships with tests, evals, resolver triggers, and evals of those triggers.
Output Format
A skillify run produces, in order:
- An audit printout listing which of the 10 items exist and which are missing for the target feature.
- The files created to close each gap (SKILL.md, test files, resolver entries).
- The final gbrain check-resolvable output confirming reachability.
- A one-line summary of the resulting skill completeness score (N/10).