# SkillAnything

```bash
# Clone the repository
git clone https://github.com/AgentSkillOS/SkillAnything

# Or install directly as a Claude Code skill
git clone --depth=1 https://github.com/AgentSkillOS/SkillAnything ~/.claude/skills/agentskillos-skillanything-skill-anything
```
Automatically generate production-ready Skills for any target — software, API, CLI tool, library, workflow, or web service. SkillAnything runs a 7-phase pipeline that analyzes your target, designs the skill architecture, implements it, generates test cases, benchmarks performance, optimizes the description, and packages for multiple agent platforms.
## Quick Start

**Fully automated (one command):**

Give SkillAnything a target and it handles everything:

- "Create a skill for the jq CLI tool"
- "Generate a skill for the Stripe API"
- "Turn this workflow into a multi-platform skill"

The pipeline runs all 7 phases automatically. Results land in `sa-workspace/`.
## The 7-Phase Pipeline

- **Phase 1: Analyze** → Detect target type, extract capabilities → `analysis.json`
- **Phase 2: Design** → Map capabilities to skill architecture → `architecture.json`
- **Phase 3: Implement** → Generate SKILL.md + scripts + references → complete skill directory
- **Phase 4: Test Plan** → Auto-generate eval cases + trigger queries → `evals.json`
- **Phase 5: Evaluate** → Benchmark with/without skill, grade results → `benchmark.json`
- **Phase 6: Optimize** → Improve description via train/test loop → optimized SKILL.md
- **Phase 7: Package** → Multi-platform distribution packages → `dist/`
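Assuming each phase writes its artifact into the workspace (only the file names above are confirmed by this README; the exact layout is a guess), `sa-workspace/` ends up looking roughly like this:

```
sa-workspace/
├── analysis.json       # Phase 1
├── architecture.json   # Phase 2
├── my-skill/           # Phase 3: SKILL.md + scripts + references
├── evals.json          # Phase 4
├── benchmark.json      # Phase 5
└── dist/               # Phase 7: per-platform packages
```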
See `METHODOLOGY.md` for the full pipeline specification.
## Usage Modes

### Auto Mode (default)

Runs all 7 phases end-to-end. Provide the target and SkillAnything does the rest:

> Target: "the httpie CLI tool" → analyzes `httpie --help` output, designs the command structure, generates the skill, creates tests, benchmarks, optimizes, and packages for all 4 platforms.
### Interactive Mode

Set `auto_mode: false` in `config.yaml`. SkillAnything pauses after each phase for review:
- Phase 1 → "Here's what I found about the target. Look right?"
- Phase 2 → "Here's the proposed skill architecture. Any changes?"
- Phase 3 → "Draft skill ready for review."
- ...continues with user feedback at each step
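To enable this mode, flip the one setting this README confirms in `config.yaml`:

```yaml
# config.yaml
auto_mode: false   # pause after each phase for review
```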
### Single Phase Mode

Run any phase independently:

```bash
python -m scripts.analyze_target --target "jq" --output analysis.json                                 # Phase 1
python -m scripts.design_skill --analysis analysis.json --output architecture.json                    # Phase 2
python -m scripts.init_skill my-skill --template cli --output ./out                                   # Phase 3
python -m scripts.generate_tests --analysis analysis.json --skill-path ./out/my-skill                 # Phase 4
python -m scripts.run_eval --eval-set evals.json --skill-path ./out/my-skill                          # Phase 5
python -m scripts.run_loop --eval-set trigger-evals.json --skill-path ./out/my-skill --model <model>  # Phase 6
python -m scripts.package_multiplatform ./out/my-skill --platforms claude-code,openclaw,codex         # Phase 7
```
## Configuration

Edit `config.yaml` to customize the pipeline. Key settings:
| Setting | Default | Description |
|---|---|---|
| `auto_mode` | `true` | Run all phases automatically or pause for review after each |
| Target type override | auto-detect | Force the target type: `api`, `cli`, `library`, `workflow`, or `service` |
| Platforms | all 4 | Which platforms to package for |
| Primary platform | `claude-code` | Primary output platform |
| Optimization rounds | 5 | Max description optimization rounds |
| PyArmor obfuscation | | Obfuscate original scripts with PyArmor |
See `references/schemas.md` for the complete configuration schema.
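As a rough illustration, a `config.yaml` might look like the sketch below. Only `auto_mode` and `skip_eval` are key names confirmed by this README; every other key is a hypothetical placeholder, so defer to `references/schemas.md` for the real schema.

```yaml
# Sketch only: auto_mode and skip_eval are the only key names
# confirmed by this README; the rest are hypothetical placeholders.
auto_mode: true                 # run all 7 phases without pausing
skip_eval: false                # set true to skip Phase 5 for rapid prototyping
target_type: cli                # hypothetical: force api/cli/library/workflow/service
platforms:                      # hypothetical: which platforms to package for
  - claude-code
  - openclaw
  - codex
  - generic
primary_platform: claude-code   # hypothetical: primary output platform
max_optimization_rounds: 5      # hypothetical: Phase 6 description loop cap
```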
## Platform Output
| Platform | Install Path | Package Format |
|---|---|---|
| Claude Code | `~/.claude/skills/` | Directory |
| OpenClaw | | Directory |
| Codex | | Directory + `openai.yaml` |
| Generic | anywhere | zip |
See `references/platform-formats.md` for platform-specific format details.
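A plausible `dist/` layout after Phase 7, assuming one subdirectory per platform (the per-platform paths here are guesses; only the package formats come from the table above):

```
dist/
├── claude-code/my-skill/        # directory install
├── openclaw/my-skill/           # directory install
├── codex/my-skill/              # directory + openai.yaml
└── generic/my-skill.zip         # zip archive
```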
## Evaluation and Benchmarking
SkillAnything uses the same eval system as the Anthropic skill-creator:
- Test cases with assertions → graded by `agents/grader.md`
- Benchmark comparing with-skill vs. baseline → `benchmark.json`
- Description optimization with train/test split → prevents overfitting
- Interactive viewer via `eval-viewer/generate_review.py`
The eval loop is optional (`skip_eval: true` in config) for rapid prototyping.
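This README doesn't show the `evals.json` schema, but since Phase 4 produces test cases with assertions for the grader, a single case presumably looks something like this (all field names hypothetical):

```json
{
  "cases": [
    {
      "query": "Filter this JSON array to objects where price > 10",
      "assertions": [
        "Uses jq with a select() filter",
        "Output is valid JSON"
      ]
    }
  ]
}
```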
## Scripts Reference

| Script | Phase | Purpose |
|---|---|---|
| `analyze_target` | 1 | Auto-detect and analyze target |
| `design_skill` | 2 | Generate skill architecture from analysis |
| `init_skill` | 3 | Scaffold skill directory from templates |
| `generate_tests` | 4 | Auto-generate test cases and trigger queries |
| `run_eval` | 5 | Test description triggering accuracy |
| | 5 | Aggregate benchmark statistics |
| `eval-viewer/generate_review.py` | 5-6 | Generate HTML optimization report |
| | 6 | AI-powered description improvement |
| `run_loop` | 6 | Full eval + improve optimization loop |
| | 7 | Validate SKILL.md structure |
| | 7 | Package for single platform |
| `package_multiplatform` | 7 | Package for all enabled platforms |
| | - | PyArmor wrapper for code protection |
## Agents
Read these when spawning specialized subagents:
| Agent | Purpose |
|---|---|
| | Phase 1: Target analysis instructions |
| | Phase 2: Skill architecture design |
| | Phase 3: Skill content writing |
| `agents/grader.md` | Phase 5: Eval assertion grading |
| | Phase 5: Blind A/B output comparison |
| | Phase 6: Description optimization orchestration |
| | Phase 7: Multi-platform packaging instructions |
## Target Types
SkillAnything auto-detects the target type and adapts its analysis:
| Type | Detection | Analysis Method |
|---|---|---|
| API | URL with `/api`, OpenAPI spec, swagger | Fetch spec, extract endpoints |
| CLI | Executable name, `--help` output | Run help, parse subcommands |
| Library | Package name, import path | Read docs, parse public API |
| Workflow | Step descriptions, sequence | Parse steps, map data flow |
| Service | URL, web interface | Scrape docs, identify actions |
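If detection picks the wrong type, you can force it. The troubleshooting note below mentions a `--target-type` override; attaching it to `analyze_target` as shown here is an assumption:

```bash
# Assumed flag placement: force "service" analysis for a docs-only website
python -m scripts.analyze_target --target "https://example.com" --target-type service --output analysis.json
```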
## Troubleshooting

- Phase 1 fails: Target not found or inaccessible → provide a `--target-type` override
- Low eval scores: Description too vague → run Phase 6 optimization
- Platform packaging errors: Missing required fields → check `references/platform-formats.md`
- PyArmor not found: Install with `pip install pyarmor`
## License

MIT License. See `NOTICE` for third-party attributions (CLI-Anything, Dazhuang Skill Creator, Anthropic Skill Creator).