Claude-skill-registry infrastructure-skill-builder
Transform infrastructure documentation, runbooks, and operational knowledge into reusable Claude Code skills. Convert Proxmox configs, Docker setups, Kubernetes deployments, and cloud infrastructure patterns into structured, actionable skills.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/infrastructure-skill-builder" ~/.claude/skills/majiayu000-claude-skill-registry-infrastructure-skill-builder && rm -rf "$T"
skills/data/infrastructure-skill-builder/SKILL.md
Infrastructure Skill Builder
Convert your infrastructure documentation into powerful, reusable Claude Code skills.
Overview
Infrastructure knowledge is often scattered across:
- README files
- Runbooks and wiki pages
- Configuration files
- Troubleshooting guides
- Team Slack/Discord history
- Mental models of senior engineers
This skill helps you systematically capture that knowledge as Claude Code skills for:
- Faster onboarding
- Consistent operations
- Disaster recovery
- Knowledge preservation
- Team scaling
When to Use
Use this skill when:
- Documenting complex infrastructure setups
- Creating runbooks for operations teams
- Onboarding new team members to infrastructure
- Preserving expert knowledge before team changes
- Standardizing infrastructure operations
- Building an organizational infrastructure library
- Migrating from manual to automated operations
Skill Extraction Process
Step 1: Identify Infrastructure Domains
Common domains:
- Container Orchestration: Docker, Kubernetes, Proxmox LXC
- Cloud Platforms: AWS, GCP, Azure, DigitalOcean
- Databases: PostgreSQL, MongoDB, Redis, MySQL
- Web Servers: Nginx, Apache, Caddy, Traefik
- Monitoring: Prometheus, Grafana, ELK Stack
- CI/CD: Jenkins, GitLab CI, GitHub Actions
- Networking: VPNs, Load Balancers, DNS, Firewalls
- Storage: S3, MinIO, NFS, Ceph
- Security: Authentication, SSL/TLS, Firewalls
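As a rough starting point for this step, an existing repository can be scanned for well-known file names to see which domains it already touches. The sketch below is an illustration only: the function names `classify_file` and `inventory_domains` and the file-name patterns are assumptions made for this example, not part of the skill, and the mapping is far from exhaustive.

```bash
#!/bin/bash
# classify_file: map a file name to one of the domains listed above.
# The patterns are illustrative assumptions; extend them for your own stack.
classify_file() {
  case "$(basename "$1")" in
    docker-compose*.y*ml|Dockerfile*)    echo "Container Orchestration" ;;
    nginx*.conf|Caddyfile|traefik*.y*ml) echo "Web Servers" ;;
    prometheus*.y*ml)                    echo "Monitoring" ;;
    .gitlab-ci.yml|Jenkinsfile)          echo "CI/CD" ;;
    postgresql.conf|redis.conf|my.cnf)   echo "Databases" ;;
    *)                                   echo "Unclassified" ;;
  esac
}

# inventory_domains: count files per domain under a directory (default: .).
inventory_domains() {
  local dir="${1:-.}"
  find "$dir" -type f | while IFS= read -r f; do
    classify_file "$f"
  done | sort | uniq -c | sort -rn
}
```

Running `inventory_domains path/to/repo` prints one line per domain with a file count, which gives a first guess at which skills are worth extracting.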
Step 2: Extract Core Operations
For each domain, document:
- Setup/Provisioning: How to create new instances
- Configuration: How to configure for different use cases
- Operations: Day-to-day management tasks
- Troubleshooting: Common issues and resolutions
- Scaling: How to scale up/down
- Backup/Recovery: Disaster recovery procedures
- Monitoring: Health checks and alerts
- Security: Security best practices
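The list above can double as a coverage check for a draft skill. The following is a minimal sketch, assuming each area shows up as a Markdown heading that contains the area's name; the function name `missing_areas` is hypothetical and not part of the skill.

```bash
#!/bin/bash
# missing_areas: print every operation area from the list above that has no
# matching Markdown heading in the given draft skill file.
# Assumption: an area counts as covered if any heading line contains its name.
missing_areas() {
  local file="$1" area
  for area in "Setup" "Configuration" "Operations" "Troubleshooting" \
              "Scaling" "Backup" "Monitoring" "Security"; do
    # -q: quiet, -i: ignore case, -E: extended regex
    grep -qiE "^#+ .*${area}" "$file" || echo "$area"
  done
}
```

For example, `missing_areas skills/nginx-manager/SKILL.md` (a hypothetical path) prints nothing when all eight areas have a heading.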
Infrastructure Skill Template
---
name: [infrastructure-component]-manager
description: Expert guidance for [component] management, provisioning, troubleshooting, and operations
license: MIT
tags: [infrastructure, [component], operations, troubleshooting]
---

# [Component] Manager

Expert knowledge for managing [component] infrastructure.

## Authentication & Access

### Access Methods

```bash
# How to access the infrastructure component
ssh user@host
# or
kubectl config use-context cluster-name
```
Credentials & Configuration
- Where credentials are stored
- How to configure access
- Common authentication issues
Architecture Overview
Component Topology
- How components are organized
- Network topology
- Resource allocation
- Redundancy setup
Key Resources
- Resource 1: Purpose and specs
- Resource 2: Purpose and specs
- Resource 3: Purpose and specs
Common Operations
Operation 1: [e.g., Create New Instance]
# Step-by-step commands
command1 --flags
command2 --flags
# Verification
verify-command
Operation 2: [e.g., Update Configuration]
# Commands and explanations
Operation 3: [e.g., Scale Resources]
# Commands and explanations
Monitoring & Health Checks
Check System Status
# Health check commands
status-command
# Expected output
# What healthy output looks like
Common Metrics
- Metric 1: What it means, normal range
- Metric 2: What it means, normal range
- Metric 3: What it means, normal range
Troubleshooting
Issue 1: [Common Problem]
Symptoms: What you observe
Cause: Why it happens
Fix: Step-by-step resolution
# Fix commands
Issue 2: [Another Problem]
Symptoms:
Cause:
Fix:
# Fix commands
Backup & Recovery
Backup Procedures
# How to backup
backup-command
# Verification
verify-backup
Recovery Procedures
# How to restore
restore-command
# Verification
verify-restore
Security Best Practices
- Security practice 1
- Security practice 2
- Security practice 3
Quick Reference
| Task | Command |
|---|---|
| Task 1 | |
| Task 2 | |
| Task 3 | |
Additional Resources
- Official documentation links
- Related skills
- External references
## Real-World Example: Proxmox Skill

Based on the proxmox-auth skill in this repository:

### Extracted Knowledge

**From**: Proxmox VE cluster documentation + operational experience

**Structured as**:

1. **Authentication**: SSH access patterns, node IPs
2. **Architecture**: Cluster topology (2 nodes, resources)
3. **Operations**: Container/VM management commands
4. **Troubleshooting**: Common errors and fixes
5. **Networking**: Bridge configuration, IP management
6. **GPU Passthrough**: Special container configurations

**Result**: Comprehensive skill covering:

- Quick access to any node
- Container lifecycle management
- GPU-accelerated containers
- Network troubleshooting
- Backup procedures
- Common gotchas and solutions

## Extraction Scripts

### Extract from Runbooks

```bash
#!/bin/bash
# extract-from-runbook.sh - Convert runbook to skill

RUNBOOK_FILE="$1"
SKILL_NAME="$2"

if [ -z "$RUNBOOK_FILE" ] || [ -z "$SKILL_NAME" ]; then
    echo "Usage: $0 <runbook.md> <skill-name>"
    exit 1
fi

SKILL_DIR="skills/$SKILL_NAME"
mkdir -p "$SKILL_DIR"

# Extract sections from runbook
cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: $SKILL_NAME
description: $(head -5 "$RUNBOOK_FILE" | grep -v "^#" | head -1 | xargs)
license: MIT
extracted-from: $RUNBOOK_FILE
---

# ${SKILL_NAME^}

$(cat "$RUNBOOK_FILE")

---

**Note**: This skill was auto-extracted from runbook documentation. Review and refine before use.
EOF

echo "✓ Created skill: $SKILL_DIR/SKILL.md"
echo "Review and edit to add:"
echo "  - Metadata and tags"
echo "  - Troubleshooting section"
echo "  - Quick reference"
echo "  - Examples"
```
Extract from Docker Compose
#!/bin/bash
# docker-compose-to-skill.sh - Extract skill from docker-compose.yaml

COMPOSE_FILE="${1:-docker-compose.yaml}"
# Quote the nested substitution so directory names with spaces survive
PROJECT_NAME=$(basename "$(pwd)")
SKILL_DIR="skills/docker-$PROJECT_NAME"
mkdir -p "$SKILL_DIR"

# Extract services
SERVICES=$(yq eval '.services | keys | .[]' "$COMPOSE_FILE")

# The heredoc is unquoted, so $(...) runs at generation time and
# backticks must be escaped to appear literally in the skill file
cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: docker-$PROJECT_NAME
description: Docker Compose configuration and management for $PROJECT_NAME
license: MIT
---

# Docker $PROJECT_NAME

Manage Docker Compose stack for $PROJECT_NAME.

## Services

$(yq eval '.services | to_entries | .[] | "### " + .key + "\n" + (.value.image // "custom") + "\n"' "$COMPOSE_FILE")

## Quick Start

\`\`\`bash
# Start all services
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f

# Stop all services
docker-compose down
\`\`\`

## Service Details

### Ports
$(yq eval '.services | to_entries | .[] | select(.value.ports) | "- **" + .key + "**: " + (.value.ports | join(", "))' "$COMPOSE_FILE")

### Volumes
$(yq eval '.services | to_entries | .[] | select(.value.volumes) | "- **" + .key + "**: " + (.value.volumes | join(", "))' "$COMPOSE_FILE")

## Configuration

See \`$COMPOSE_FILE\` for full configuration.

## Common Operations

### Restart Service
\`\`\`bash
docker-compose restart SERVICE_NAME
\`\`\`

### Update Service
\`\`\`bash
docker-compose pull SERVICE_NAME
docker-compose up -d SERVICE_NAME
\`\`\`

### View Service Logs
\`\`\`bash
docker-compose logs -f SERVICE_NAME
\`\`\`

## Troubleshooting

### Service Won't Start
1. Check logs: \`docker-compose logs SERVICE_NAME\`
2. Verify ports not in use: \`netstat -tulpn | grep PORT\`
3. Check disk space: \`df -h\`

### Network Issues
\`\`\`bash
# Recreate network
docker-compose down
docker network prune
docker-compose up -d
\`\`\`
EOF

echo "✓ Created skill from docker-compose.yaml"
Extract from Kubernetes Manifests
#!/bin/bash
# k8s-to-skill.sh - Extract skill from Kubernetes manifests

K8S_DIR="${1:-.}"
# Quote the nested substitution so directory names with spaces survive
APP_NAME="${2:-$(basename "$(pwd)")}"
SKILL_DIR="skills/k8s-$APP_NAME"
mkdir -p "$SKILL_DIR"

# The heredoc is unquoted, so $(...) runs at generation time and
# backticks must be escaped to appear literally in the skill file
cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: k8s-$APP_NAME
description: Kubernetes deployment and management for $APP_NAME
license: MIT
---

# Kubernetes $APP_NAME

Manage Kubernetes resources for $APP_NAME.

## Resources

$(find "$K8S_DIR" -name "*.yaml" -o -name "*.yml" | while IFS= read -r file; do
    KIND=$(yq eval '.kind' "$file" 2>/dev/null)
    NAME=$(yq eval '.metadata.name' "$file" 2>/dev/null)
    echo "- **$KIND**: $NAME ($(basename "$file"))"
done)

## Deployment

### Apply All Resources
\`\`\`bash
kubectl apply -f $K8S_DIR/
\`\`\`

### Check Status
\`\`\`bash
# Pods
kubectl get pods -l app=$APP_NAME

# Services
kubectl get svc -l app=$APP_NAME

# Deployments
kubectl get deploy -l app=$APP_NAME
\`\`\`

## Common Operations

### Scale Deployment
\`\`\`bash
kubectl scale deployment $APP_NAME --replicas=3
\`\`\`

### Update Image
\`\`\`bash
kubectl set image deployment/$APP_NAME container=new-image:tag
\`\`\`

### View Logs
\`\`\`bash
kubectl logs -f deployment/$APP_NAME
\`\`\`

### Port Forward
\`\`\`bash
kubectl port-forward svc/$APP_NAME 8080:80
\`\`\`

## Troubleshooting

### Pod Not Starting
\`\`\`bash
# Check pod events
kubectl describe pod POD_NAME

# Check logs
kubectl logs POD_NAME

# Previous instance logs
kubectl logs POD_NAME --previous
\`\`\`

### Service Not Reachable
\`\`\`bash
# Check endpoints
kubectl get endpoints $APP_NAME

# Check service
kubectl describe svc $APP_NAME

# Test from another pod
kubectl run -it --rm debug --image=busybox --restart=Never -- wget -O- http://$APP_NAME
\`\`\`

## Quick Reference

| Task | Command |
|------|---------|
| Apply | \`kubectl apply -f $K8S_DIR/\` |
| Status | \`kubectl get all -l app=$APP_NAME\` |
| Logs | \`kubectl logs -f deployment/$APP_NAME\` |
| Scale | \`kubectl scale deployment $APP_NAME --replicas=N\` |
| Delete | \`kubectl delete -f $K8S_DIR/\` |
EOF

echo "✓ Created Kubernetes skill"
Infrastructure Patterns to Capture
Pattern 1: SSH Access Matrix
## SSH Access

| Host | IP | Purpose | Access |
|------|--------|---------|--------|
| node1 | 192.168.1.10 | Primary | `ssh node1` |
| node2 | 192.168.1.11 | Secondary | `ssh node2` |
| bastion | 203.0.113.5 | Jump host | `ssh -J bastion node1` |
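An access matrix like the one above only yields short commands such as `ssh node1` if a matching client configuration exists. The sketch below shows one way to generate `~/.ssh/config` stanzas from the same data. The function name `ssh_config_entry` is hypothetical, and routing a node through the bastion via `ProxyJump` is an assumption based on the `ssh -J bastion node1` row.

```bash
#!/bin/bash
# ssh_config_entry: print an ssh_config stanza for one row of the matrix.
#   $1 = host alias, $2 = IP address, $3 = optional jump host alias
ssh_config_entry() {
  local name="$1" ip="$2" jump="${3:-}"
  printf 'Host %s\n    HostName %s\n' "$name" "$ip"
  if [ -n "$jump" ]; then
    # ProxyJump is the config-file equivalent of `ssh -J`
    printf '    ProxyJump %s\n' "$jump"
  fi
}
```

Calling `ssh_config_entry node1 192.168.1.10 bastion` prints a stanza that can be appended to `~/.ssh/config` after review.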
Pattern 2: Service Port Mapping
## Service Ports

| Service | Internal | External | Protocol |
|---------|----------|----------|----------|
| Web | 8080 | 80 | HTTP |
| API | 3000 | 443 | HTTPS |
| DB | 5432 | - | TCP |
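Rows for a port table like this can be derived from Compose-style `HOST:CONTAINER` mappings instead of being typed by hand. This is a minimal sketch with a hypothetical helper named `port_row`. It treats a bare port as internal-only, which is a simplification chosen to match the `-` in the DB row, and it does not handle protocol suffixes such as `/udp`.

```bash
#!/bin/bash
# port_row: turn a "HOST:CONTAINER" or "IP:HOST:CONTAINER" mapping into a
# Markdown table row: | Service | Internal | External | Protocol |
#   $1 = service name, $2 = port mapping, $3 = protocol label (default TCP)
port_row() {
  local service="$1" mapping="$2" protocol="${3:-TCP}"
  local internal external rest
  internal="${mapping##*:}"          # text after the last colon
  case "$mapping" in
    *:*) rest="${mapping%:*}"        # drop ":CONTAINER"
         external="${rest##*:}" ;;   # drop an optional "IP:" prefix
    *)   external="-" ;;             # assumption: bare port is not exposed
  esac
  printf '| %s | %s | %s | %s |\n' "$service" "$internal" "$external" "$protocol"
}
```

For instance, `port_row Web 80:8080 HTTP` reproduces the Web row of the table above.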
Pattern 3: Configuration Files
## Configuration Locations

### Application Config
- Path: `/etc/app/config.yaml`
- Format: YAML
- Requires restart: Yes

### Database Config
- Path: `/var/lib/postgres/postgresql.conf`
- Format: INI
- Requires restart: Yes
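Documented paths drift as systems change, so it can help to check them against a live host before publishing a skill. The following sketch uses a hypothetical function named `report_config_paths`; the paths passed to it are whatever the skill documents.

```bash
#!/bin/bash
# report_config_paths: print "ok" or "missing" for each documented path.
# Returns non-zero if at least one path does not exist on this host.
report_config_paths() {
  local path status=0
  for path in "$@"; do
    if [ -e "$path" ]; then
      printf 'ok       %s\n' "$path"
    else
      printf 'missing  %s\n' "$path"
      status=1
    fi
  done
  return "$status"
}
```

A call such as `report_config_paths /etc/app/config.yaml /var/lib/postgres/postgresql.conf` flags any entry in the pattern above that no longer matches the host.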
Pattern 4: Command Workflows
## Deployment Workflow

1. **Backup current state**
   ```bash
   ./backup.sh
   ```

2. **Pull latest code**
   ```bash
   git pull origin main
   ```

3. **Build application**
   ```bash
   docker build -t app:latest .
   ```

4. **Deploy**
   ```bash
   docker-compose up -d
   ```

5. **Verify**
   ```bash
   curl http://localhost/health
   ```
## Best Practices

### ✅ DO

1. **Document assumptions** - What's required before operations
2. **Include verification** - How to verify each operation succeeded
3. **Add troubleshooting** - Common issues and fixes
4. **Show outputs** - Expected command outputs
5. **Link resources** - Related documentation and skills
6. **Version information** - Software versions, configurations
7. **Security notes** - Security implications of operations
8. **Update regularly** - Keep skills current with infrastructure

### ❌ DON'T

1. **Don't hardcode secrets** - Use placeholders or env vars
2. **Don't skip context** - Explain why, not just how
3. **Don't assume knowledge** - Explain terminology
4. **Don't omit edge cases** - Document special scenarios
5. **Don't forget cleanup** - Include teardown procedures
6. **Don't ignore dependencies** - Document prerequisites
7. **Don't skip testing** - Verify all commands work
8. **Don't leave TODO** - Complete all sections

## Quality Checklist

- [ ] Clear component description
- [ ] Authentication/access documented
- [ ] Architecture overview provided
- [ ] Common operations with examples
- [ ] Troubleshooting section complete
- [ ] Health checks documented
- [ ] Backup/recovery procedures
- [ ] Security considerations noted
- [ ] Quick reference table
- [ ] All commands tested
- [ ] No hardcoded secrets
- [ ] Links to resources

## Quick Start Workflow

```bash
# 1. Identify infrastructure component
COMPONENT="nginx-reverse-proxy"

# 2. Gather documentation
# - Collect README files
# - Export wiki pages
# - Capture team knowledge
# - Document current setup

# 3. Create skill structure
mkdir -p skills/$COMPONENT

# 4. Fill in template
# Use the Infrastructure Skill Template above

# 5. Test all commands
# Verify every command in skill works

# 6. Review and refine
# Have team review for completeness

# 7. Commit to repository
git add skills/$COMPONENT
git commit -m "docs: Add $COMPONENT infrastructure skill"
```
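The "No hardcoded secrets" checklist item can be partly automated before the commit step. The sketch below is a rough heuristic, not a replacement for a dedicated secret scanner: the function name `scan_for_secrets` and the three patterns are assumptions chosen for illustration, and values that start with `$`, `<`, or `{` are treated as placeholders.

```bash
#!/bin/bash
# scan_for_secrets: print lines in a skill file that look like hardcoded
# credentials. Exit status follows grep: 0 if something was found, 1 if not.
scan_for_secrets() {
  local file="$1"
  # -n: show line numbers, -E: extended regex, -i: ignore case
  grep -nEi \
    -e '(password|passwd|secret|token|api[_-]?key)[[:space:]]*[:=][[:space:]]*[^[:space:]$<{]' \
    -e 'AKIA[0-9A-Z]{16}' \
    -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
    "$file"
}
```

A possible use is `scan_for_secrets skills/$COMPONENT/SKILL.md && echo "Review the lines above before committing"`, with every hit still checked by a person since the patterns produce both false positives and misses.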
Version: 1.0.0
Author: Harvested from proxmox-auth skill pattern
Last Updated: 2025-11-18
License: MIT

Key Principle: Convert tribal knowledge into structured, searchable, actionable skills.
Examples in This Repository
- proxmox-auth: Proxmox VE cluster management
- docker-*: Docker-based infrastructure
- cloudflare-*: Cloudflare infrastructure services
Transform your infrastructure documentation into skills today! 🏗️