Claude-skill-registry kubernetes-operations
Assist with Kubernetes interactions including debugging (kubectl logs, describe, exec, port-forward), resource management (deployments, services, configmaps, secrets), and cluster operations (scaling, rollouts, node management). Use when working with kubectl, pods, deployments, services, or troubleshooting Kubernetes issues.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/kubernetes-operations" ~/.claude/skills/majiayu000-claude-skill-registry-kubernetes-operations-63b4a9 && rm -rf "$T"
skills/data/kubernetes-operations/SKILL.md
Kubernetes Operations
Comprehensive kubectl assistance for debugging, resource management, and cluster operations with token-efficient scripts.
BEFORE YOU START
This skill prevents 5 common errors and saves ~70% tokens.
| Metric | Without Skill | With Skill |
|---|---|---|
| Pod Debugging | ~1200 tokens | ~400 tokens |
| Resource Listing | ~800 tokens | ~200 tokens |
| Cluster Health | ~1500 tokens | ~300 tokens |
Known Issues This Skill Prevents
- Running kubectl commands in wrong namespace/context
- Verbose output flooding context with unnecessary data
- Missing critical debugging steps (events, previous logs)
- Exposing secrets in plain text output
- Destructive operations without dry-run verification
Quick Start
Step 1: Verify Context
kubectl config current-context
kubectl config get-contexts
Why this matters: Running commands in the wrong cluster can cause production incidents.
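The context check can also be automated. The following is a minimal sketch, assuming a hypothetical allowlist of context names (`ALLOWED_CONTEXTS` and its entries are illustrative and not part of this skill); it refuses to proceed unless the active context is explicitly listed.

```python
import subprocess

# Hypothetical allowlist: replace with the contexts you consider safe to modify.
ALLOWED_CONTEXTS = {"dev-cluster", "staging-cluster"}


def current_context() -> str:
    """Return the active kubeconfig context via `kubectl config current-context`."""
    result = subprocess.run(
        ["kubectl", "config", "current-context"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def is_allowed(context: str, allowed: set[str] = ALLOWED_CONTEXTS) -> bool:
    """True only when the context is explicitly on the allowlist."""
    return context in allowed
```

A wrapper could call `is_allowed(current_context())` before any mutating command and abort when it returns `False`.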
Step 2: Debug a Pod
uv run scripts/debug_pod.py <pod-name> [-n namespace]
Why this matters: The script combines describe, logs, and events into a condensed summary, saving ~800 tokens.
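To illustrate the kind of condensing involved, here is a minimal sketch that reduces the JSON from `kubectl get pod <name> -o json` to a few lines. It is not the bundled `debug_pod.py`, whose implementation is not shown here; the function name `summarize_pod` is hypothetical, and only standard Pod status fields are read.

```python
def summarize_pod(pod: dict) -> list[str]:
    """Condense a Pod object (parsed JSON) to one line per container."""
    status = pod.get("status", {})
    lines = [f"pod={pod['metadata']['name']} phase={status.get('phase', 'Unknown')}"]
    for cs in status.get("containerStatuses", []):
        # The state object has exactly one key: running, waiting, or terminated.
        state = next(iter(cs.get("state", {})), "unknown")
        detail = cs.get("state", {}).get(state, {})
        line = f"  {cs['name']}: {state} restarts={cs.get('restartCount', 0)}"
        if detail.get("reason"):
            line += f" reason={detail['reason']}"
        # The previous termination explains most restart loops.
        last = cs.get("lastState", {}).get("terminated")
        if last:
            line += f" last_exit={last.get('exitCode')} ({last.get('reason', '')})"
        lines.append(line)
    return lines
```

Feeding it `json.loads(...)` of the kubectl output yields a short, fixed-size summary regardless of how verbose the full object is.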
Step 3: Check Cluster Health
uv run scripts/cluster_health.py
Why this matters: Quick overview of node status and unhealthy pods without verbose output.
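As a rough illustration of a node status overview, the sketch below lists nodes whose Ready condition is not "True", given the parsed output of `kubectl get nodes -o json`. The function name `unhealthy_nodes` is hypothetical and this is not the bundled `cluster_health.py`.

```python
def unhealthy_nodes(node_list: dict) -> list[str]:
    """Names of nodes that do not report a Ready condition with status "True"."""
    bad = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            bad.append(node["metadata"]["name"])
    return bad
```

Nodes with no conditions at all are treated as unhealthy, which errs on the side of reporting.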
Critical Rules
Always Do
- Always verify kubectl config current-context before operations
- Always use -n namespace to be explicit about target
- Always use --dry-run=client -o yaml before applying changes
- Always check events when debugging: kubectl get events --sort-by='.lastTimestamp'
- Always use --previous flag when pod is in CrashLoopBackOff
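The namespace and dry-run rules above can be enforced in code rather than by habit. This is a minimal sketch, assuming a hypothetical helper named `kubectl_cmd` that builds an argument list and refuses to do so without an explicit namespace.

```python
def kubectl_cmd(*args: str, namespace: str, dry_run: bool = False) -> list[str]:
    """Build a kubectl argv list with a mandatory, explicit namespace."""
    if not namespace:
        raise ValueError("namespace must be given explicitly")
    cmd = ["kubectl", *args, "-n", namespace]
    if dry_run:
        # Client-side dry run: print the object instead of sending it.
        cmd += ["--dry-run=client", "-o", "yaml"]
    return cmd
```

For example, `kubectl_cmd("apply", "-f", "app.yaml", namespace="web", dry_run=True)` returns a list suitable for `subprocess.run`, where `app.yaml` and `web` are placeholder values.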
Never Do
- Never run kubectl delete without --dry-run first in production
- Never output secrets without filtering: avoid kubectl get secret -o yaml
- Never assume default namespace - always specify -n
- Never ignore resource limits when debugging OOMKilled pods
- Never skip describe when logs show no errors
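One way to honor the secrets rule is to redact a Secret object before it is ever printed. The sketch below is an assumption about how that could look (the name `redact_secret` and the `<redacted>` placeholder are illustrative); it also drops the `kubectl.kubernetes.io/last-applied-configuration` annotation, which can carry the values of Secrets created with `kubectl apply`.

```python
LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"


def redact_secret(secret: dict) -> dict:
    """Return a copy of a Secret object with all values replaced by a placeholder."""
    redacted = dict(secret)
    for field in ("data", "stringData"):
        if secret.get(field):
            redacted[field] = {key: "<redacted>" for key in secret[field]}
    # `kubectl apply` stores the full manifest, values included, in this annotation.
    metadata = dict(secret.get("metadata") or {})
    annotations = dict(metadata.get("annotations") or {})
    if annotations.pop(LAST_APPLIED, None) is not None:
        metadata["annotations"] = annotations
        redacted["metadata"] = metadata
    return redacted
```

The input object is left unmodified, so the caller can still use the real values where that is genuinely required.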
Common Mistakes
Wrong:
kubectl logs my-pod
Correct:
kubectl logs my-pod -n my-namespace --tail=100 --timestamps
Why: The default namespace may not be the right one, unbounded logs flood the context, and timestamps help correlate log lines with events.
Known Issues Prevention
| Issue | Root Cause | Solution |
|---|---|---|
| CrashLoopBackOff | App crash on startup | Check kubectl logs --previous and kubectl describe for exit codes |
| ImagePullBackOff | Registry auth or image tag | Verify image exists and check pull secrets |
| Pending pods | No schedulable nodes | Check node resources and pod affinity/tolerations |
| OOMKilled | Memory limit exceeded | Check container limits vs actual usage with kubectl top pods |
| Connection refused | Service selector mismatch | Verify pod labels match service selector |
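The table above can be turned into a lookup that maps a pod's reported status to the matching hint. This is a minimal sketch under the assumption that the pod is available as parsed JSON from `kubectl get pod -o json`; `HINTS` and `hints_for` are hypothetical names, and only the issues visible in pod status are covered (a service selector mismatch does not appear there).

```python
HINTS = {
    "CrashLoopBackOff": "check previous logs and describe output for exit codes",
    "ImagePullBackOff": "verify the image exists and check pull secrets",
    "OOMKilled": "compare container memory limits with actual usage",
}


def hints_for(pod: dict) -> list[str]:
    """Map waiting/terminated reasons in a Pod's status to troubleshooting hints."""
    status = pod.get("status", {})
    found = []
    if status.get("phase") == "Pending":
        found.append("Pending: check node resources and pod affinity/tolerations")
    for cs in status.get("containerStatuses", []):
        reasons = [
            cs.get("state", {}).get("waiting", {}).get("reason"),
            cs.get("state", {}).get("terminated", {}).get("reason"),
            cs.get("lastState", {}).get("terminated", {}).get("reason"),
        ]
        for reason in reasons:
            if reason in HINTS:
                found.append(f"{cs['name']}: {reason}: {HINTS[reason]}")
    return found
```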
Debugging Workflows
Pod Not Starting
# 1. Get pod status and events
kubectl describe pod <name> -n <namespace>

# 2. Check logs (current or previous)
kubectl logs <name> -n <namespace> --tail=100
kubectl logs <name> -n <namespace> --previous  # If restarting

# 3. Check events for scheduling issues
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | grep <name>

# 4. Interactive debugging
kubectl exec -it <name> -n <namespace> -- /bin/sh
Service Connectivity
# 1. Verify service exists and has endpoints
kubectl get svc <name> -n <namespace>
kubectl get endpoints <name> -n <namespace>

# 2. Check pod labels match service selector
kubectl get pods -n <namespace> --show-labels

# 3. Test from within cluster (same namespace, so the short service name resolves)
kubectl run debug --rm -it --restart=Never --image=busybox -n <namespace> -- wget -qO- http://<service>:<port>

# 4. Port-forward for local testing
kubectl port-forward svc/<name> 8080:80 -n <namespace>
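Step 2 is easy to get wrong by eye when pods carry many labels. The following sketch shows the equality-based matching rule a Service selector applies, assuming the selector and pod list come from `kubectl get svc <name> -o json` and `kubectl get pods -o json`; the function names are hypothetical.

```python
def selector_matches(selector: dict, labels: dict) -> bool:
    """True when every key/value pair of the selector appears in the labels."""
    # A Service without a selector selects no pods automatically.
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def matching_pods(selector: dict, pod_list: dict) -> list[str]:
    """Names of pods in a parsed pod list whose labels satisfy the selector."""
    return [
        pod["metadata"]["name"]
        for pod in pod_list.get("items", [])
        if selector_matches(selector, pod["metadata"].get("labels") or {})
    ]
```

An empty result for a Service that should have backends points to a selector or label mismatch, which is consistent with an empty endpoints list in step 1.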
Resource Management
Deployments
# List deployments
kubectl get deployments -n <namespace>

# Scale
kubectl scale deployment <name> --replicas=3 -n <namespace>

# Rollout status
kubectl rollout status deployment/<name> -n <namespace>

# Rollback
kubectl rollout undo deployment/<name> -n <namespace>

# History
kubectl rollout history deployment/<name> -n <namespace>
ConfigMaps and Secrets
# List
kubectl get configmaps -n <namespace>
kubectl get secrets -n <namespace>

# View ConfigMap data
kubectl get configmap <name> -n <namespace> -o jsonpath='{.data}'

# View Secret keys (NOT values)
kubectl get secret <name> -n <namespace> -o jsonpath='{.data}' | jq 'keys'

# Create from file
kubectl create configmap <name> --from-file=<path> -n <namespace> --dry-run=client -o yaml
Cluster Operations
Node Management
# List nodes with status
kubectl get nodes -o wide

# Node details
kubectl describe node <name>

# Cordon (prevent scheduling)
kubectl cordon <node>

# Drain (evict pods)
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data

# Uncordon
kubectl uncordon <node>
Resource Usage
# Node resources
kubectl top nodes

# Pod resources
kubectl top pods -n <namespace>

# Sort by memory
kubectl top pods -n <namespace> --sort-by=memory
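When comparing the usage reported by `kubectl top pods` with a container's memory limit, both values are Kubernetes quantity strings such as `450Mi` or `1Gi`. Here is a minimal sketch of converting them to bytes and computing a usage ratio; it handles only the common binary and decimal suffixes listed in `_UNITS`, and the function names are hypothetical.

```python
# Common memory suffixes; binary (Ki, Mi, ...) and decimal (k, M, ...).
_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" to bytes."""
    # Try two-character suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _UNITS[suffix])
    return int(quantity)  # no suffix: plain bytes


def usage_ratio(used: str, limit: str) -> float:
    """Fraction of the memory limit currently in use."""
    return parse_memory(used) / parse_memory(limit)
```

A ratio close to 1.0 for a container that has been OOMKilled suggests the limit is too low for the workload rather than a one-off spike.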
Bundled Resources
Scripts
Located in scripts/:
- debug_pod.py: Comprehensive pod debugging with condensed output
- get_resources.py: Resource summary using jsonpath for minimal tokens
- cluster_health.py: Quick cluster status overview
References
Located in references/:
- kubectl-cheatsheet.md: Condensed command reference
- jsonpath-patterns.md: Common JSONPath expressions
- debugging-flowchart.md: Decision tree for pod issues
Note: For deep dives on specific topics, see the reference files above.
Dependencies
Required
| Package | Version | Purpose |
|---|---|---|
| kubectl | 1.25+ | Kubernetes CLI |
| jq | 1.6+ | JSON parsing for scripts |
Optional
| Package | Version | Purpose |
|---|---|---|
| k9s | 0.27+ | Terminal UI for Kubernetes |
| stern | 1.25+ | Multi-pod log tailing |
Official Documentation
Troubleshooting
kubectl command not found
Symptoms:
command not found: kubectl
Solution:
# macOS
brew install kubectl

# Verify
kubectl version --client
Context not set
Symptoms:
error: no context is currently set
Solution:
# List available contexts
kubectl config get-contexts

# Set context
kubectl config use-context <context-name>
Permission denied
Symptoms:
Error from server (Forbidden)
Solution:
# Check current user (requires kubectl 1.27+)
kubectl auth whoami

# Check permissions
kubectl auth can-i get pods -n <namespace>
kubectl auth can-i --list -n <namespace>
Timeout connecting to cluster
Symptoms:
Unable to connect to the server: dial tcp: i/o timeout
Solution:
# Check cluster endpoint
kubectl cluster-info

# Verify network connectivity
curl -k https://<cluster-api-endpoint>/healthz

# Check kubeconfig for the current context (credentials are redacted by default)
kubectl config view --minify
Setup Checklist
Before using this skill, verify:
- kubectl installed (kubectl version --client)
- Kubeconfig configured (~/.kube/config exists)
- Context set to correct cluster (kubectl config current-context)
- Permissions verified (kubectl auth can-i get pods)
- jq installed for JSON parsing (jq --version)
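The local parts of this checklist can be checked programmatically. The sketch below covers only tool presence and the kubeconfig file; the name `check_setup` is hypothetical, it assumes the default kubeconfig path unless one is passed in, and it does not account for a `KUBECONFIG` environment variable override.

```python
import shutil
from pathlib import Path
from typing import Optional


def check_setup(kubeconfig: Optional[Path] = None) -> dict[str, bool]:
    """Report whether kubectl and jq are on PATH and a kubeconfig file exists."""
    if kubeconfig is None:
        kubeconfig = Path.home() / ".kube" / "config"  # default location
    return {
        "kubectl": shutil.which("kubectl") is not None,
        "jq": shutil.which("jq") is not None,
        "kubeconfig": kubeconfig.is_file(),
    }
```

Context and permission checks still need a reachable cluster, so they are left to the kubectl commands in the list above.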