Claude-skill-registry k8s

Kubernetes ops skill for deploying, operating, and troubleshooting services on Kubernetes. Use for tasks like writing manifests/Helm, configuring deployments/services/ingress, autoscaling, observability, RBAC, secrets/configmaps, rollout/rollback, incident debugging, and production readiness checks.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/k8s" ~/.claude/skills/majiayu000-claude-skill-registry-k8s && rm -rf "$T"
manifest: skills/data/k8s/SKILL.md
source content

k8s

Use this skill for Kubernetes operations and release-related work.

Defaults / assumptions to confirm

  • Cluster type: managed (EKS/GKE/ACK) vs self-hosted
  • Packaging: raw YAML vs Helm vs Kustomize
  • Ingress: NGINX/ALB/APISIX/Istio
  • Observability stack: Prometheus/Grafana, Loki/ELK, tracing

Workflow

  1. Understand service requirements
  • Ports, protocols, health checks, resources (CPU/mem), storage needs.
  • SLOs: latency, availability, RPO/RTO.
  • Dependencies: DB, cache, MQ, external APIs.
  2. Deployment design
  • Use Deployment for stateless workloads; StatefulSet for stable identities/storage.
  • Define readinessProbe and livenessProbe (and startupProbe if needed).
  • Set resources.requests/limits and choose an appropriate QoS class.
  • Use PodDisruptionBudget for availability during maintenance.
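As a rough sketch of the design points above, the Python below builds a Deployment and a PodDisruptionBudget as plain dicts and estimates the QoS class the pod would receive. The service name, image, probe paths (/readyz, /healthz), and resource figures are illustrative assumptions, not values from this skill. The QoS check compares quantities as strings, so it only approximates what Kubernetes computes.

```python
import json


def qos_class(containers):
    """Approximate the pod QoS class from container CPU/memory settings.

    Quantities are compared as strings, so "500m" and "0.5" would be
    treated as different here even though Kubernetes considers them equal.
    """
    any_set = False
    all_equal = True
    for container in containers:
        res = container.get("resources", {})
        requests = res.get("requests", {})
        limits = res.get("limits", {})
        for key in ("cpu", "memory"):
            limit = limits.get(key)
            # A missing request defaults to the limit when a limit is set.
            request = requests.get(key, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                all_equal = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_equal else "Burstable"


def build_deployment(name, image, port, replicas=3):
    """Return a Deployment manifest with probes and resource settings."""
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Probe paths are hypothetical; use whatever the service exposes.
        "readinessProbe": {
            "httpGet": {"path": "/readyz", "port": port},
            "periodSeconds": 5,
            "failureThreshold": 3,
        },
        "livenessProbe": {
            "httpGet": {"path": "/healthz", "port": port},
            "periodSeconds": 10,
            "failureThreshold": 3,
        },
        # Memory request == limit, CPU request < limit: Burstable QoS.
        "resources": {
            "requests": {"cpu": "250m", "memory": "256Mi"},
            "limits": {"cpu": "500m", "memory": "256Mi"},
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


def build_pdb(name, min_available=2):
    """Return a PodDisruptionBudget selecting the same app label."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": name}},
        },
    }


if __name__ == "__main__":
    dep = build_deployment("demo-api", "registry.example.com/demo-api:1.0.0", 8080)
    bundle = {"apiVersion": "v1", "kind": "List", "items": [dep, build_pdb("demo-api")]}
    print(json.dumps(bundle, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed List could be piped to kubectl apply -f - once the placeholder values are replaced.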
  3. Config & secrets
  • Config: ConfigMap (non-sensitive), mounted as files or injected as env vars.
  • Secrets: Secret (sensitive) + external secret manager if available.
  • Never commit plaintext secrets; prefer sealed/external secrets.
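One way to keep the ConfigMap/Secret split honest is to route settings by key name before generating manifests. The sketch below is a minimal illustration under assumptions: the key-name pattern and the example settings are hypothetical. Note that base64 in a Secret is only an encoding, not encryption, so the generated Secret still should not be committed.

```python
import base64
import re

# Hypothetical heuristic for spotting sensitive keys; tune it per project.
SENSITIVE = re.compile(r"(password|passwd|secret|token|api[_-]?key)", re.IGNORECASE)


def split_config(settings):
    """Split flat settings into (configmap_data, secret_data) by key name."""
    config, secret = {}, {}
    for key, value in settings.items():
        target = secret if SENSITIVE.search(key) else config
        target[key] = str(value)
    return config, secret


def build_configmap(name, data):
    """Return a ConfigMap manifest holding non-sensitive values."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(data),
    }


def build_secret(name, data):
    """Return an Opaque Secret manifest with base64-encoded values."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in data.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


if __name__ == "__main__":
    config, secret = split_config({"LOG_LEVEL": "info", "DB_PASSWORD": "change-me"})
    print(build_configmap("demo-api-config", config))
    print(sorted(build_secret("demo-api-secret", secret)["data"]))
```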
  4. Networking
  • Service types and DNS.
  • Ingress/Gateway routing, TLS termination, timeouts.
  • NetworkPolicy if the cluster enforces it.
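For the Service and DNS point, a small sketch may help: in-cluster clients normally reach a Service at name.namespace.svc.cluster-domain, where the cluster domain is commonly cluster.local but can differ per cluster. The service name, namespace, and ports below are illustrative assumptions.

```python
def service_dns(name, namespace="default", cluster_domain="cluster.local"):
    """Return the fully qualified in-cluster DNS name of a Service."""
    return f"{name}.{namespace}.svc.{cluster_domain}"


def build_service(name, port, target_port, service_type="ClusterIP"):
    """Return a Service manifest routing `port` to the pods' `target_port`."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": target_port, "protocol": "TCP"}],
        },
    }


if __name__ == "__main__":
    svc = build_service("demo-api", 80, 8080)
    print(svc["spec"]["type"], service_dns("demo-api", "prod"))
```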
  5. Scaling & resilience
  • HPA based on CPU/memory/custom metrics.
  • Graceful shutdown (preStop, terminationGracePeriodSeconds).
  • Retry/backoff at the client; avoid retry storms.
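To make the scaling and retry points concrete, the sketch below reproduces the documented HPA scaling rule (desired = ceil(current replicas × current metric / target metric), with a default tolerance of 0.1) and a full-jitter exponential backoff for clients. The replica bounds, base delay, and cap are assumed example values; the real HPA controller also applies stabilization windows and readiness handling that are not modeled here.

```python
import math
import random


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Estimate the replica count an HPA would ask for on one metric."""
    ratio = current_value / target_value
    # Within tolerance of the target, the HPA leaves the count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


def backoff_delay(attempt, base=0.1, cap=10.0, rng=random):
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)].

    Randomizing the whole interval spreads retries out, which helps
    avoid synchronized retry storms after an outage.
    """
    return rng.uniform(0, min(cap, base * 2 ** attempt))


if __name__ == "__main__":
    # 4 replicas at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    print([round(backoff_delay(n), 3) for n in range(5)])
```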
  6. Observability
  • Standard logs with correlation IDs.
  • Metrics: RPS, p95 latency, error rate, saturation.
  • Alerts and dashboards; runbook links.
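The metrics listed above can be illustrated with a small calculation over raw request samples. This is a sketch only: in practice these figures usually come from the metrics backend rather than hand-rolled code, the record fields (latency_ms, status) are hypothetical, and the percentile uses the nearest-rank method, which can differ slightly from histogram-based estimates.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests, window_seconds):
    """Compute RPS, p95 latency, and 5xx error rate for one time window."""
    latencies = [r["latency_ms"] for r in requests]
    errors = sum(1 for r in requests if r["status"] >= 500)
    return {
        "rps": len(requests) / window_seconds,
        "p95_ms": percentile(latencies, 95),
        "error_rate": errors / len(requests),
    }


if __name__ == "__main__":
    sample = [{"latency_ms": 10 * i, "status": 200} for i in range(1, 21)]
    sample[-1]["status"] = 500
    print(summarize(sample, window_seconds=10))
```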
  7. Release operations
  • Rolling updates, canary/blue-green if needed.
  • kubectl rollout status + rollback plan.
  • Post-deploy verification checks and smoke tests.
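A rollback plan can be as simple as "wait for the rollout, undo it if it does not complete". The sketch below wraps kubectl rollout status and kubectl rollout undo; it assumes kubectl is on PATH with a working context, and the deployment name, namespace, and 120s timeout are placeholder values. The run parameter is injectable so the logic can be exercised without a cluster.

```python
import subprocess


def rollout_commands(deployment, namespace="default", timeout="120s"):
    """Return the (status, undo) kubectl argument lists for a Deployment."""
    target = f"deployment/{deployment}"
    status = ["kubectl", "rollout", "status", target,
              "-n", namespace, f"--timeout={timeout}"]
    undo = ["kubectl", "rollout", "undo", target, "-n", namespace]
    return status, undo


def verify_or_rollback(deployment, namespace="default", timeout="120s",
                       run=subprocess.run):
    """Wait for the rollout; roll back to the previous revision on failure."""
    status, undo = rollout_commands(deployment, namespace, timeout)
    # `kubectl rollout status` exits non-zero if the rollout fails or times out.
    if run(status).returncode == 0:
        return "rolled-out"
    run(undo)
    return "rolled-back"


if __name__ == "__main__":
    status_cmd, undo_cmd = rollout_commands("demo-api", "prod")
    print(" ".join(status_cmd))
    print(" ".join(undo_cmd))
```

Smoke tests would normally run after the "rolled-out" result and before the release is declared done.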
  8. Troubleshooting checklist
  • kubectl get/describe pods, events, and logs.
  • Check probes, image pull, env/config, DNS, network, and resource throttling.
  • For performance: node pressure, HPA behavior, GC/heap, connection pool limits.
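Part of the checklist above can be automated as a first-pass triage over the JSON from kubectl get pod NAME -o json. The sketch below maps a few common container state reasons to the next thing to check; the hint wording and the selection of reasons are assumptions for illustration, not an exhaustive diagnosis.

```python
# Common reasons reported in container statuses, with a suggested next step.
HINTS = {
    "ImagePullBackOff": "check image name/tag and imagePullSecrets",
    "ErrImagePull": "check image name/tag and imagePullSecrets",
    "CrashLoopBackOff": "check previous container logs, env/config, and probes",
    "CreateContainerConfigError": "check referenced ConfigMap/Secret names and keys",
    "OOMKilled": "raise the memory limit or reduce heap usage",
}


def diagnose(pod):
    """Return (container, reason, hint) tuples for a pod object (parsed JSON)."""
    findings = []
    status = pod.get("status", {})
    container_statuses = status.get("containerStatuses", [])
    if status.get("phase") == "Pending" and not container_statuses:
        findings.append(("pod", "Pending",
                         "check events for scheduling or node pressure"))
    for cs in container_statuses:
        waiting = cs.get("state", {}).get("waiting", {})
        last_terminated = cs.get("lastState", {}).get("terminated", {})
        for reason in (waiting.get("reason"), last_terminated.get("reason")):
            if reason in HINTS:
                findings.append((cs["name"], reason, HINTS[reason]))
    return findings


if __name__ == "__main__":
    example = {
        "status": {
            "phase": "Running",
            "containerStatuses": [{
                "name": "demo-api",
                "restartCount": 7,
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled"}},
            }],
        }
    }
    for finding in diagnose(example):
        print(finding)
```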

Output expectations when making changes

  • Provide manifests (or Helm values/templates) + brief deployment notes.
  • Include resource sizing rationale and probe settings.
  • Include rollback instructions and verification steps.