Claude-skill-registry k8s

Kubernetes ops skill for deploying, operating, and troubleshooting services on Kubernetes. Use for tasks like writing manifests/Helm, configuring deployments/services/ingress, autoscaling, observability, RBAC, secrets/configmaps, rollout/rollback, incident debugging, and production readiness checks.

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/k8s" ~/.claude/skills/majiayu000-claude-skill-registry-k8s && rm -rf "$T"

manifest: skills/data/k8s/SKILL.md

k8s

Use this skill for Kubernetes 运维与发布相关工作。

Defaults / assumptions to confirm

Cluster type: managed (EKS/GKE/ACK) vs self-hosted
Packaging: raw YAML vs Helm vs Kustomize
Ingress: NGINX/ALB/APISIX/Istio
Observability stack: Prometheus/Grafana, Loki/ELK, tracing

Workflow

Understand service requirements

Ports, protocols, health checks, resources (CPU/mem), storage needs.
SLOs: latency, availability, RPO/RTO.
Dependencies: DB, cache, MQ, external APIs.

Deployment design

Use
```
Deployment
```
for stateless;
```
StatefulSet
```
for stable identities/storage.
Define
```
readinessProbe
```
and
```
livenessProbe
```
(and
```
startupProbe
```
if needed).
Set
```
resources.requests/limits
```
and choose appropriate QoS.
Use
```
PodDisruptionBudget
```
for availability during maintenance.

Config & secrets

Config:
```
ConfigMap
```
(non-sensitive), mounted or env.
Secrets:
```
Secret
```
(sensitive) + external secret manager if available.
Never commit plaintext secrets; prefer sealed/external secrets.

Networking

```
Service
```
types and DNS.
```
Ingress
```
/Gateway routing, TLS termination, timeouts.
NetworkPolicy if cluster enforces it.

Scaling & resilience

```
HPA
```
based on CPU/memory/custom metrics.
Graceful shutdown (
```
preStop
```
, terminationGracePeriodSeconds).
Retry/backoff at client; avoid retry storms.

Observability

Standard logs with correlation IDs.
Metrics: RPS, p95 latency, error rate, saturation.
Alerts and dashboards; runbook links.

Release operations

Rolling updates, canary/blue-green if needed.
```
kubectl rollout status
```
+ rollback plan.
Post-deploy verification checks and smoke tests.

Troubleshooting checklist

```
kubectl get/describe
```
pods, events, and
```
logs
```
.
Check probes, image pull, env/config, DNS, network, and resource throttling.
For performance: node pressure, HPA behavior, GC/heap, connection pool limits.

Output expectations when making changes

Provide manifests (or Helm values/templates) + brief deployment notes.
Include resource sizing rationale and probe settings.
Include rollback instructions and verification steps.