Claude-skill-registry helm-charts-audit
Audits Helm charts for anti-patterns, security issues, and best practice violations. Use when asked to audit, review, or check Helm chart quality. Generates a comprehensive report under reports/YYYY-MM-DD/helm-charts-audit.md. (project)
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/helm-charts-audit" ~/.claude/skills/majiayu000-claude-skill-registry-helm-charts-audit && rm -rf "$T"
skills/data/helm-charts-audit/SKILL.md
Purpose
Enforce Helm chart quality and security standards across the
helm-charts/ directory through automated checks.
What it checks (13 checks):
- Image Tags (no latest, mutable tags) - HIGH
- Security Context (runAsNonRoot, no privileged) - HIGH
- Resource Limits (requests, limits, memory) - HIGH
- RBAC Wildcards (no * permissions) - HIGH
- Health Probes (liveness, readiness) - HIGH
- Helm Lint (official helm validation) - HIGH
- Chart Metadata (apiVersion, version, maintainers) - MEDIUM
- Chart Structure (README, NOTES.txt, _helpers.tpl) - MEDIUM
- Dependencies (pinned versions, Chart.lock) - MEDIUM
- Deprecated APIs (no v1beta1, use stable APIs) - MEDIUM
- Argo Rollouts (strategy, analysis, steps) - MEDIUM
- Ingress TLS (certificates, annotations) - MEDIUM
- GPU Resources (nvidia.com/gpu, tolerations) - LOW
Running Checks
Full audit (all checks):
node .claude/skills/helm-charts-audit/scripts/run_all_checks.mjs
Generate report (all checks + markdown report):
node .claude/skills/helm-charts-audit/scripts/generate_report.mjs
Report saved to:
reports/YYYY-MM-DD/helm-charts-audit.md
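As a rough illustration of the dated report location, the path could be derived as follows. This is a sketch only: the helper name `reportPath` is hypothetical, and whether the real `generate_report.mjs` uses the local or the UTC date is an assumption (local is used here).

```javascript
// Hypothetical helper, not taken from the skill's scripts:
// builds reports/YYYY-MM-DD/helm-charts-audit.md for a given date.
function reportPath(date = new Date()) {
  // Local calendar fields are used; toISOString() would give the UTC date instead.
  const yyyy = date.getFullYear();
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `reports/${yyyy}-${mm}-${dd}/helm-charts-audit.md`;
}
```

Zero-padding the month and day keeps report directories sorted chronologically when listed by name.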
Individual checks:
node .claude/skills/helm-charts-audit/scripts/check_image_tags.mjs
node .claude/skills/helm-charts-audit/scripts/check_security_context.mjs
node .claude/skills/helm-charts-audit/scripts/check_resource_limits.mjs
node .claude/skills/helm-charts-audit/scripts/check_rbac_wildcards.mjs
node .claude/skills/helm-charts-audit/scripts/check_health_probes.mjs
node .claude/skills/helm-charts-audit/scripts/check_helm_lint.mjs
node .claude/skills/helm-charts-audit/scripts/check_chart_metadata.mjs
node .claude/skills/helm-charts-audit/scripts/check_chart_structure.mjs
node .claude/skills/helm-charts-audit/scripts/check_dependencies.mjs
node .claude/skills/helm-charts-audit/scripts/check_deprecated_apis.mjs
node .claude/skills/helm-charts-audit/scripts/check_argo_rollouts.mjs
node .claude/skills/helm-charts-audit/scripts/check_ingress_tls.mjs
node .claude/skills/helm-charts-audit/scripts/check_gpu_resources.mjs
Quality Rules
1. Image Tags (HIGH)
RULE: Never use mutable tags.
latest tag = unpredictable deployments + rollback failures.
Violations:
- image: nginx:latest - mutable, changes without notice
- image: nginx - defaults to :latest
- tag: "" - empty tag in values.yaml
- tag: dev, tag: head, tag: canary - mutable branch tags
Fix: Use immutable tags like v1.2.3, SHA digests sha256:abc123, or SemVer 1.21.0.
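The rule above could be approximated with a small classifier over image references. This is a minimal sketch, not the logic of `check_image_tags.mjs`: the function name `mutableImageReason` and the exact list of mutable tags are assumptions drawn from the violations listed above.

```javascript
// Tags treated as mutable in this sketch (taken from the violations list above).
const MUTABLE_TAGS = new Set(["latest", "dev", "head", "canary"]);

// Returns a reason string if the image reference is mutable, otherwise null.
function mutableImageReason(image) {
  // A full digest pin (name@sha256:<64 hex chars>) is immutable.
  if (/@sha256:[a-f0-9]{64}$/i.test(image)) return null;
  // Only look for a tag after the last "/", so a registry port such as
  // localhost:5000/app is not mistaken for a tag.
  const lastSegment = image.slice(image.lastIndexOf("/") + 1);
  const colon = lastSegment.indexOf(":");
  if (colon === -1) return "no tag, defaults to :latest";
  const tag = lastSegment.slice(colon + 1);
  if (tag === "") return "empty tag";
  if (MUTABLE_TAGS.has(tag.toLowerCase())) return `mutable tag "${tag}"`;
  return null;
}
```

For example, `mutableImageReason("nginx")` reports the implicit `:latest`, while `mutableImageReason("nginx:1.21.0")` returns `null`.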
2. Security Context (HIGH)
RULE: Containers must run with minimal privileges. Privileged containers = cluster takeover risk.
Violations:
- privileged: true - full host access, container escape trivial
- runAsNonRoot: false - runs as root user UID 0
- runAsUser: 0 - explicitly root
- allowPrivilegeEscalation: true - can gain more privileges
- hostNetwork: true - shares host network namespace
- hostPID: true - can see/kill host processes
- hostIPC: true - can access host shared memory
- readOnlyRootFilesystem: false - malware can write anywhere
- capabilities.add: [SYS_ADMIN] - near-root level access
- capabilities.add: [ALL] - equivalent to privileged
Fix: Add proper securityContext with runAsNonRoot: true, allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, capabilities.drop: [ALL].
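A value-based scan for several of the settings above might look like the following. It is an illustrative sketch under stated assumptions: the names `RISKY_SETTINGS` and `findSecurityContextIssues` are hypothetical, only single-line `key: value` settings are covered (the capabilities checks are left out), and the actual script may work differently.

```javascript
// Each entry pairs a regex for a risky single-line setting with a short reason.
// The regexes carry no "g" flag, so test() keeps no state between calls.
const RISKY_SETTINGS = [
  [/^[ \t]*privileged:[ \t]*true\b/m, "privileged container"],
  [/^[ \t]*runAsNonRoot:[ \t]*false\b/m, "runAsNonRoot disabled"],
  [/^[ \t]*runAsUser:[ \t]*0[ \t]*$/m, "runs explicitly as UID 0"],
  [/^[ \t]*allowPrivilegeEscalation:[ \t]*true\b/m, "privilege escalation allowed"],
  [/^[ \t]*host(?:Network|PID|IPC):[ \t]*true\b/m, "shares a host namespace"],
  [/^[ \t]*readOnlyRootFilesystem:[ \t]*false\b/m, "writable root filesystem"],
];

// Returns the reasons for every risky setting found in the given manifest text.
function findSecurityContextIssues(yamlText) {
  return RISKY_SETTINGS
    .filter(([pattern]) => pattern.test(yamlText))
    .map(([, reason]) => reason);
}
```

Matching on text rather than on a parsed document keeps the check usable on templates that still contain Go template directives.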
3. Resource Limits (HIGH)
RULE: All containers must have resource requests and limits. No limits = node OOM + noisy neighbor issues.
Violations:
- resources: {} - empty resources block
- Missing requests.cpu - scheduler can't make decisions
- Missing requests.memory - OOM killer may terminate unexpectedly
- Missing limits.memory - container can consume all node memory
- requests > limits - invalid configuration
Fix: Define resources.requests.cpu, resources.requests.memory, resources.limits.memory. Note: CPU limits are often intentionally omitted for better performance.
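Comparing requests against limits requires converting Kubernetes quantities to a common unit first. The sketch below assumes the resources block has already been extracted into a plain object. The names `parseMemory` and `findResourceIssues` are hypothetical, and only a subset of memory suffixes is handled.

```javascript
// Multipliers for a subset of Kubernetes memory suffixes (binary and decimal).
const MEMORY_UNITS = {
  Ki: 2 ** 10, Mi: 2 ** 20, Gi: 2 ** 30, Ti: 2 ** 40,
  k: 1e3, M: 1e6, G: 1e9, T: 1e12,
};

// Converts a quantity such as "512Mi" or "1G" to bytes.
function parseMemory(quantity) {
  const match = /^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|k|M|G|T)?$/.exec(String(quantity).trim());
  if (!match) throw new Error(`unsupported memory quantity: ${quantity}`);
  return Number(match[1]) * (match[2] ? MEMORY_UNITS[match[2]] : 1);
}

// Reports missing fields and a memory request that exceeds its limit.
function findResourceIssues(resources = {}) {
  const requests = resources.requests || {};
  const limits = resources.limits || {};
  const issues = [];
  if (!requests.cpu) issues.push("missing requests.cpu");
  if (!requests.memory) issues.push("missing requests.memory");
  if (!limits.memory) issues.push("missing limits.memory");
  if (requests.memory && limits.memory &&
      parseMemory(requests.memory) > parseMemory(limits.memory)) {
    issues.push("requests.memory exceeds limits.memory");
  }
  return issues;
}
```

A missing CPU limit is deliberately not reported, in line with the note above.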
4. RBAC Wildcards (HIGH)
RULE: Follow least-privilege principle. Wildcard permissions = privilege escalation path.
Violations:
- verbs: ["*"] - grants all actions
- resources: ["*"] - access to all resource types
- apiGroups: ["*"] - access across all API groups
- roleRef.name: cluster-admin - full cluster access
- verbs: [impersonate] - can act as other users
- verbs: [escalate, bind] - can grant additional privileges
- Access to secrets resource - can read all secrets
Fix: Use explicit verbs like [get, list, watch], explicit resources like [pods, services], avoid cluster-admin bindings.
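The wildcard and sensitive-verb checks can be expressed over a single RBAC rule. This sketch assumes the rule is already available as an object with `apiGroups`, `resources`, and `verbs` arrays. The function name `findRbacIssues` is hypothetical, and the `cluster-admin` binding check is not covered.

```javascript
// Inspects one RBAC rule ({ apiGroups, resources, verbs }) for risky grants.
function findRbacIssues(rule) {
  const issues = [];
  const has = (field, value) => (rule[field] || []).includes(value);

  for (const field of ["verbs", "resources", "apiGroups"]) {
    if (has(field, "*")) issues.push(`wildcard in ${field}`);
  }
  for (const verb of ["impersonate", "escalate", "bind"]) {
    if (has("verbs", verb)) issues.push(`sensitive verb "${verb}"`);
  }
  if (has("resources", "secrets")) issues.push("grants access to secrets");
  return issues;
}
```

A rule limited to `[get, list, watch]` on `[pods, services]` yields an empty list.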
5. Health Probes (HIGH)
RULE: All workloads must have health probes. No probes = stuck containers not restarted + traffic to unready pods.
Violations:
- Deployment without livenessProbe - stuck containers won't restart
- Deployment without readinessProbe - traffic sent to unready pods
- initialDelaySeconds: 0 - probes start immediately, false failures
- timeoutSeconds: 1 - too short, may cause false failures
- successThreshold > 1 on livenessProbe - should always be 1
- failureThreshold > 10 - delays detecting actual failures
Fix: Add livenessProbe and readinessProbe with reasonable initialDelaySeconds (10-30s), periodSeconds (10s), timeoutSeconds (5s).
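The probe thresholds above translate directly into a per-container check. This is a sketch only, assuming the container spec is available as a plain object. `findProbeIssues` is a hypothetical name, and only explicitly set values are inspected (omitted fields are not treated as violations here).

```javascript
// Checks one container spec for missing or poorly tuned probes.
function findProbeIssues(container) {
  const issues = [];
  for (const kind of ["livenessProbe", "readinessProbe"]) {
    const probe = container[kind];
    if (!probe) {
      issues.push(`missing ${kind}`);
      continue;
    }
    if (probe.initialDelaySeconds === 0) issues.push(`${kind}: initialDelaySeconds is 0`);
    if (probe.timeoutSeconds === 1) issues.push(`${kind}: timeoutSeconds is 1`);
    if (probe.failureThreshold > 10) issues.push(`${kind}: failureThreshold above 10`);
    // successThreshold is only constrained on liveness probes.
    if (kind === "livenessProbe" && probe.successThreshold > 1) {
      issues.push("livenessProbe: successThreshold must be 1");
    }
  }
  return issues;
}
```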
6. Helm Lint (HIGH)
RULE: Charts must pass official helm lint validation. Lint failures = deployment failures.
Violations:
- Template syntax errors
- Missing required fields in Chart.yaml
- Invalid YAML structure
- Broken template references
Fix: Run helm lint <chart-path> and fix reported issues.
7. Chart Metadata (MEDIUM)
RULE: Chart.yaml must have complete metadata. Missing metadata = maintenance nightmare.
Violations:
- apiVersion: v1 - Helm 2 format, upgrade to v2
- Missing or invalid version - must be SemVer
- Missing appVersion - hard to track what's deployed
- Missing description - unclear what chart does
- Missing maintainers - no ownership
- name doesn't match directory name - confusing
Fix: Use apiVersion: v2, SemVer version, add description and maintainers with email.
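The metadata rules can be sketched as a validation over the fields of Chart.yaml. The regular expression is the one suggested by the Semantic Versioning 2.0.0 specification. The function name `findChartMetadataIssues` and the assumption that Chart.yaml is already available as an object are illustrative, not taken from the skill.

```javascript
// SemVer 2.0.0 pattern as suggested by the semver.org specification.
const SEMVER = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/;

// chart: fields of Chart.yaml as an object; dirName: the chart's directory name.
function findChartMetadataIssues(chart, dirName) {
  const issues = [];
  if (chart.apiVersion !== "v2") issues.push("apiVersion should be v2");
  if (!SEMVER.test(chart.version || "")) issues.push("version is missing or not SemVer");
  for (const field of ["appVersion", "description", "maintainers"]) {
    // Covers undefined, empty strings, and empty arrays.
    if (!chart[field] || chart[field].length === 0) issues.push(`missing ${field}`);
  }
  if (chart.name !== dirName) issues.push("name does not match directory name");
  return issues;
}
```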
8. Chart Structure (MEDIUM)
RULE: Follow standard Helm chart structure. Non-standard = user confusion + missing features.
Violations:
- Missing README.md - no documentation
- Missing templates/NOTES.txt - no post-install instructions
- Missing templates/_helpers.tpl - no template helpers
- Missing .helmignore - unnecessary files in package
- Missing values.schema.json - no values validation
- Empty templates/ directory
Fix: Create missing files following Helm chart best practices.
9. Dependencies (MEDIUM)
RULE: Pin dependency versions. Floating versions = non-reproducible builds.
Violations:
- No version on dependency - unpinned
- version: "*" or version: "^1.0" - floating version
- Missing Chart.lock - dependency versions not locked
- repository: file:// - local reference, breaks when published
- repository: http:// - insecure, use HTTPS
- Deprecated repository URLs (charts.helm.sh/stable)
Fix: Pin exact versions, run helm dependency update to generate Chart.lock.
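A check over one entry of the `dependencies` list could look like this. It is a sketch under assumptions: `findDependencyIssues` is a hypothetical name, the entry is assumed to be an object with `version` and `repository` fields, and "exact" is taken to mean a plain `MAJOR.MINOR.PATCH` version with an optional suffix.

```javascript
// Accepts only exact versions such as 1.2.3 or 1.2.3-rc.1; ranges like ^1.0 or * fail.
const EXACT_VERSION = /^v?\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.+-]+)?$/;

// dep: one entry from the dependencies list in Chart.yaml.
function findDependencyIssues(dep) {
  const issues = [];
  if (!dep.version) {
    issues.push("no version pinned");
  } else if (!EXACT_VERSION.test(dep.version)) {
    issues.push(`floating version "${dep.version}"`);
  }
  const repo = dep.repository || "";
  if (repo.startsWith("file://")) issues.push("local file:// repository");
  else if (repo.startsWith("http://")) issues.push("insecure http:// repository");
  if (repo.includes("charts.helm.sh/stable")) issues.push("deprecated stable repository");
  return issues;
}
```

Whether Chart.lock exists is a file-system check and is left out of this sketch.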
10. Deprecated APIs (MEDIUM)
RULE: Use stable Kubernetes APIs. Deprecated APIs = upgrade failures.
Violations:
- extensions/v1beta1 - removed in K8s 1.22
- apps/v1beta1, apps/v1beta2 - removed in K8s 1.16
- networking.k8s.io/v1beta1 Ingress - removed in K8s 1.22
- batch/v1beta1 CronJob - removed in K8s 1.25
- policy/v1beta1 PodSecurityPolicy - removed in K8s 1.25
Fix: Update to stable APIs: apps/v1, networking.k8s.io/v1, batch/v1. Run kubectl convert if needed.
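The list above maps naturally onto a lookup table keyed by `apiVersion`. The sketch below scans raw manifest text with a regular expression. The names `REMOVED_APIS` and `findDeprecatedApis` are hypothetical, the removal versions are the ones stated in the list, and templated `apiVersion` values are not handled.

```javascript
// apiVersion -> Kubernetes release in which it was removed (from the list above).
const REMOVED_APIS = new Map([
  ["extensions/v1beta1", "1.22"],
  ["apps/v1beta1", "1.16"],
  ["apps/v1beta2", "1.16"],
  ["networking.k8s.io/v1beta1", "1.22"],
  ["batch/v1beta1", "1.25"],
  ["policy/v1beta1", "1.25"],
]);

// Scans manifest text for literal apiVersion lines that use a removed API.
function findDeprecatedApis(text) {
  const found = [];
  const pattern = /^[ \t]*apiVersion:[ \t]*["']?([\w./-]+)["']?[ \t]*$/gm;
  for (const match of text.matchAll(pattern)) {
    const removedIn = REMOVED_APIS.get(match[1]);
    if (removedIn) found.push({ apiVersion: match[1], removedIn });
  }
  return found;
}
```

A `Map` is used instead of a plain object so that unexpected values are never resolved against inherited object properties.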
11. Argo Rollouts (MEDIUM)
RULE: Rollouts must have valid strategy configuration. Invalid config = failed deployments.
Violations:
- Rollout without strategy - no deployment strategy
- Canary without steps - no gradual rollout
- Canary without analysis - no automated validation
- BlueGreen without activeService - no active service defined
- BlueGreen without previewService - can't preview before promotion
- Missing revisionHistoryLimit - old ReplicaSets accumulate
- Missing progressDeadlineSeconds - stuck rollouts don't time out
Fix: Configure proper canary steps with analysis, or blueGreen with activeService/previewService.
12. Ingress TLS (MEDIUM)
RULE: Ingress must have TLS configuration. No TLS = unencrypted traffic.
Violations:
- Ingress with hosts but no TLS - traffic unencrypted
- TLS without secretName - certificate source unclear
- No ingressClassName - may use wrong controller
- Missing cert-manager annotations - no automated certificates
- Deprecated kubernetes.io/ingress.class annotation
- No SSL redirect annotation - HTTP doesn't redirect to HTTPS
Fix: Add TLS section with secretName, use cert-manager.io/cluster-issuer annotation for automated certs.
13. GPU Resources (LOW)
RULE: GPU workloads need proper configuration. Missing config = scheduling failures.
Violations:
- GPU limits without matching requests - should be equal
- No GPU toleration - won't schedule on GPU nodes
- No GPU nodeSelector/affinity - relies only on resource availability
- No runtimeClassName - may need nvidia runtime
Fix: Set nvidia.com/gpu in both requests and limits (equal values), add GPU tolerations and nodeSelector.
Detection Philosophy
This skill uses VALUE-BASED detection:
- Detects issues by actual values and patterns, not by variable/field names
- Future-proof: new charts with issues are automatically detected
- No need to update scripts when new charts are added
Parsing Strategy
- Chart.yaml, values.yaml: YAML content parsed via regex patterns
- templates/*.yaml: Regex-based parsing (Go template syntax breaks YAML parsers)
- Multi-document YAML: Handles --- separators
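The regex-based approach to templates can be illustrated by splitting a multi-document file on its `---` separators and reading `kind:` with a pattern instead of a YAML parser. This is a sketch with hypothetical helper names (`splitYamlDocuments`, `kindOf`), not the skill's actual parser. It only recognizes separators that stand alone on a line.

```javascript
// Splits multi-document YAML on lines consisting solely of "---".
function splitYamlDocuments(text) {
  return text
    .split(/^---[ \t]*$/m)
    .map((doc) => doc.trim())
    .filter((doc) => doc.length > 0);
}

// Reads the top-level kind with a regex, so Go template directives
// elsewhere in the document do not need to be valid YAML.
function kindOf(doc) {
  const match = /^kind:[ \t]*(\w+)/m.exec(doc);
  return match ? match[1] : null;
}

const template = [
  "apiVersion: apps/v1",
  "kind: Deployment",
  "metadata:",
  "  name: {{ .Release.Name }}",
  "---",
  "{{- if .Values.service.enabled }}",
  "apiVersion: v1",
  "kind: Service",
  "{{- end }}",
].join("\n");

const kinds = splitYamlDocuments(template).map(kindOf);
console.log(kinds); // [ 'Deployment', 'Service' ]
```

The `{{- if ... }}` and `{{- end }}` lines would make a strict YAML parser fail, while the line-anchored pattern simply skips them.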
Safety
- Read-only operation (except report generation)
- No Helm releases modified
- No cluster changes