Agent-almanac enforce-policy-as-code
git clone https://github.com/pjt222/agent-almanac
T=$(mktemp -d) && git clone --depth=1 https://github.com/pjt222/agent-almanac "$T" && mkdir -p ~/.claude/skills && cp -r "$T/i18n/caveman-ultra/skills/enforce-policy-as-code" ~/.claude/skills/pjt222-agent-almanac-enforce-policy-as-code-5c0824 && rm -rf "$T"
i18n/caveman-ultra/skills/enforce-policy-as-code/SKILL.mdEnforce Policy as Code
Implement declarative policy enforcement using OPA Gatekeeper or Kyverno for Kubernetes resource validation and mutation.
When to Use
- Enforce organizational standards for resource configuration (labels, annotations, limits)
- Prevent security misconfigurations (privileged containers, host namespaces, insecure images)
- Ensure compliance requirements are met before resources deployed
- Standardize resource naming conventions and metadata
- Implement automated remediation through mutation policies
- Audit existing cluster resources against policies without blocking
- Integrate policy validation into CI/CD pipelines for shift-left approach
Inputs
- Required: Kubernetes cluster with admin access
- Required: Choice of policy engine (OPA Gatekeeper or Kyverno)
- Required: List of policies to enforce (security, compliance, operational)
- Optional: Existing resources to audit
- Optional: Exemption/exclusion patterns for specific namespaces or resources
- Optional: CI/CD pipeline configuration for pre-deployment validation
Procedure
See Extended Examples for complete configuration files and templates.
Step 1: Install Policy Engine
Deploy OPA Gatekeeper or Kyverno as admission controller.
For OPA Gatekeeper:
# Install Gatekeeper using Helm helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts helm repo update # Install with audit enabled helm install gatekeeper gatekeeper/gatekeeper \ --namespace gatekeeper-system \ --create-namespace \ --set audit.replicas=2 \ --set replicas=3 \ --set validatingWebhookFailurePolicy=Fail \ --set auditInterval=60 # Verify installation kubectl get pods -n gatekeeper-system kubectl get crd | grep gatekeeper # Check webhook configuration kubectl get validatingwebhookconfigurations gatekeeper-validating-webhook-configuration -o yaml
For Kyverno:
# Install Kyverno using Helm helm repo add kyverno https://kyverno.github.io/kyverno/ helm repo update # Install with HA setup helm install kyverno kyverno/kyverno \ --namespace kyverno \ --create-namespace \ --set replicaCount=3 \ --set admissionController.replicas=3 \ --set backgroundController.replicas=2 \ --set cleanupController.replicas=2 # Verify installation kubectl get pods -n kyverno kubectl get crd | grep kyverno # Check webhook configurations kubectl get validatingwebhookconfigurations kyverno-resource-validating-webhook-cfg kubectl get mutatingwebhookconfigurations kyverno-resource-mutating-webhook-cfg
Create namespace exclusions:
# gatekeeper-config.yaml apiVersion: config.gatekeeper.sh/v1alpha1 kind: Config metadata: name: config namespace: gatekeeper-system spec: match: - excludedNamespaces: - kube-system - kube-public - kube-node-lease - gatekeeper-system processes: - audit - webhook validation: traces: - user: system:serviceaccount:gatekeeper-system:gatekeeper-admin kind: group: "" version: v1 kind: Namespace
Expected: Policy engine pods running with multiple replicas. CRDs installed (ConstraintTemplate, Constraint for Gatekeeper; ClusterPolicy, Policy for Kyverno). Validating/mutating webhooks active. Audit controller running.
On failure:
- Check pod logs:
kubectl logs -n gatekeeper-system -l app=gatekeeper --tail=50 - Verify webhook endpoints reachable:
kubectl get endpoints -n gatekeeper-system - Check for port conflicts or certificate issues in webhook logs
- Ensure cluster has sufficient resources (policy engines need ~500MB per replica)
- Review RBAC permissions:
kubectl auth can-i create constrainttemplates --as=system:serviceaccount:gatekeeper-system:gatekeeper-admin
Step 2: Define Constraint Templates and Policies
Create reusable policy templates and specific constraints.
OPA Gatekeeper Constraint Template:
# required-labels-template.yaml apiVersion: templates.gatekeeper.sh/v1 kind: ConstraintTemplate metadata: name: k8srequiredlabels annotations: # ... (see EXAMPLES.md for complete configuration)
Kyverno ClusterPolicy:
# kyverno-policies.yaml apiVersion: kyverno.io/v1 kind: ClusterPolicy metadata: name: require-labels annotations: # ... (see EXAMPLES.md for complete configuration)
Apply policies:
# Apply Gatekeeper templates and constraints kubectl apply -f required-labels-template.yaml # Apply Kyverno policies kubectl apply -f kyverno-policies.yaml # Verify constraint/policy status kubectl get constraints kubectl get clusterpolicies # Check for any policy errors kubectl describe k8srequiredlabels require-app-labels kubectl describe clusterpolicy require-labels
Expected: ConstraintTemplates/ClusterPolicies created successfully. Constraints show status "True" for enforcement. No errors in policy definitions. Webhook begins evaluating new resources against policies.
On failure:
- Validate Rego syntax (Gatekeeper): Use
locally or check constraint statusopa test - Check policy YAML syntax:
kubectl apply --dry-run=client -f policy.yaml - Review constraint status:
kubectl get constraint -o yaml | grep -A 10 status - Test with simple policy first, then add complexity
- Verify match criteria (kinds, namespaces) are correct
Step 3: Test Policy Enforcement
Validate policies block non-compliant resources and allow compliant ones.
Create test manifests:
# test-non-compliant.yaml apiVersion: apps/v1 kind: Deployment metadata: name: test-no-labels namespace: production # ... (see EXAMPLES.md for complete configuration)
Test policies:
# Attempt to create non-compliant resource (should fail) kubectl apply -f test-non-compliant.yaml # Expected: Error with policy violation message # Create compliant resource (should succeed) kubectl apply -f test-compliant.yaml # Expected: deployment.apps/test-compliant created # Test with dry-run for validation kubectl apply -f test-non-compliant.yaml --dry-run=server # Shows policy violations without actually creating resource # Clean up kubectl delete -f test-compliant.yaml
Test with policy reporting (Kyverno):
# Check policy reports kubectl get policyreports -A kubectl get clusterpolicyreports # View detailed report kubectl get policyreport -n production -o yaml # Check policy rule results kubectl get policyreport -n production -o jsonpath='{.items[0].results}' | jq .
Expected: Non-compliant resources rejected with clear violation messages. Compliant resources created successfully. Policy reports show pass/fail results. Dry-run validation works without creating resources.
On failure:
- Check if policy is in audit mode instead of enforce:
validationFailureAction: audit - Verify webhook is processing requests:
kubectl logs -n gatekeeper-system -l app=gatekeeper - Check for namespace exclusions that might exempt test namespace
- Test webhook connectivity:
kubectl run test --rm -it --image=busybox --restart=Never - Review webhook failure policy (Ignore vs Fail)
Step 4: Implement Mutation Policies
Configure automatic remediation through mutation.
Gatekeeper mutation:
# gatekeeper-mutations.yaml apiVersion: mutations.gatekeeper.sh/v1beta1 kind: Assign metadata: name: add-default-labels spec: # ... (see EXAMPLES.md for complete configuration)
Kyverno mutation policies:
# kyverno-mutations.yaml apiVersion: kyverno.io/v1 kind: ClusterPolicy metadata: name: add-default-labels spec: # ... (see EXAMPLES.md for complete configuration)
Apply and test mutations:
# Apply mutation policies kubectl apply -f gatekeeper-mutations.yaml # OR kubectl apply -f kyverno-mutations.yaml # Test mutation with a deployment # ... (see EXAMPLES.md for complete configuration)
Expected: Mutations automatically add labels, resources, or modify images. Deployed resources show mutated values. Mutations logged in policy engine logs. No errors during mutation application.
On failure:
- Check mutation webhook is enabled:
kubectl get mutatingwebhookconfiguration - Verify mutation policy syntax: especially JSON paths and conditions
- Review logs:
kubectl logs -n kyverno deploy/kyverno-admission-controller - Test mutations don't conflict (multiple mutations on same field)
- Ensure mutation applied before validation (order matters)
Step 5: Enable Audit Mode and Reporting
Configure audit to identify violations in existing resources without blocking.
Gatekeeper audit:
# Audit runs automatically based on auditInterval setting # Check audit results kubectl get constraints -o json | \ jq '.items[] | {name: .metadata.name, violations: .status.totalViolations}' # Get detailed violation information # ... (see EXAMPLES.md for complete configuration)
Kyverno audit and reporting:
# Generate policy reports for existing resources kubectl create job --from=cronjob/kyverno-cleanup-controller -n kyverno manual-report-gen # View policy reports kubectl get policyreport -A kubectl get clusterpolicyreport # ... (see EXAMPLES.md for complete configuration)
Create dashboard for policy compliance:
# prometheus-rules.yaml apiVersion: monitoring.coreos.com/v1 kind: PrometheusRule metadata: name: policy-alerts namespace: monitoring # ... (see EXAMPLES.md for complete configuration)
Expected: Audit identifies violations in existing resources without blocking deployments. Policy reports generated with pass/fail counts. Violations exportable for review. Metrics exposed for monitoring. Alerts fire on increasing violations.
On failure:
- Verify audit controller running:
kubectl get pods -n gatekeeper-system -l gatekeeper.sh/operation=audit - Check audit interval setting in installation
- Review audit logs for errors:
kubectl logs -n gatekeeper-system -l gatekeeper.sh/operation=audit - Ensure RBAC permissions allow reading all resource types for audit
- Verify CRD status field being populated:
kubectl get constraint -o yaml | grep -A 20 status
Step 6: Integrate with CI/CD Pipeline
Add pre-deployment policy validation to shift-left policy enforcement.
CI/CD integration script:
#!/bin/bash # validate-policies.sh set -e echo "=== Policy Validation for CI/CD ===" # ... (see EXAMPLES.md for complete configuration)
GitHub Actions workflow:
# .github/workflows/policy-validation.yaml name: Policy Validation locale: caveman-ultra source_locale: en source_commit: 82c77053 translator: "Julius Brussee homage — caveman" translation_date: "2026-04-19" on: pull_request: paths: # ... (see EXAMPLES.md for complete configuration)
Pre-commit hook:
#!/bin/bash # .git/hooks/pre-commit # Validate Kubernetes manifests against policies if git diff --cached --name-only | grep -E 'manifests/.*\.yaml$'; then echo "Validating Kubernetes manifests against policies..." # ... (see EXAMPLES.md for complete configuration)
Expected: CI/CD pipeline validates manifests before deployment. Policy violations fail pipeline with clear messages. Policy reports attached to PR. Pre-commit hooks catch violations early. Developers notified of policy issues before reaching cluster.
On failure:
- Verify CLI tools installed and in PATH
- Check kubeconfig credentials valid for fetching policies
- Test policy validation locally first:
kyverno apply policy.yaml --resource manifest.yaml - Ensure policies synced from cluster are complete
- Review policy CLI logs for specific validation errors
Validation
- Policy engine pods running with HA configuration
- Validating and mutating webhooks active and reachable
- Constraint templates and policies created without errors
- Non-compliant resources rejected with clear violation messages
- Compliant resources deploy successfully
- Mutation policies automatically remediate resources
- Audit mode identifies violations in existing resources
- Policy reports generated and accessible
- Metrics exposed for policy compliance monitoring
- CI/CD pipeline validates manifests pre-deployment
- Pre-commit hooks prevent policy violations
- Namespace exclusions configured appropriately
Common Pitfalls
-
Webhook Failure Policy:
blocks all resources if webhook unavailable. UsefailurePolicy: Fail
for non-critical policies, but understand security implications. Test webhook availability before enforcing.Ignore -
Too Restrictive Initial Policies: Starting with enforcement mode on strict policies breaks existing workloads. Begin with audit mode, review violations, communicate with teams, then enforce gradually.
-
Missing Resource Specifications: Policies must specify API groups, versions, and kinds correctly. Use
to find exact values. Wildcards (kubectl api-resources
) convenient but can cause performance issues.* -
Mutation Order: Mutations applied before validations. Ensure mutations don't conflict and that validations account for mutated values. Test mutation+validation together.
-
Namespace Exclusions: Excluding system namespaces necessary, but be careful not to over-exclude. Review exclusions regularly as policies mature.
-
Rego Complexity (Gatekeeper): Complex Rego policies difficult to debug. Start simple, test with
locally, add logging withopa test
, use gator for offline testing.trace() -
Performance Impact: Policy evaluation adds latency to admission. Keep policies efficient, use appropriate matching criteria, monitor webhook latency metrics.
-
Policy Conflicts: Multiple policies modifying same field cause issues. Coordinate policies across teams, use policy libraries for common patterns, test combinations.
-
Background Scanning: Background audit scans entire cluster. Can be resource-intensive in large clusters. Adjust audit interval based on cluster size and policy count.
-
Version Compatibility: Policy CRD versions change. Gatekeeper v3 uses
constraints, Kyverno v1.11 usesv1beta1
. Check docs for your version.kyverno.io/v1
Related Skills
- Secret validation policiesmanage-kubernetes-secrets
- Complementary security scanningsecurity-audit-codebase
- Application deployment with policy validationdeploy-to-kubernetes
- Service mesh authorization policies complement admission policiessetup-service-mesh
- Gateway policies work alongside admission policiesconfigure-api-gateway
- GitOps with policy validation in pipelineimplement-gitops-workflow