Awesome-omni-skill kagenti:deploy
Deploy or redeploy the Kagenti Kind cluster using the Python installer - quick redeploy, manual steps, and troubleshooting
install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/devops/kagenti-deploy-kagenti" ~/.claude/skills/diegosouzapw-awesome-omni-skill-kagenti-deploy && rm -rf "$T"
manifest: skills/devops/kagenti-deploy-kagenti/SKILL.md
Deploy Cluster Skill
This skill guides you through deploying or redeploying the Kagenti Kind cluster using the Python installer.
Context-Safe Execution (MANDATORY)
Deploy scripts produce hundreds of lines. Always redirect to files:
export LOG_DIR="/tmp/kagenti/deploy/$(basename "$(git rev-parse --show-toplevel)")"
mkdir -p "$LOG_DIR"

# Pattern: redirect deploy output
./.github/scripts/local-setup/kind-full-test.sh ... > "$LOG_DIR/deploy.log" 2>&1; echo "EXIT:$?"

# On failure: Task(subagent_type='Explore') with Grep to find errors
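The redirect pattern above can be wrapped in a small helper so every deploy step is logged the same way. This is a minimal sketch, not part of the Kagenti repository: `run_logged` is a hypothetical name, and the 20-line tail is an arbitrary choice.

```shell
# run_logged LOGFILE CMD [ARGS...]
# Hypothetical helper: runs CMD with stdout and stderr redirected to LOGFILE,
# then prints only the exit status and, on failure, the last few log lines.
run_logged() {
  log_file=$1
  shift
  mkdir -p "$(dirname "$log_file")"
  status=0
  "$@" > "$log_file" 2>&1 || status=$?
  echo "EXIT:$status"
  if [ "$status" -ne 0 ]; then
    # Show a short tail instead of the full log to keep output small.
    tail -n 20 "$log_file"
  fi
  return "$status"
}
```

A call might look like `run_logged "$LOG_DIR/deploy.log" uv run kagenti-installer --silent`, with the full log left on disk for later searching.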
When to Use
- Setting up a new local development cluster
- Full cluster redeploy after major changes
- Cluster is corrupted or unstable
- Testing clean deployment
- Running E2E tests locally
Resource Requirements
Minimum (from CLAUDE.md):
- 12GB RAM
- 4 CPU cores
- Docker Desktop, Rancher Desktop, or Podman
Recommended for development:
- 16GB RAM
- 6 CPU cores
- 50GB free disk space
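The minimums above can be checked before starting a deploy. The sketch below assumes Docker as the container engine and uses a hypothetical function name, `meets_minimum`; memory is rounded to the nearest GiB because engines usually report slightly less than the configured allocation, which is a judgment call rather than anything the installer does.

```shell
# Minimums taken from the list above: 4 CPU cores and 12GB RAM.
MIN_CPUS=4
MIN_MEM_GIB=12

# meets_minimum CPUS MEM_BYTES
# Hypothetical helper: succeeds when both values reach the minimums.
# Memory is rounded to the nearest GiB before comparing.
meets_minimum() {
  cpus=$1
  mem_bytes=$2
  gib=$((1024 * 1024 * 1024))
  mem_gib=$(( (mem_bytes + gib / 2) / gib ))
  [ "$cpus" -ge "$MIN_CPUS" ] && [ "$mem_gib" -ge "$MIN_MEM_GIB" ]
}
```

With Docker, the inputs can come from `docker info --format '{{.NCPU}} {{.MemTotal}}'`, passed unquoted so the two numbers become two arguments. Podman exposes this information under different field names.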
Multiple Clusters
You can run multiple Kind clusters:
- agent-platform - Created by kagenti-installer (default)
- kagenti-demo - Your existing cluster
- Each cluster runs independently with its own name
Check existing clusters:
kind get clusters
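To decide between a fresh install and `--use-existing-cluster`, a script can test whether a given cluster name appears in that list. A small sketch, with `cluster_in_list` as a hypothetical name:

```shell
# cluster_in_list NAME
# Hypothetical helper: reads cluster names from stdin (one per line, as
# printed by `kind get clusters`) and succeeds only if NAME matches a whole
# line exactly, so "agent" does not match "agent-platform".
cluster_in_list() {
  grep -Fxq -- "$1"
}
```

Typical use would be `kind get clusters | cluster_in_list agent-platform && echo "cluster exists"`.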
Quick Redeploy (Full Installation)
# 1. Setup environment (first time only)
cp kagenti/installer/app/.env_template kagenti/installer/app/.env
# Edit .env with:
# - GITHUB_USER=<your-github-username>
# - GITHUB_TOKEN=<ghcr.io-token>
# - OPENAI_API_KEY=<openai-key>
# - AGENT_NAMESPACES=team1,team2

# 2. Full redeploy (creates new cluster + installs everything)
cd kagenti/installer
uv run kagenti-installer

# What it does (15-25 minutes):
# ✓ Creates Kind cluster "agent-platform"
# ✓ Installs registry (optional)
# ✓ Installs Tekton Pipelines
# ✓ Installs Cert-Manager
# ✓ Installs Platform Operator
# ✓ Installs Istio Ambient
# ✓ Installs Gateway API
# ✓ Installs SPIRE
# ✓ Installs MCP Gateway
# ✓ Installs Keycloak + PostgreSQL
# ✓ Installs Addons (Prometheus, Kiali, Phoenix)
# ✓ Installs UI
# ✓ Installs ToolHive
# ✓ Creates agent namespaces (team1, team2)
Use Existing Cluster
# Install on already running Kind cluster
cd kagenti/installer
uv run kagenti-installer --use-existing-cluster
Cleanup and Fresh Install
# 1. Delete existing cluster
kind delete cluster --name agent-platform

# 2. Clean Docker images (optional)
docker system prune -a

# 3. Fresh install
cd kagenti/installer
uv run kagenti-installer
Selective Component Installation
Skip components you don't need for faster deployment:
# Minimal install (no UI, no observability, no auth)
cd kagenti/installer
uv run kagenti-installer \
  --skip-install ui \
  --skip-install addons \
  --skip-install keycloak \
  --skip-install spire

# Skip specific components
uv run kagenti-installer \
  --skip-install tekton \
  --skip-install operator \
  --skip-install gateway \
  --skip-install mcp_gateway

# Install only core platform (for testing)
uv run kagenti-installer \
  --skip-install addons \
  --skip-install ui \
  --skip-install keycloak \
  --skip-install agents
Available components to skip:
- registry - Internal container registry
- tekton - Tekton Pipelines (build system)
- cert_manager - Certificate management
- operator - Platform Operator (deprecated, being replaced by kagenti-operator)
- istio - Service mesh
- gateway - Kubernetes Gateway API
- spire - Workload identity
- mcp_gateway - MCP Gateway
- addons - Observability (Prometheus, Kiali, Phoenix)
- ui - Kagenti UI
- keycloak - Authentication
- agents - Demo agents
- metrics_server - Metrics server
- inspector - MCP inspector
- toolhive - ToolHive operator
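Because each skipped component needs its own `--skip-install` flag, a typo in a component name is easy to miss. The sketch below builds the flags from a list of names and rejects anything not in the list above. `skip_args` and `KNOWN_COMPONENTS` are hypothetical names; whether the installer itself rejects unknown names is not stated here.

```shell
# Component names copied from the list above.
KNOWN_COMPONENTS="registry tekton cert_manager operator istio gateway spire mcp_gateway addons ui keycloak agents metrics_server inspector toolhive"

# skip_args COMPONENT...
# Hypothetical helper: prints one "--skip-install NAME" pair per component
# on a single line. Fails with a message on stderr for an unknown name.
skip_args() {
  args=""
  for name in "$@"; do
    # Reject empty names and anything that is not lowercase letters or "_",
    # so a multi-word argument cannot match across list entries.
    case "$name" in
      ""|*[!a-z_]*) echo "unknown component: $name" >&2; return 1 ;;
    esac
    case " $KNOWN_COMPONENTS " in
      *" $name "*) args="$args --skip-install $name" ;;
      *) echo "unknown component: $name" >&2; return 1 ;;
    esac
  done
  # Strip the leading space added by the loop.
  echo "${args# }"
}
```

The output is meant to be word-split, for example `uv run kagenti-installer $(skip_args addons ui keycloak)` with the substitution left unquoted.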
Deploy Weather Agents (Demo)
# After platform is installed
kubectl apply -f kagenti/examples/components/
This creates:
- weather-tool in team1 namespace
- weather-service in team1 namespace
Check Deployment Health
Quick Health Check
# Run the health check script (from CI)
chmod +x .github/scripts/verify_deployment.sh
.github/scripts/verify_deployment.sh

# What it checks:
# ✓ Resource usage (RAM, disk, CPU, containers)
# ✓ Deployment status (weather-tool, weather-service, keycloak, operator)
# ✓ Pod health summary (total, running, pending, failed, crashloop)
# ✓ Failed pod details (events, error logs)
Manual Health Checks
# All pods
kubectl get pods -A

# Failed pods only
kubectl get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded

# Specific namespace
kubectl get pods -n team1
kubectl get pods -n keycloak
kubectl get pods -n kagenti-system

# Deployments
kubectl get deployments -A

# Services
kubectl get svc -A
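The field selector above filters on pod phase, so a pod in CrashLoopBackOff can still be listed as Running. As a complementary sketch, the filter below works on the STATUS column that `kubectl get pods -A --no-headers` prints. `unhealthy_pods` is a hypothetical name, and the approach assumes the default column layout (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# unhealthy_pods
# Hypothetical helper: reads `kubectl get pods -A --no-headers` output on
# stdin and prints "namespace/name status" for every pod whose STATUS column
# (field 4) is neither Running nor Completed.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage would be `kubectl get pods -A --no-headers | unhealthy_pods`. A pod that is Running but not Ready (for example 0/1) is not reported by this filter.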
Run E2E Tests Locally
After platform is deployed:
cd kagenti

# Install test dependencies
uv pip install -r tests/requirements.txt

# Run all deployment health tests
uv run pytest tests/e2e/test_deployment_health.py -v

# Run only critical tests
uv run pytest tests/e2e/test_deployment_health.py -v --only-critical

# Run specific test
uv run pytest tests/e2e/test_deployment_health.py::TestWeatherToolDeployment::test_weather_tool_deployment_ready -v

# Exclude Keycloak tests
uv run pytest tests/e2e/test_deployment_health.py -v --exclude-app=keycloak

# Increase timeout
uv run pytest tests/e2e/test_deployment_health.py -v --app-timeout=600
Run Full CI Workflow Locally
Simulate what runs in CI:
# 1. Install platform
cd kagenti/installer
uv run kagenti-installer --silent

# 2. Deploy weather agents
cd ../..
kubectl apply -f kagenti/examples/components/

# 3. Wait for deployments
kubectl wait --for=condition=available --timeout=300s deployment/weather-tool -n team1
kubectl wait --for=condition=available --timeout=300s deployment/weather-service -n team1

# 4. Run health check
chmod +x .github/scripts/verify_deployment.sh
.github/scripts/verify_deployment.sh

# 5. Run E2E tests
cd kagenti
uv pip install -r tests/requirements.txt
uv run pytest tests/e2e/test_deployment_health.py -v \
  --timeout=300 \
  --tb=short
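The wait step repeats the same `kubectl wait` call per deployment. One way to factor that out is sketched below. `wait_for_deployments` and the `KUBECTL` override are hypothetical conveniences, not something the CI scripts are known to use.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview commands
# without touching a cluster. This override is an assumption of this sketch.
KUBECTL=${KUBECTL:-kubectl}

# wait_for_deployments NAMESPACE TIMEOUT DEPLOYMENT...
# Hypothetical helper: runs `kubectl wait --for=condition=available` for each
# deployment and stops at the first one that is not available in time.
wait_for_deployments() {
  namespace=$1
  timeout=$2
  shift 2
  for deployment in "$@"; do
    $KUBECTL wait --for=condition=available --timeout="$timeout" \
      "deployment/$deployment" -n "$namespace" || return 1
  done
}
```

The equivalent of step 3 would then be `wait_for_deployments team1 300s weather-tool weather-service`.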
Troubleshooting Deployment
Issue: Installer Timeout or Slow
# Check Docker resource allocation
docker info | grep -E "CPUs|Total Memory"

# Increase timeout (images can be slow to pull)
# The installer will retry - just re-run:
cd kagenti/installer
uv run kagenti-installer --use-existing-cluster
Issue: "Error loading config file" or kubectl errors
# Check kubeconfig
kubectl config current-context
# Should show: kind-agent-platform

# If not, set context
kubectl config use-context kind-agent-platform
Issue: Pods stuck in ImagePullBackOff
# Check if images are available in Kind
docker exec agent-platform-control-plane crictl images

# Reload images (for custom builds)
kind load docker-image <image-name> --name agent-platform

# Check pod description for error
kubectl describe pod <pod-name> -n <namespace>
Issue: Keycloak Connection Issues
# Restart Keycloak
kubectl delete -n keycloak -f kagenti/installer/app/resources/keycloak.yaml
kubectl apply -n keycloak -f kagenti/installer/app/resources/keycloak.yaml

# Restart Istio ztunnel
kubectl rollout restart daemonset -n istio-system ztunnel

# Restart Gateway
kubectl rollout restart -n kagenti-system deployment http-istio
Issue: Need to Update Secrets
# Update GitHub token
kubectl -n <namespace> delete secret github-token-secret

# Re-run installer to recreate secrets
cd kagenti/installer
uv run kagenti-installer --use-existing-cluster
Issue: Blank UI on macOS
# Disable Screen Time Content & Privacy Restrictions
# System Settings > Screen Time > Content & Privacy
Issue: GitHub Token Errors
# Ensure token has correct scopes:
# - repo:all
# - write:packages
# - read:packages

# Clear cached credentials
docker logout ghcr.io
Access Platform Services
After deployment, access these services:
# Kagenti UI
open http://kagenti-ui.localtest.me:8080

# Keycloak Admin Console
open http://keycloak.localtest.me:8080
# Username: admin
# Password: (from Keycloak secret)
kubectl get secret -n keycloak keycloak-initial-admin -o jsonpath='{.data.password}' | base64 -d

# Prometheus (if addons installed)
kubectl port-forward -n observability svc/prometheus 9090:9090
open http://localhost:9090

# Grafana (if addons installed)
kubectl port-forward -n observability svc/grafana 3000:3000
open http://localhost:3000

# Kiali (if addons installed)
kubectl port-forward -n kiali svc/kiali 20001:20001
open http://localhost:20001
Platform Configuration
Environment Variables (.env file)
Required in kagenti/installer/app/.env:
# GitHub access for ghcr.io
GITHUB_USER=your-username
# Classic token with repo:all, write:packages, read:packages
GITHUB_TOKEN=ghp_xxx

# OpenAI API (for agents)
OPENAI_API_KEY=sk-xxx

# Agent namespaces
AGENT_NAMESPACES=team1,team2

# Optional: Slack (for Slack tool demo)
SLACK_BOT_TOKEN=xoxb-xxx
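A missing or empty key in .env tends to surface late in a long install, so it can be worth checking the file first. The sketch below treats GITHUB_USER, GITHUB_TOKEN, OPENAI_API_KEY, and AGENT_NAMESPACES as the required keys, following the list above. `missing_env_keys` is a hypothetical name, and the check only looks for plain `KEY=value` lines (no `export` prefix, no placeholder detection).

```shell
# Keys listed as required above.
REQUIRED_ENV_KEYS="GITHUB_USER GITHUB_TOKEN OPENAI_API_KEY AGENT_NAMESPACES"

# missing_env_keys FILE
# Hypothetical helper: prints each required key that has no non-empty
# KEY=value line in FILE. Succeeds only when nothing is missing.
missing_env_keys() {
  env_file=$1
  missing=0
  for key in $REQUIRED_ENV_KEYS; do
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "$key"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}
```

For example, `missing_env_keys kagenti/installer/app/.env || echo "fill in the keys listed above"` would stop a script before the installer starts.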
Cluster Configuration
Edit kagenti/installer/app/config.py:
CLUSTER_NAME = "agent-platform"  # Kind cluster name
DOMAIN_NAME = "localtest.me"     # Domain for services
CONTAINER_ENGINE = "docker"      # or "podman"
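If these values are changed, the kubeconfig context and service URLs used elsewhere in this guide change with them. The sketch below shows how they relate, assuming services stay on host port 8080 as in the URLs under Access Platform Services. `kube_context`, `service_url`, and `HOST_PORT` are hypothetical names, and this is an illustration of the naming pattern, not the installer's own logic.

```shell
# Defaults mirror the values shown above. HOST_PORT is an assumption taken
# from the URLs used elsewhere in this guide.
CLUSTER_NAME=${CLUSTER_NAME:-agent-platform}
DOMAIN_NAME=${DOMAIN_NAME:-localtest.me}
HOST_PORT=${HOST_PORT:-8080}

# kind prefixes the kubeconfig context of every cluster with "kind-".
kube_context() {
  echo "kind-$CLUSTER_NAME"
}

# service_url NAME prints http://NAME.DOMAIN_NAME:HOST_PORT
service_url() {
  echo "http://$1.$DOMAIN_NAME:$HOST_PORT"
}
```

With the defaults, `kube_context` gives the context name used in the kubectl troubleshooting step, and `service_url kagenti-ui` gives the UI address.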
Manual Step-by-Step Deployment (Advanced)
For debugging or understanding the installer:
# 1. Create Kind cluster manually
cat <<EOF | kind create cluster --name agent-platform --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 30080
    hostPort: 8080
  - containerPort: 30443
    hostPort: 9443
EOF

# 2. Set kubeconfig context
kubectl config use-context kind-agent-platform

# 3. Install components one by one
cd kagenti/installer

# Install Tekton
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.66.0/release.yaml

# Install Cert-Manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.2/cert-manager.yaml

# ... (see installer code for full sequence)
Related Skills
- k8s:health: Check comprehensive platform health
- k8s:logs: Query logs for debugging
- k8s:pods: Debug pod issues
Pro Tips
- Use --use-existing-cluster: Faster reinstalls without recreating cluster
- Skip components: Use --skip-install for faster iteration
- Multiple clusters: Use different cluster names for parallel testing
- Resource allocation: Ensure Docker/Podman has enough RAM (16GB recommended)
- Cache images: Pulled images are cached - subsequent installs are faster
- Silent mode: Use --silent to skip interactive prompts
- Check logs: If installer fails, check pod logs in kagenti-system namespace
Common Workflows
Daily Development
# Use existing cluster, skip slow components
cd kagenti/installer
uv run kagenti-installer --use-existing-cluster \
  --skip-install addons \
  --skip-install keycloak
Full Test Before PR
# Fresh cluster, all components, run tests
kind delete cluster --name agent-platform
cd kagenti/installer
uv run kagenti-installer --silent

# Return to the repository root; the paths below are relative to it
cd ../..
kubectl apply -f kagenti/examples/components/
.github/scripts/verify_deployment.sh
cd kagenti && uv run pytest tests/e2e/test_deployment_health.py -v
Quick Agent Testing
# Minimal platform, just enough for agents
cd kagenti/installer
uv run kagenti-installer \
  --skip-install addons \
  --skip-install ui \
  --skip-install keycloak

# Return to the repository root before applying the example manifests
cd ../..
kubectl apply -f kagenti/examples/components/