Awesome-omni-skill flux-troubleshooting
Use when Flux resources show Ready False, reconciliation errors appear in logs, deployments fail to sync from Git, HelmRelease installations fail, source artifacts are not being fetched, or image automation is not updating tags
install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/devops/flux-troubleshooting" ~/.claude/skills/diegosouzapw-awesome-omni-skill-flux-troubleshooting && rm -rf "$T"
manifest:
skills/devops/flux-troubleshooting/SKILL.md
Flux CD Troubleshooting
Diagnose Flux CD reconciliation failures and errors in GitOps environments.
Keywords
flux, fluxcd, troubleshooting, debug, error, failed, failure, reconciliation, kustomization, helmrelease, source, gitrepository, ocirepository, artifact, health check, diagnose, resolve, fix
When to Use This Skill
- Flux resources show `Ready: False` status
- Reconciliation errors appear in logs
- Deployments are not syncing from Git
- HelmRelease installations are failing
- Source artifacts are not being fetched
- Image automation is not updating tags
Related Skills
- flux-operations - Suspend/resume/reconcile operations
- flux-gitops-patterns - Architecture and patterns
- k8s-platform-operations - Incident response procedures
Quick Reference
| Task | Command |
|---|---|
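| Check Flux health | `flux check` |
| View all errors | `flux logs --all-namespaces --level=error` |
| Get all status | `flux get all -A` |
| Warning events | `kubectl get events -n flux-system --field-selector type=Warning` |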
Diagnostic Workflow
Is the resource Ready?
├─ Yes → Check if correct version/revision deployed
└─ No → Is the source Ready?
    ├─ Yes → Check Kustomization/HelmRelease logs
    │   ├─ dry-run failed → Fix YAML syntax in Git
    │   ├─ health check timeout → Check pod logs/resources
    │   └─ dependency not ready → Fix parent first
    └─ No → Check source credentials/connectivity
        ├─ authentication failed → Update deploy key/secret
        ├─ checkout failed → Verify branch/tag exists
        └─ artifact not found → Check repository URL
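The leaves of the tree above can be sketched as a small lookup that maps an error message to the next step. This is only an illustration: `next_step` is a hypothetical helper, and the substrings it matches are the ones named in the tree, so real controller messages may be worded differently.

```shell
# Map an error message to the next step from the decision tree.
# Assumption: the message contains one of the phrases used in the tree;
# actual Flux controller messages may differ in wording.
next_step() {
  case "$1" in
    *"dry-run failed"*)        echo "Fix YAML syntax in Git" ;;
    *"health check timeout"*)  echo "Check pod logs/resources" ;;
    *"dependency not ready"*)  echo "Fix parent first" ;;
    *"authentication failed"*) echo "Update deploy key/secret" ;;
    *"checkout failed"*)       echo "Verify branch/tag exists" ;;
    *"artifact not found"*)    echo "Check repository URL" ;;
    *)                         echo "No match: read the controller logs" ;;
  esac
}
```

For example, `next_step "health check timeout"` prints `Check pod logs/resources`, and any unrecognized message falls through to the last branch.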
Diagnostic Commands
1. Check Controller Health
flux check
2. View Error Logs
flux logs --all-namespaces --level=error
3. Get Warning Events
kubectl get events -n flux-system --field-selector type=Warning
4. Inspect Resource Status
flux get kustomizations -A
flux get helmreleases -A
flux get sources all -A
5. Controller-Specific Logs
kubectl logs -n flux-system deploy/source-controller
kubectl logs -n flux-system deploy/kustomize-controller
kubectl logs -n flux-system deploy/helm-controller
kubectl logs -n flux-system deploy/notification-controller
kubectl logs -n flux-system deploy/image-reflector-controller
kubectl logs -n flux-system deploy/image-automation-controller
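When only one resource is failing, it is usually enough to read the logs of the controller that reconciles that kind. A minimal sketch of that mapping follows; `controller_for_kind` is a hypothetical helper name, and it assumes the default controller deployment names used in the commands above.

```shell
# Print the controller deployment that reconciles a given Flux kind.
# Assumption: controllers run under their default names in flux-system.
controller_for_kind() {
  case "$1" in
    GitRepository|OCIRepository|HelmRepository|HelmChart|Bucket)
      echo source-controller ;;
    Kustomization)
      echo kustomize-controller ;;
    HelmRelease)
      echo helm-controller ;;
    Alert|Provider|Receiver)
      echo notification-controller ;;
    ImageRepository|ImagePolicy)
      echo image-reflector-controller ;;
    ImageUpdateAutomation)
      echo image-automation-controller ;;
    *)
      echo "unknown kind: $1" >&2
      return 1 ;;
  esac
}

# Example (requires cluster access):
#   kubectl logs -n flux-system "deploy/$(controller_for_kind HelmRelease)" --since=10m
```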
Error Pattern Reference
| Error Pattern | Cause | GitOps Solution |
|---|---|---|
| | Git authentication or network | Verify deploy keys/credentials |
| | Invalid manifest YAML | Fix syntax in Git repository |
| | Pods not becoming ready | Check resource limits, images |
| | Parent Kustomization failed | Fix upstream dependency first |
| | Source not synced | Check source status, reconcile |
| | Invalid field value | Correct the value in Git |
| | Registry auth failed | Check imagePullSecrets |
| | OCI tag doesn't exist | Verify tag in registry |
OCI Repository Troubleshooting
Common OCI Issues
flux get sources oci -A
kubectl logs -n flux-system deploy/source-controller | grep -i oci
kubectl get secret -n flux-system flux-system -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
OCI Error Patterns
| Error | Cause | Diagnostic |
|---|---|---|
| | Invalid credentials | Check docker-registry secret exists and is not expired; recommend updating secret |
| | Tag not found | Verify tag exists in registry |
| | Permission denied | Check registry permissions for the service account |
| | Repository not found | Verify repository path in OCIRepository spec |
Image Automation Troubleshooting
Check Image Policies
flux get images policy -A
flux get images repository -A
flux get images update -A
Image Automation Errors
| Error | Cause | Diagnostic |
|---|---|---|
| | Filter too restrictive | Check semver/regex pattern against available tags; recommend adjusting filter |
| | Registry auth | Check image pull secrets configuration |
| | Git write access | Check if deploy key has write permission |
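To check whether a tag filter is too restrictive, one option is to run the pattern against a list of tags locally before changing the policy in Git. The sketch below uses `grep -E`, whose regex dialect is not identical to the one Flux uses, so treat it as a rough first check. The tag names and the pattern are made up for illustration, and `matching_tags` is a hypothetical helper.

```shell
# Read tags on stdin (one per line) and print those matching the pattern.
# Assumption: the pattern uses only syntax that grep -E and Flux share.
matching_tags() {
  grep -E -- "$1"
}

# Hypothetical tags and filter pattern:
printf '%s\n' main-abc123-100 v1.2.3 latest |
  matching_tags '^main-[a-f0-9]+-[0-9]+$'
# prints: main-abc123-100
```

If nothing is printed for tags that should be selected, the filter is the likely cause.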
Webhook/Notification Debugging
kubectl logs -n flux-system deploy/notification-controller
flux get alerts -A
flux get alert-providers -A
kubectl logs -n flux-system deploy/notification-controller | grep -i receiver
Structured Log Analysis
{
  "level": "error",
  "ts": "2024-01-15T09:36:41.286Z",
  "controllerGroup": "kustomize.toolkit.fluxcd.io",
  "controllerKind": "Kustomization",
  "name": "resource-name",
  "namespace": "namespace",
  "msg": "Reconciliation failed after 2s, next try in 5m0s",
  "revision": "main@sha1:abc123",
  "error": "specific error message"
}
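The fields that matter most in an entry like this are `name`, `namespace`, `revision`, and `error`. As a rough sketch, a single field can be pulled out of a one-line log entry with `sed`. `log_field` is a hypothetical helper that assumes the entry is on one line and that the value is a plain string without escaped quotes; a JSON-aware tool such as `jq` is the more robust choice when it is available.

```shell
# Print the value of one string field from a single-line JSON log entry.
# $1 = field name, $2 = the log line.
# Assumption: flat JSON, string values, no escaped quotes in the value.
log_field() {
  printf '%s\n' "$2" |
    sed -n 's/.*"'"$1"'":[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample entry built from the fields shown above:
line='{"level": "error", "name": "resource-name", "namespace": "namespace", "msg": "Reconciliation failed after 2s, next try in 5m0s", "revision": "main@sha1:abc123", "error": "specific error message"}'

log_field name "$line"      # prints: resource-name
log_field revision "$line"  # prints: main@sha1:abc123
log_field error "$line"     # prints: specific error message
```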
GitOps Principles for Troubleshooting
- Never fix directly in cluster - Identify root cause, fix in Git
- Suspend before debugging - Prevent Flux from reverting test changes
- Check the full dependency chain - Infrastructure before apps
- Verify source availability - Git repos and registries must be accessible
- Review recent commits - Issues often correlate with recent changes
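The "suspend before debugging" principle can be sketched as a wrapper that suspends a Kustomization, runs one command, and then resumes reconciliation even if that command fails. `debug_suspended` is a hypothetical helper, and the resource names in the example are placeholders. Anything changed while suspended is a test only; the lasting fix still goes through Git.

```shell
# Suspend a Kustomization, run a command, then resume reconciliation.
# $1 = Kustomization name, $2 = namespace, remaining args = command to run.
debug_suspended() {
  local name="$1" namespace="$2" status
  shift 2
  flux suspend kustomization "$name" -n "$namespace" || return 1
  "$@"
  status=$?
  # Resume regardless of whether the command succeeded.
  flux resume kustomization "$name" -n "$namespace"
  return "$status"
}

# Example (requires cluster access; names are placeholders):
#   debug_suspended apps flux-system kubectl -n apps get pods
```

The wrapper returns the exit status of the wrapped command, so a failed test is still visible to the caller after Flux has been resumed.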
MCP Tools Available
When the Flux MCP server is connected:
- Get Flux installation status: `mcp__flux-operator-mcp__get_flux_instance`
- Get pod logs: `mcp__flux-operator-mcp__get_kubernetes_logs`
- Query resources: `mcp__flux-operator-mcp__get_kubernetes_resources`
- Search Flux documentation: `mcp__flux-operator-mcp__search_flux_docs`