Phase5 kagent-health-analyzer

kagent Health Analyzer Skill

install

source · Clone the upstream repo

git clone https://github.com/SyedaNabila559/phase5

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/SyedaNabila559/phase5 "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/kagent-health-analyzer" ~/.claude/skills/syedanabila559-phase5-kagent-health-analyzer && rm -rf "$T"

manifest: .claude/skills/kagent-health-analyzer/skill.md

source content

Purpose

This skill uses kagent to produce cluster-level insights. It analyzes cluster health, optimizes resource allocation, diagnoses pod failures, and turns kagent output into actionable recommendations.

Capabilities

Execute kagent commands for cluster health analysis
Optimize resource allocation based on cluster usage patterns
Diagnose pod failures and their root causes
Generate comprehensive summaries of kagent output
Provide actionable recommendations for cluster improvements
Interpret kagent analysis results and translate them into practical actions

Implementation Details

Cluster Health Analysis

Analyze overall cluster status and component health
Check control plane component statuses (API server, etcd, scheduler, controller manager)
Evaluate node health and resource utilization
Assess networking and storage subsystems
Identify potential bottlenecks or issues
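
As a cross-check on the node-health item above, independent of kagent, the sketch below filters standard `kubectl get nodes --no-headers` output for nodes that are not plainly `Ready`. The function name `unhealthy_nodes` is hypothetical, and using kubectl alongside kagent is an assumption, not something this skill prescribes.

```shell
# Reads `kubectl get nodes --no-headers` output on stdin and prints the names
# of nodes whose STATUS column is not exactly "Ready".
# Columns are: NAME STATUS ROLES AGE VERSION
# Note: cordoned nodes report "Ready,SchedulingDisabled" and are listed too.
unhealthy_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get nodes --no-headers | unhealthy_nodes
```

An empty result means every node reported `Ready` at the time of the query; it says nothing about control plane or storage health.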

Resource Allocation Optimization

Analyze current resource usage patterns across the cluster
Identify over-provisioned and under-provisioned resources
Recommend adjustments to resource requests and limits
Suggest node pool scaling based on workload demands
Optimize scheduling efficiency
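
To make "over-provisioned" and "under-provisioned" concrete, here is a minimal sketch that compares CPU usage with CPU requests. The input format (`<workload> <used_millicores> <requested_millicores>`), the function name `classify_cpu_requests`, and the 20% and 90% thresholds are all illustrative assumptions; none of them come from kagent.

```shell
# Classifies workloads by how much of their CPU request they actually use.
# Input on stdin, one workload per line (hypothetical format):
#   <workload> <cpu_used_millicores> <cpu_requested_millicores>
# Optional arguments: low threshold (%), high threshold (%).
classify_cpu_requests() {
  low="${1:-20}"    # below this share of the request: over-provisioned
  high="${2:-90}"   # above this share of the request: under-provisioned
  awk -v low="$low" -v high="$high" '
    $3 > 0 {
      pct = 100 * $2 / $3
      label = "ok"
      if (pct < low)  label = "over-provisioned"
      if (pct > high) label = "under-provisioned"
      printf "%s %d%% %s\n", $1, pct, label
    }'
}
```

Lines with a zero or missing request are skipped, since a utilization ratio is undefined for them.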

Pod Failure Diagnostics

Analyze patterns in failed pods and their common causes
Check for resource contention issues
Identify configuration errors leading to failures
Examine node-related issues affecting pods
Review logs and events for failure patterns
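
One way to spot failure patterns before asking kagent for root causes is to count pods by status. The sketch below assumes the standard column layout of `kubectl get pods --all-namespaces --no-headers`; the function name `failing_pod_summary` is hypothetical.

```shell
# Reads `kubectl get pods --all-namespaces --no-headers` output on stdin and
# prints "<count> <status>" for every status other than Running or Completed,
# most frequent first.
# Columns are: NAMESPACE NAME READY STATUS RESTARTS AGE
# Caveat: a pod can be "Running" without being ready; that case is not counted.
failing_pod_summary() {
  awk '$4 != "Running" && $4 != "Completed" { count[$4]++ }
       END { for (s in count) print count[s], s }' |
    sort -k1,1nr -k2,2
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get pods --all-namespaces --no-headers | failing_pod_summary
```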

Output Interpretation

Summarize kagent analysis results in a clear format
Highlight critical issues requiring immediate attention
Prioritize recommendations based on severity
Translate technical findings into actionable steps
Provide context for recommendations

Usage

Common kagent Commands:

Cluster Health Analysis:

kagent "analyze cluster health"

Resource Optimization:

kagent "optimize resource allocation"

Pod Failure Investigation:

kagent "why are pods failing"

Comprehensive Analysis:

kagent "perform complete cluster analysis"

Analysis Workflow:

Step 1: Execute kagent command

analysis_request="analyze cluster health"
kagent_output=$(kagent "$analysis_request")
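
Step 1 fails silently if kagent is missing or the request is empty. A small guard, sketched below, makes both cases explicit. It assumes the prompt-style invocation used throughout this document (`kagent "<request>"`); the function name `run_kagent_analysis` and its exit codes are hypothetical.

```shell
# Runs one kagent analysis request and prints its output on stdout.
# Returns 2 for a missing request and 127 if kagent is not available.
run_kagent_analysis() {
  request="$1"
  if [ -z "$request" ]; then
    echo 'usage: run_kagent_analysis "<analysis request>"' >&2
    return 2
  fi
  if ! command -v kagent >/dev/null 2>&1; then
    echo "kagent not found on PATH" >&2
    return 127
  fi
  kagent "$request"
}

# Usage:
#   kagent_output=$(run_kagent_analysis "analyze cluster health") || exit 1
```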

Step 2: Parse and summarize results

Extract key metrics and findings
Identify critical issues
Group related problems
Determine severity levels
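
Because kagent returns free-form text, any automated severity pass is necessarily heuristic. The following sketch tags each line by keyword as a first approximation of Step 2. The keyword lists and the function name `tag_severity` are assumptions chosen for illustration, not part of kagent.

```shell
# Prefixes every non-empty input line with "critical: ", "warning: " or
# "info: " based on case-insensitive keyword matching.
# Critical keywords take precedence over warning keywords.
tag_severity() {
  awk '
    NF == 0 { next }
    {
      line = tolower($0)
      sev = "info"
      if (line ~ /warn|pending|throttl/) sev = "warning"
      if (line ~ /critical|crashloopbackoff|oomkilled|notready|fail/) sev = "critical"
      print sev ": " $0
    }'
}

# Usage:
#   printf '%s\n' "$kagent_output" | tag_severity | grep -c '^critical:'
```

Treat the labels as a triage aid only; the findings still need to be read before acting on them.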

Step 3: Generate recommendations

Immediate actions for critical issues
Medium-term optimizations
Long-term capacity planning suggestions
Best-practice implementations

Example Output Format:

Cluster Health Analysis Summary:
- Overall Status: {status}
- Critical Issues Found: {count}
- Resource Utilization: {overview}
- Recommendations: {list}

Detailed Findings:
- {finding_1}: {description}
- {finding_2}: {description}

Recommended Actions:
1. {action_1}: {priority_level}
2. {action_2}: {priority_level}
3. {action_3}: {priority_level}
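
The summary header of the format above can be filled in with a small helper. This is a minimal sketch; the function name `print_summary` and its argument order are hypothetical.

```shell
# print_summary STATUS CRITICAL_COUNT UTILIZATION RECOMMENDATIONS
# Prints the summary header from the example output format, substituting the
# four placeholders with the given arguments.
print_summary() {
  printf '%s\n' "Cluster Health Analysis Summary:"
  printf '%s\n' "- Overall Status: $1"
  printf '%s\n' "- Critical Issues Found: $2"
  printf '%s\n' "- Resource Utilization: $3"
  printf '%s\n' "- Recommendations: $4"
}

# Usage:
#   print_summary "Degraded" 2 "CPU 71%, memory 64%" "see Recommended Actions"
```

Each line is passed as a `%s` argument rather than as the format string, so percent signs in the values are printed literally.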

kagent Command Examples:

For Cluster Health:

kagent "analyze cluster health and provide a summary of any issues"

For Resource Optimization:

kagent "analyze resource allocation and suggest optimizations for CPU and memory"

For Pod Failures:

kagent "investigate recent pod failures and explain the most common causes"

For Capacity Planning:

kagent "analyze current usage trends and recommend scaling actions"
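
To run several of these requests in one pass and keep the results, something like the sketch below could work. It assumes the same `kagent "<request>"` invocation as the examples above; the function name `run_kagent_batch` and the `report-N.txt` file naming are hypothetical.

```shell
# run_kagent_batch OUT_DIR REQUEST...
# Runs each request in order and writes its output to OUT_DIR/report-N.txt.
# Returns non-zero if the directory cannot be created or any request fails.
run_kagent_batch() {
  out_dir="$1"
  shift
  mkdir -p "$out_dir" || return 1
  n=0
  failed=0
  for request in "$@"; do
    n=$((n + 1))
    if ! kagent "$request" > "$out_dir/report-$n.txt"; then
      echo "request failed: $request" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   run_kagent_batch ./kagent-reports \
#     "analyze cluster health and provide a summary of any issues" \
#     "investigate recent pod failures and explain the most common causes"
```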

Recommended Actions Framework:

Immediate (0-1 hour):

Critical security patches
Failed components requiring restart
Resource exhaustion issues

Short-term (1-24 hours):

Configuration corrections
Minor resource adjustments
Pod restarts/redeployments

Medium-term (1-7 days):

Resource limit optimizations
Node pool adjustments
Storage optimizations

Long-term (1-4 weeks):

Architecture improvements
Scaling strategy implementations
Monitoring enhancements
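
The four tiers above map directly to response windows, which a report script can look up when labeling each recommended action. A minimal sketch, with the hypothetical function name `action_window` and lowercase tier keys chosen for illustration:

```shell
# Prints the response window for a priority tier from the framework above.
# Returns 1 and writes to stderr for an unrecognized tier.
action_window() {
  case "$1" in
    immediate)   echo "0-1 hour" ;;
    short-term)  echo "1-24 hours" ;;
    medium-term) echo "1-7 days" ;;
    long-term)   echo "1-4 weeks" ;;
    *)
      echo "unknown priority: $1" >&2
      return 1
      ;;
  esac
}

# Usage:
#   echo "Restart failed component (within $(action_window immediate))"
```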