Skillshub castai-cost-tuning

install

source · Clone the upstream repo

git clone https://github.com/ComeOnOliver/skillshub

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/jeremylongshore/claude-code-plugins-plus-skills/castai-cost-tuning" ~/.claude/skills/comeonoliver-skillshub-castai-cost-tuning && rm -rf "$T"

manifest: skills/jeremylongshore/claude-code-plugins-plus-skills/castai-cost-tuning/SKILL.md

CAST AI Cost Tuning

Overview

Maximize Kubernetes cost savings with CAST AI through spot instance strategies, workload right-sizing, cluster hibernation, and savings tracking. Typical savings are 50-70% of cloud compute costs.

Prerequisites

CAST AI Phase 2 enabled with full automation
Savings report available (requires at least 24 hours of collected data)
Understanding of workload criticality tiers

Instructions

Step 1: Analyze Current Savings

# Get savings breakdown
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/savings" \
  | jq '{
    currentMonthlyCost: .currentMonthlyCost,
    optimizedMonthlyCost: .optimizedMonthlyCost,
    monthlySavings: .monthlySavings,
    savingsPercentage: .savingsPercentage,
    spotSavings: .spotSavings,
    rightSizingSavings: .rightSizingSavings
  }'
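
As a sanity check on the numbers returned above, the savings figures can be recomputed from the two cost fields. This is a minimal sketch: the `SavingsBreakdown` shape and `deriveSavings` are hypothetical names, and the field names simply mirror the jq filter rather than a confirmed API schema.

```typescript
// Hypothetical shape: field names mirror the jq filter above, not a confirmed API schema.
interface SavingsBreakdown {
  currentMonthlyCost: number;
  optimizedMonthlyCost: number;
}

// Derive absolute and relative savings from the two monthly cost figures.
function deriveSavings(s: SavingsBreakdown): {
  monthlySavings: number;
  savingsPercentage: number;
} {
  const monthlySavings = s.currentMonthlyCost - s.optimizedMonthlyCost;
  // Guard against division by zero for clusters with no recorded cost yet.
  const savingsPercentage =
    s.currentMonthlyCost > 0 ? (monthlySavings * 100) / s.currentMonthlyCost : 0;
  return { monthlySavings, savingsPercentage };
}

// Illustrative figures only.
const derived = deriveSavings({ currentMonthlyCost: 10000, optimizedMonthlyCost: 4000 });
console.log(derived); // { monthlySavings: 6000, savingsPercentage: 60 }
```

If the recomputed percentage differs noticeably from the reported `savingsPercentage`, the report may cover a different period than expected.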

Step 2: Maximize Spot Usage

# Enable aggressive spot with diversity and fallbacks
curl -X PUT -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \
  -d '{
    "enabled": true,
    "spotInstances": {
      "enabled": true,
      "clouds": ["aws"],
      "spotDiversityEnabled": true,
      "spotDiversityPriceIncreaseLimitPercent": 20,
      "spotBackups": {
        "enabled": true,
        "spotBackupRestoreRateSeconds": 600
      }
    }
  }'
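
To make the effect of `spotDiversityPriceIncreaseLimitPercent` concrete, the sketch below filters candidate instance types to those priced within the limit. It assumes the limit is applied relative to the cheapest candidate, which is an interpretation of the field name and not confirmed by this document. `SpotCandidate`, `withinDiversityLimit`, and the prices are illustrative.

```typescript
// Hypothetical candidate type; prices are hourly USD and purely illustrative.
interface SpotCandidate {
  instanceType: string;
  hourlyPrice: number;
}

// Keep candidates whose price is at most `limitPercent` above the cheapest one.
// Assumption: the limit is measured against the cheapest candidate.
function withinDiversityLimit(
  candidates: SpotCandidate[],
  limitPercent: number
): SpotCandidate[] {
  if (candidates.length === 0) return [];
  const cheapest = Math.min(...candidates.map((c) => c.hourlyPrice));
  const ceiling = cheapest * (1 + limitPercent / 100);
  return candidates.filter((c) => c.hourlyPrice <= ceiling);
}

const eligible = withinDiversityLimit(
  [
    { instanceType: "type-a", hourlyPrice: 0.1 },
    { instanceType: "type-b", hourlyPrice: 0.115 },
    { instanceType: "type-c", hourlyPrice: 0.13 },
  ],
  20
);
console.log(eligible.map((c) => c.instanceType)); // [ 'type-a', 'type-b' ]
```

Under this reading, a higher limit widens the pool of instance types, trading a slightly higher price for a lower chance that all nodes are reclaimed at once.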

Spot allocation strategy by workload tier:

Workload Type	Spot Share	Rationale
Batch jobs, CI runners	100%	Interruptible and restartable
Stateless APIs (behind LB)	80%	Can tolerate brief interruptions
Stateful services, databases	0%	Not interruption-tolerant; use on-demand or reserved capacity
ML training	80-100%	Checkpointing handles interruptions
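
The tiers in the table above can be turned into a replica split for planning purposes. This is a sketch under stated assumptions: the tier names and `splitReplicas` are hypothetical, ML training uses the lower bound of its 80-100% range, and the spot count is rounded down so the on-demand share is never smaller than the table suggests.

```typescript
type WorkloadTier = "batch" | "stateless-api" | "stateful" | "ml-training";

// Spot share per tier in percent, taken from the table above.
// ML training uses the lower bound (80) of its 80-100% range.
const SPOT_SHARE_PERCENT: Record<WorkloadTier, number> = {
  batch: 100,
  "stateless-api": 80,
  stateful: 0,
  "ml-training": 80,
};

// Split a replica count into spot and on-demand, rounding the spot share down.
function splitReplicas(
  tier: WorkloadTier,
  replicas: number
): { spot: number; onDemand: number } {
  const spot = Math.floor((replicas * SPOT_SHARE_PERCENT[tier]) / 100);
  return { spot, onDemand: replicas - spot };
}

console.log(splitReplicas("stateless-api", 10)); // { spot: 8, onDemand: 2 }
console.log(splitReplicas("stateful", 5)); // { spot: 0, onDemand: 5 }
```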

Step 3: Workload Right-Sizing

# Get resource waste analysis
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads" \
  | jq '[.items[] | select(.estimatedSavingsPercent > 20) | {
    name: .workloadName,
    namespace: .namespace,
    wastedCpu: (.currentCpuRequest - .recommendedCpuRequest),
    wastedMemory: (.currentMemoryRequest - .recommendedMemoryRequest),
    savingsPercent: .estimatedSavingsPercent
  }] | sort_by(-.savingsPercent) | .[0:10]'
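
The same filter can be expressed in TypeScript when the results feed a report or dashboard instead of a terminal. A minimal sketch: `WorkloadItem`, `WasteEntry`, and `topWaste` are hypothetical names, and the fields are copied from the jq expression above rather than from a confirmed response schema.

```typescript
// Hypothetical item shape; field names are copied from the jq filter above.
interface WorkloadItem {
  workloadName: string;
  namespace: string;
  currentCpuRequest: number;
  recommendedCpuRequest: number;
  currentMemoryRequest: number;
  recommendedMemoryRequest: number;
  estimatedSavingsPercent: number;
}

interface WasteEntry {
  name: string;
  namespace: string;
  wastedCpu: number;
  wastedMemory: number;
  savingsPercent: number;
}

// Keep workloads above the savings threshold, compute the over-provisioned
// amounts, and return the highest-savings entries first.
function topWaste(
  items: WorkloadItem[],
  minSavingsPercent = 20,
  limit = 10
): WasteEntry[] {
  return items
    .filter((w) => w.estimatedSavingsPercent > minSavingsPercent)
    .map((w) => ({
      name: w.workloadName,
      namespace: w.namespace,
      wastedCpu: w.currentCpuRequest - w.recommendedCpuRequest,
      wastedMemory: w.currentMemoryRequest - w.recommendedMemoryRequest,
      savingsPercent: w.estimatedSavingsPercent,
    }))
    .sort((x, y) => y.savingsPercent - x.savingsPercent)
    .slice(0, limit);
}
```

The units of the CPU and memory fields depend on the API response and are not stated in this document, so the differences are only meaningful relative to each other.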

Step 4: Cluster Hibernation (Dev/Staging)

# Hibernate non-production clusters during off-hours
# Scales nodes to zero; the cluster resumes on demand

# Enable hibernation
curl -X POST -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/hibernate" \
  -d '{
    "schedule": {
      "enabled": true,
      "hibernateAt": "20:00",
      "wakeUpAt": "08:00",
      "timezone": "America/New_York",
      "weekdaysOnly": true
    }
  }'
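
To estimate what this schedule is worth, the awake and hibernated hours per week can be worked out directly. The sketch below assumes the cluster stays hibernated from Friday 20:00 until Monday 08:00, which is one reading of `weekdaysOnly` that the source does not spell out. `weeklyAwakeHours` is a hypothetical helper.

```typescript
// Hours per week the cluster is awake, given a same-day window
// (wakeUpAt earlier than hibernateAt) repeated on `activeDays` days.
function weeklyAwakeHours(
  wakeUpAt: string,
  hibernateAt: string,
  activeDays = 5
): number {
  const toHours = (hhmm: string): number => {
    const [h, m] = hhmm.split(":").map(Number);
    return h + m / 60;
  };
  return (toHours(hibernateAt) - toHours(wakeUpAt)) * activeDays;
}

const HOURS_PER_WEEK = 168;
const awakeHours = weeklyAwakeHours("08:00", "20:00"); // 12 h x 5 days = 60
const hibernatedShare = (HOURS_PER_WEEK - awakeHours) / HOURS_PER_WEEK;
console.log(awakeHours, hibernatedShare.toFixed(3)); // 60 0.643
```

Under that assumption the nodes are off for 108 of 168 hours, roughly 64% of the week. This is an upper bound on node-hours avoided, not a measured cost saving.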

Step 5: Cost Tracking Dashboard

interface CostReport {
  cluster: string;
  period: string;
  currentCost: number;
  optimizedCost: number;
  savings: number;
  spotPercent: number;
}

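// Note: castaiGet is assumed to be an authenticated GET helper that returns
// parsed JSON; it is not defined in this skill.
// spotPercent is computed by node count, not by cost or capacity.
async function generateMonthlyCostReport(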
  clusterIds: string[]
): Promise<CostReport[]> {
  const reports: CostReport[] = [];

  for (const clusterId of clusterIds) {
    const [cluster, savings, nodes] = await Promise.all([
      castaiGet(`/v1/kubernetes/external-clusters/${clusterId}`),
      castaiGet(`/v1/kubernetes/clusters/${clusterId}/savings`),
      castaiGet(`/v1/kubernetes/external-clusters/${clusterId}/nodes`),
    ]);

    const spotNodes = nodes.items.filter(
      (n: { lifecycle: string }) => n.lifecycle === "spot"
    ).length;

    reports.push({
      cluster: cluster.name,
      period: new Date().toISOString().slice(0, 7),
      currentCost: savings.currentMonthlyCost,
      optimizedCost: savings.optimizedMonthlyCost,
      savings: savings.monthlySavings,
      spotPercent:
        nodes.items.length > 0
          ? (spotNodes / nodes.items.length) * 100
          : 0,
    });
  }

  return reports;
}
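
For a fleet-level view, the per-cluster reports can be rolled up into a single summary. A minimal sketch, assuming the `CostReport` shape above; `summarizeReports` and the sample figures are hypothetical.

```typescript
// Same shape as the CostReport interface above, repeated so the sketch is self-contained.
interface CostReport {
  cluster: string;
  period: string;
  currentCost: number;
  optimizedCost: number;
  savings: number;
  spotPercent: number;
}

interface FleetSummary {
  totalCurrentCost: number;
  totalSavings: number;
  savingsPercent: number;
}

// Sum costs and savings across clusters and express savings as a share of
// the total current cost (weighted by spend, not averaged per cluster).
function summarizeReports(reports: CostReport[]): FleetSummary {
  const totalCurrentCost = reports.reduce((sum, r) => sum + r.currentCost, 0);
  const totalSavings = reports.reduce((sum, r) => sum + r.savings, 0);
  const savingsPercent =
    totalCurrentCost > 0 ? (totalSavings * 100) / totalCurrentCost : 0;
  return { totalCurrentCost, totalSavings, savingsPercent };
}

// Illustrative figures only.
const fleet = summarizeReports([
  { cluster: "prod", period: "2025-01", currentCost: 10000, optimizedCost: 4000, savings: 6000, spotPercent: 70 },
  { cluster: "staging", period: "2025-01", currentCost: 2000, optimizedCost: 1500, savings: 500, spotPercent: 100 },
]);
console.log(fleet.totalSavings, fleet.savingsPercent.toFixed(1)); // 6500 54.2
```

Weighting by spend keeps a small, heavily optimized cluster from inflating the fleet-wide percentage.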

Cost Optimization Checklist

Spot instances enabled with diversity
Workload autoscaler right-sizing resources
Dev/staging clusters hibernated off-hours
Empty node downscaler enabled
Instance families include the latest generation (often cheaper per unit of compute)
Reserved/savings plan for baseline on-demand nodes
Weekly savings report review

Error Handling

Issue	Cause	Solution
Savings lower than expected	Too many on-demand constraints	Relax node template constraints
Spot interruptions too frequent	Single instance type	Enable spot diversity
Hibernation not triggering	Schedule timezone wrong	Use IANA timezone format
Right-sizing too aggressive	Low headroom	Increase memory headroom to 20%
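
For the timezone issue in the table above, a schedule's timezone string can be checked before it is sent. This sketch relies on the standard `Intl.DateTimeFormat` constructor, which throws a `RangeError` for time zone names the runtime does not recognize; `isValidTimeZone` is a hypothetical helper, and the set of accepted names depends on the runtime's time zone data, which may include legacy aliases.

```typescript
// Returns true if the runtime recognizes `tz` as a time zone identifier.
function isValidTimeZone(tz: string): boolean {
  try {
    // The constructor throws a RangeError for unknown time zone names.
    new Intl.DateTimeFormat("en-US", { timeZone: tz });
    return true;
  } catch {
    return false;
  }
}

console.log(isValidTimeZone("America/New_York")); // true
console.log(isValidTimeZone("Eastern Time")); // false
```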

Next Steps

For architecture patterns, see castai-reference-architecture.