Claude-skill-registry agent-cost-optimizer
Real-time cost tracking, budget enforcement, and ROI measurement for AI agent operations. Track token usage, predict costs, enforce budget caps ($50-70/month typical), optimize model selection, cache results, measure cost-to-value. Use when tracking AI costs, preventing budget overruns, optimizing spend, measuring ROI, or ensuring cost-effective AI operations.
```bash
git clone https://github.com/majiayu000/claude-skill-registry
```

One-line install into `~/.claude/skills`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/agent-cost-optimizer" ~/.claude/skills/majiayu000-claude-skill-registry-agent-cost-optimizer && rm -rf "$T"
```
`skills/data/agent-cost-optimizer/SKILL.md`

Agent Cost Optimizer
Overview
agent-cost-optimizer provides comprehensive cost tracking, budget enforcement, and ROI measurement for AI agent operations.
Purpose: Control and optimize AI spending while maximizing value delivered
Pattern: Task-based (7 operations for cost management)
Key Innovation: Real-time cost tracking with automatic budget enforcement and cost-effective fallbacks
Industry Context (2025):
- Average AI spending: $85,521/month (36% YoY increase)
- Only 50% of organizations can measure AI ROI
- IT teams struggle with hidden costs
Solution: Comprehensive cost management from tracking to optimization
When to Use
Use agent-cost-optimizer when:
- Tracking AI costs across skills
- Preventing budget overruns
- Measuring ROI (cost vs. value delivered)
- Optimizing model selection (Opus vs. Sonnet vs. Haiku)
- Planning AI budgets
- Cost-effective development
- Enterprise cost accountability
Prerequisites
Required
- AI API access (Anthropic, OpenAI, Google)
- Cost tracking capability (API usage data)
Optional
- Paid.ai or AgentOps integration (advanced cost tracking)
- Prometheus/Grafana (cost visualization)
- Budget approval workflow
Cost Operations
Operation 1: Track Token Usage
Purpose: Monitor token consumption per skill invocation
Process:
1. Initialize Tracking:

```json
{
  "tracking_id": "track_20250126_1200",
  "skill": "multi-ai-verification",
  "started_at": "2025-01-26T12:00:00Z",
  "tokens": { "prompt": 0, "completion": 0, "total": 0 },
  "cost": { "amount_usd": 0.00, "model": "claude-sonnet-4-5" }
}
```

2. Track During Execution:

```javascript
// After each AI call
trackTokens({
  prompt_tokens: response.usage.input_tokens,
  completion_tokens: response.usage.output_tokens,
  model: 'claude-sonnet-4-5'
});
// Update running totals
```

3. Finalize Tracking:

```json
{
  "tracking_id": "track_20250126_1200",
  "skill": "multi-ai-verification",
  "completed_at": "2025-01-26T12:45:00Z",
  "duration_minutes": 45,
  "tokens": { "prompt": 15234, "completion": 8932, "total": 24166 },
  "cost": {
    "amount_usd": 0.073,
    "model": "claude-sonnet-4-5",
    "rate": "$3 per million tokens"
  }
}
```

4. Save to Cost Log:

```bash
# Append to daily cost log
cat tracking.json >> .cost-tracking/$(date +%Y-%m-%d).json
```
Outputs:
- Token usage per invocation
- Cost per invocation
- Model used
- Daily/monthly aggregates
Validation:
- Tracking initialized
- Tokens counted accurately
- Cost calculated correctly
- Logs saved
Time Estimate: Automatic (integrated into skills)
Operation 2: Calculate Costs
Purpose: Compute accurate costs based on provider pricing
Pricing (2025 rates):
Anthropic (Claude):
| Model | Input (per MTok) | Output (per MTok) |
|---|---|---|
| Claude Opus 4.5 | $15 | $75 |
| Claude Sonnet 4.5 | $3 | $15 |
| Claude Haiku 4.5 | $0.80 | $4 |
OpenAI (Codex):
| Model | Input | Output |
|---|---|---|
| GPT-5.1-codex | $5 | $15 |
| o3 | $10 | $40 |
| o4-mini | $1.50 | $6 |
Google (Gemini):
| Model | Input | Output |
|---|---|---|
| Gemini 2.5 Pro | $1.25 | $5 |
| Gemini 2.5 Flash | $0.15 | $0.60 |
Process:
```javascript
function calculateCost(usage, model) {
  const pricing = {
    'claude-sonnet-4-5': { input: 3, output: 15 },
    'claude-haiku-4-5': { input: 0.80, output: 4 },
    'claude-opus-4-5': { input: 15, output: 75 }
    // ... more models
  };
  const rates = pricing[model];
  const inputCost = (usage.prompt_tokens / 1_000_000) * rates.input;
  const outputCost = (usage.completion_tokens / 1_000_000) * rates.output;
  return {
    input_cost: inputCost,
    output_cost: outputCost,
    total_cost: inputCost + outputCost,
    currency: 'USD'
  };
}
```
Outputs:
- Accurate cost per invocation
- Model-specific pricing
- Input vs. output cost breakdown
Operation 3: Enforce Budget Caps
Purpose: Prevent exceeding monthly budget limits
Process:
1. Set Budget:

```json
{
  "monthly_budget_usd": 100,
  "skill_budgets": {
    "multi-ai-verification": 70,
    "multi-ai-research": 20,
    "multi-ai-testing": 10
  },
  "alert_thresholds": { "warning": 0.80, "critical": 0.95 }
}
```

2. Check Before Operation:

```javascript
// budget is the configuration object from step 1
async function checkBudget(skill, estimated_cost, budget) {
  const usage = getCurrentMonthUsage();
  const remaining = budget.monthly_budget_usd - usage.total_cost;
  if (estimated_cost > remaining) {
    return {
      allowed: false,
      reason: `Budget exceeded: $${usage.total_cost}/$${budget.monthly_budget_usd}`,
      overage: estimated_cost - remaining
    };
  }
  if (usage.total_cost / budget.monthly_budget_usd > 0.80) {
    return {
      allowed: true,
      warning: `80% of monthly budget used ($${usage.total_cost}/$${budget.monthly_budget_usd})`
    };
  }
  return { allowed: true };
}
```

3. Enforce:

```javascript
const budgetCheck = await checkBudget('multi-ai-verification', 0.50, budget);
if (!budgetCheck.allowed) {
  console.log(`❌ Budget exceeded: ${budgetCheck.reason}`);
  console.log(`Options:`);
  console.log(`  1. Use cheaper model (Sonnet → Haiku)`);
  console.log(`  2. Skip optional verification layers`);
  console.log(`  3. Request budget increase`);
  return;
}
if (budgetCheck.warning) {
  console.log(`⚠️ ${budgetCheck.warning}`);
}
// Proceed with operation
```
Outputs:
- Budget status checked
- Operations blocked if over budget
- Warnings at 80%
- Cost-effective alternatives suggested
Operation 4: Optimize Model Selection
Purpose: Choose cost-effective model for each task
Decision Matrix:
| Task Type | Recommended Model | Cost | Rationale |
|---|---|---|---|
| Simple verification (Layer 1-2) | Haiku | $ | Rules-based, fast, cheap |
| Code generation | Sonnet | $$ | Balanced quality/cost |
| Complex reasoning (architecture) | Opus | $$$ | Best quality, worth premium |
| LLM-as-judge | Sonnet or external model | $$ | Good judgment, reasonable cost |
| Test generation | Sonnet | $$ | Comprehensive coverage needed |
| Research | Sonnet/Haiku mix | $-$$ | Haiku for search, Sonnet for synthesis |
Auto-Optimization:
```javascript
function selectModel(task_type, criticality, budget_remaining) {
  // Critical + budget OK → Use Opus
  if (criticality === 'critical' && budget_remaining > 20) {
    return 'claude-opus-4-5';
  }
  // Standard → Use Sonnet
  if (criticality === 'standard') {
    return 'claude-sonnet-4-5';
  }
  // Budget low or simple task → Use Haiku
  if (budget_remaining < 5 || task_type === 'simple') {
    return 'claude-haiku-4-5';
  }
  return 'claude-sonnet-4-5'; // Default
}
```
Outputs:
- Optimal model selected
- Cost minimized
- Quality maintained
Operation 5: Cost-Effective Caching
Purpose: Avoid re-computing identical operations
Process:
1. Cache Key Generation:

```javascript
const crypto = require('crypto');

function generateCacheKey(operation, inputs) {
  // Hash inputs to create unique key
  const content_hash = crypto
    .createHash('sha256')
    .update(JSON.stringify(inputs))
    .digest('hex');
  return `${operation}_${content_hash}`;
}
```

2. Check Cache Before Operation:

```javascript
const cacheKey = generateCacheKey('verify_code', {
  files: ['src/auth.ts'],
  file_hashes: { 'src/auth.ts': 'abc123' }
});
const cached = readCache(cacheKey);
if (cached && !isExpired(cached, 24)) {
  // Use cached result
  console.log('📦 Using cached verification result');
  return cached.result;
}
// Cache miss → run verification
const result = await runVerification();
// Save to cache with a 24-hour TTL
saveCache(cacheKey, result, { ttl_hours: 24 });
```

3. Cache Structure:

```json
{
  "cache_key": "verify_code_abc123def456",
  "created_at": "2025-01-26T12:00:00Z",
  "expires_at": "2025-01-27T12:00:00Z",
  "inputs": {
    "files": ["src/auth.ts"],
    "file_hashes": { "src/auth.ts": "abc123" }
  },
  "result": {
    "quality_score": 92,
    "layers_passed": 5,
    "issues": []
  },
  "cost_saved": 0.073
}
```
Outputs:
- 90% reduction in re-verification costs
- Instant results for unchanged code
- Cache hit/miss tracking
Validation:
- Cache key correctly identifies identical operations
- File changes invalidate cache
- Expired cache not used
- Cost savings tracked
Operation 6: Measure ROI
Purpose: Calculate return on investment for AI spending
Process:
1. Track Time Saved:

```json
{
  "task": "Implement user authentication",
  "without_ai": {
    "estimated_hours": 40,
    "developer_rate": 100,
    "total_cost": 4000
  },
  "with_ai": {
    "actual_hours": 11.3,
    "developer_rate": 100,
    "developer_cost": 1130,
    "ai_cost": 2.50,
    "total_cost": 1132.50
  },
  "roi": {
    "time_saved_hours": 28.7,
    "cost_saved": 2867.50,
    "roi_percentage": 114700,
    "payback_period_hours": 0.025
  }
}
```

2. Calculate ROI:

```javascript
function calculateROI(task) {
  const time_saved = task.without_ai.estimated_hours - task.with_ai.actual_hours;
  const cost_saved = task.without_ai.total_cost - task.with_ai.total_cost;
  const roi_percentage = (cost_saved / task.with_ai.ai_cost) * 100;
  return {
    time_saved_hours: time_saved,
    cost_saved_usd: cost_saved,
    roi_percentage: roi_percentage,
    // Hours of developer time needed to pay back the AI spend
    payback_period_hours: task.with_ai.ai_cost / task.without_ai.developer_rate
  };
}
```

3. Monthly ROI Report:

```markdown
# Monthly ROI Report - January 2025

## AI Spending
- Total AI costs: $87.50
- Breakdown:
  - multi-ai-verification: $62.30 (71%)
  - multi-ai-research: $18.40 (21%)
  - multi-ai-testing: $6.80 (8%)

## Time Savings
- Tasks completed: 8
- Total hours saved: 156 hours
- Average savings per task: 19.5 hours

## Cost Savings
- Developer cost avoided: $15,600 (156h × $100/h)
- AI costs: $87.50
- Net savings: $15,512.50

## ROI
- ROI: 17,728% ($177 saved per $1 spent)
- Payback period: 0.03 weeks (immediate)
- Value multiplier: 178x

## Recommendations
- ✅ Current spending highly cost-effective
- Continue using AI for all qualifying tasks
- Consider increasing budget (high ROI)
```
Outputs:
- Comprehensive ROI analysis
- Time and cost savings quantified
- Monthly reports
- Business justification for AI spending
Operation 7: Predict Costs
Purpose: Estimate costs before starting expensive operations
Process:
1. Historical Data:

```json
{
  "operation": "multi-ai-verification",
  "mode": "all_5_layers",
  "historical_costs": [
    { "date": "2025-01-15", "tokens": 24166, "cost": 0.073 },
    { "date": "2025-01-18", "tokens": 21893, "cost": 0.066 },
    { "date": "2025-01-20", "tokens": 26543, "cost": 0.080 }
  ],
  "avg_cost": 0.073,
  "std_dev": 0.007
}
```

2. Predict Before Operation:

```javascript
const prediction = predictCost('multi-ai-verification', {
  mode: 'all_5_layers',
  code_size_lines: 850
});

console.log(`💰 Estimated cost: $${prediction.estimated_cost} ± $${prediction.std_dev}`);
console.log(`   Range: $${prediction.min_cost} - $${prediction.max_cost}`);
console.log(`   Confidence: ${prediction.confidence}%`);

// Check budget
if (prediction.estimated_cost > budget_remaining) {
  console.log(`⚠️ Estimated cost exceeds remaining budget`);
  console.log(`Options:`);
  console.log(`  1. Use Haiku (estimated: $${prediction.estimated_cost * 0.27})`);
  console.log(`  2. Skip Layer 5 (save ~60%: $${prediction.estimated_cost * 0.4})`);
  console.log(`  3. Increase budget`);
}
```
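The skill does not define `predictCost` itself; a minimal sketch based on the historical-data structure from step 1 could look like the following. Passing the history as an explicit argument, the ±2σ range, and the flat 95% confidence figure are all assumptions for illustration, not part of the skill specification.

```javascript
// Sketch of predictCost: estimate from the mean and sample standard
// deviation of prior runs. The 2-sigma range (~95% interval) is an
// assumed convention, not something the skill prescribes.
function predictCost(operation, options, history) {
  const costs = history.historical_costs.map(h => h.cost);
  const avg = costs.reduce((a, b) => a + b, 0) / costs.length;
  const variance =
    costs.reduce((s, c) => s + (c - avg) ** 2, 0) / (costs.length - 1);
  const stdDev = Math.sqrt(variance);
  return {
    estimated_cost: +avg.toFixed(3),
    std_dev: +stdDev.toFixed(3),
    min_cost: +(avg - 2 * stdDev).toFixed(3),
    max_cost: +(avg + 2 * stdDev).toFixed(3),
    confidence: 95
  };
}

const history = {
  historical_costs: [
    { date: '2025-01-15', cost: 0.073 },
    { date: '2025-01-18', cost: 0.066 },
    { date: '2025-01-20', cost: 0.080 }
  ]
};
console.log(predictCost('multi-ai-verification', { mode: 'all_5_layers' }, history));
```

With the three historical runs above this yields an estimate of $0.073 ± $0.007, matching the `avg_cost` and `std_dev` fields in step 1.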
Outputs:
- Cost predictions with confidence intervals
- Budget impact assessment
- Cost-effective alternatives suggested
Cost Optimization Strategies
Strategy 1: Model Selection
Baseline (All Sonnet): $100/month
Optimized (Smart selection):
- Layer 1-2: Haiku ($20)
- Layer 3-4: Sonnet ($40)
- Layer 5: Sonnet ($30)
- Total: $90/month (10% savings)
Aggressive (Maximum savings):
- Layer 1-2: Haiku ($20)
- Layer 3-4: Haiku ($15)
- Layer 5: Sonnet ($30)
- Total: $65/month (35% savings)
Trade-off: Some quality reduction at Layers 3-4
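The three mixes above can be compared with a short calculation. The per-layer dollar figures are the illustrative numbers from this strategy; the baseline's split across layers is an assumed breakdown that sums to the stated $100/month.

```javascript
// Compare the monthly cost of the three model mixes described above.
// Layer costs are the illustrative figures from this section, not measurements.
const mixes = {
  baseline:   { 'layer1-2': 30, 'layer3-4': 40, 'layer5': 30 }, // all Sonnet (assumed split)
  optimized:  { 'layer1-2': 20, 'layer3-4': 40, 'layer5': 30 },
  aggressive: { 'layer1-2': 20, 'layer3-4': 15, 'layer5': 30 }
};

function monthlyCost(mix) {
  return Object.values(mix).reduce((a, b) => a + b, 0);
}

const baseline = monthlyCost(mixes.baseline); // 100
for (const [name, mix] of Object.entries(mixes)) {
  const total = monthlyCost(mix);
  const savings = Math.round((1 - total / baseline) * 100);
  console.log(`${name}: $${total}/month (${savings}% savings)`);
}
```

This reproduces the figures above: $90/month (10% savings) for the optimized mix and $65/month (35% savings) for the aggressive mix.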
Strategy 2: Caching
Without Caching: Re-verify same code multiple times
With Caching (24-hour TTL):
- First verification: $0.073
- Same code next 24h: $0 (cache hit)
- Savings: 90% on unchanged code
Implementation:
```javascript
const cacheKey = hash(files_to_verify);
const cached = getCache(cacheKey);
if (cached && !isExpired(cached, 24)) {
  return cached.result; // $0 cost
}
const result = await verify(); // $0.073 cost
saveCache(cacheKey, result);
```
Strategy 3: Ensemble Optimization
Baseline (Always 5-agent ensemble):
- 5 agents × $0.073 = $0.365 per verification
Optimized (Conditional ensemble):
- Critical features: 5 agents ($0.365)
- Standard features: 3 agents ($0.219)
- Simple changes: 1 agent ($0.073)
- Average savings: 60%
Decision Logic:
```javascript
function shouldUseEnsemble(criticality, code_size, budget) {
  if (criticality === 'critical') return 5;
  if (criticality === 'high' && budget > 20) return 3;
  return 1; // Single agent
}
```
Strategy 4: Layer Skipping
Full Verification (All 5 layers): ~$0.073
Fast-Track (Layers 1-2 only):
- Rules + Functional only
- ~$0.015 (80% savings)
- Use for: minor changes, docs, config
Decision:
```javascript
if (lines_changed < 50 && files_changed.every(f => !isCritical(f))) {
  // Fast-track: Layers 1-2 only
  mode = 'fast_track';
  estimated_cost = 0.015;
} else {
  // Full verification
  mode = 'standard';
  estimated_cost = 0.073;
}
```
Budget Management
Monthly Budget Planning
Sample Budget ($100/month):
```json
{
  "monthly_budget_usd": 100,
  "allocation": {
    "multi-ai-verification": { "budget": 70, "rationale": "Most expensive (LLM-as-judge)" },
    "multi-ai-research": { "budget": 20, "rationale": "Occasional use, tri-AI" },
    "multi-ai-testing": { "budget": 10, "rationale": "Mostly automated" },
    "buffer": 10
  },
  "assumptions": {
    "features_per_month": 8,
    "verifications_per_feature": 1.5,
    "research_per_month": 2
  }
}
```
Budget Tracking
Daily:

```bash
# Check today's spending (the log is a stream of JSON objects, so slurp with -s)
jq -s 'map(.cost.amount_usd) | add' .cost-tracking/$(date +%Y-%m-%d).json
# Output: 3.45 (today)
```

Monthly:

```bash
# Check month-to-date
cat .cost-tracking/2025-01-*.json | jq -s 'map(.cost.amount_usd) | add'
# Output: 67.80 this month (68% of budget)
```

Projection:

```javascript
// Project end-of-month spend from the month-to-date run rate
const days_elapsed = 26;
const days_in_month = 31;
const current_spend = 67.80;
const projected = (current_spend / days_elapsed) * days_in_month;
// ≈ $80.84 projected (within budget ✅)
```
Cost Alerting
Alert Levels
80% Budget (Warning):

```
⚠️ BUDGET ALERT: 80% Used

**Current**: $80.00 / $100.00 (80%)
**Remaining**: $20.00
**Days left**: 5
**Projected EOM**: $93.75 (within budget)

**Recommendations**:
- Monitor spending closely
- Use Haiku for simple tasks
- Cache aggressively
- Skip optional layers where safe
```

95% Budget (Critical):

```
🚨 CRITICAL: 95% Budget Used

**Current**: $95.00 / $100.00 (95%)
**Remaining**: $5.00
**Days left**: 5
**Projected EOM**: $110 (OVER BUDGET)

**Actions Required**:
1. Pause non-critical verifications
2. Use Haiku exclusively
3. Request budget increase OR
4. Defer work to next month

**Auto-throttling**: Enabled
- Only critical operations allowed
- All optional layers disabled
- Ensemble verification disabled
```

Budget Exceeded:

```
❌ BUDGET EXCEEDED

**Current**: $102.50 / $100.00 (102.5%)
**Overage**: $2.50

**Operations BLOCKED** until:
1. Budget increased OR
2. Next month (resets automatically)

**Emergency Override**: Requires approval
```
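One way to map month-to-date spend onto these alert levels is a small threshold function. The cutoffs mirror the levels above; the function itself is a sketch, not part of the skill's published API.

```javascript
// Map current spend to the alert levels described above.
// Thresholds mirror the Alert Levels section (80% / 95% / 100%).
function alertLevel(spent, budget) {
  const ratio = spent / budget;
  if (ratio > 1.0) return 'blocked';    // Budget exceeded: block operations
  if (ratio >= 0.95) return 'critical'; // Throttle to critical operations only
  if (ratio >= 0.80) return 'warning';  // Optimize: cheaper models, caching
  return 'normal';
}

console.log(alertLevel(67.80, 100));  // normal
console.log(alertLevel(80.00, 100));  // warning
console.log(alertLevel(95.00, 100));  // critical
console.log(alertLevel(102.50, 100)); // blocked
```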
ROI Calculation Framework
Value Metrics
Quantifiable Value:
- Time saved (hours)
- Cost saved (developer time avoided)
- Quality improvement (fewer bugs in production)
- Faster time-to-market (days)
Formula:
ROI = ((Value Delivered - AI Costs) / AI Costs) × 100%
Example:
```
Feature without AI: 40 hours × $100/hour = $4,000
Feature with AI:    11.3 hours × $100/hour + $2.50 AI = $1,132.50

Value Delivered = $4,000 − $1,132.50 = $2,867.50
AI Costs        = $2.50
ROI             = ($2,867.50 / $2.50) × 100% = 114,700%
```
Monthly Reporting
Template:
```markdown
# AI ROI Report - January 2025

## Summary
- **AI Spending**: $87.50
- **Value Delivered**: $15,600 (156 hours saved × $100/hour)
- **ROI**: 17,728%
- **Payback**: Immediate

## Details

### Features Delivered (8)
1. User authentication - 11.3h (was 40h), ROI: 114,700%
2. Payment integration - 8.5h (was 32h), ROI: 108,235%
[... more ...]

### Cost Breakdown
- Verification: $62.30 (71%) - Highest cost, highest value
- Research: $18.40 (21%) - Occasional, high impact
- Testing: $6.80 (8%) - Mostly automated, low cost

### Savings
- Time: 156 hours saved
- Cost: $15,512.50 net savings
- Quality: 4 bugs prevented (saved ~20 hours)

### Recommendations
- ✅ ROI is excellent (17,728%)
- Consider increasing budget (high returns)
- Current spending optimal
```
Quick Reference
Cost Operations
| Operation | Purpose | Time | Automation |
|---|---|---|---|
| Track | Monitor token usage | Automatic | 100% |
| Calculate | Compute costs | Automatic | 100% |
| Enforce | Budget caps | Automatic | 100% |
| Optimize | Model selection | Semi-auto | 70% |
| Cache | Avoid re-compute | Automatic | 100% |
| Measure ROI | Value analysis | Manual | 30% |
| Predict | Cost estimation | Automatic | 90% |
Cost Optimization Strategies
| Strategy | Savings | Trade-off | Recommended For |
|---|---|---|---|
| Model selection | 10-35% | Some quality loss | All features |
| Caching | 90% | Stale results risk | Unchanged code |
| Ensemble optimization | 60% | Lower confidence | Non-critical |
| Layer skipping | 80% | Less thorough | Minor changes |
Budget Thresholds
- < 80%: Normal operation
- 80-95%: Warning, optimize
- 95-100%: Critical, throttle
- > 100%: Block operations
agent-cost-optimizer ensures cost-effective AI operations through real-time tracking, budget enforcement, model optimization, and ROI measurement - preventing budget overruns while maximizing value delivered.
For cost reports, see examples/. For optimization strategies, see Cost Optimization Strategies section.