Agent-almanac forecast-operational-metrics
git clone https://github.com/pjt222/agent-almanac
T=$(mktemp -d) && git clone --depth=1 https://github.com/pjt222/agent-almanac "$T" && mkdir -p ~/.claude/skills && cp -r "$T/i18n/es/skills/forecast-operational-metrics" ~/.claude/skills/pjt222-agent-almanac-forecast-operational-metrics-348471 && rm -rf "$T"
i18n/es/skills/forecast-operational-metrics/SKILL.md
Forecast Operational Metrics
Predict future resource usage and system metrics for capacity planning and cost optimization.
See Extended Examples for complete configuration files and templates.
When to Use
- Need to forecast infrastructure capacity needs (CPU, memory, disk, network)
- Planning hardware/cloud resource procurement for next quarter
- Want to predict cost trends and optimize cloud spending
- Need to set up proactive scaling policies based on predicted load
- Forecasting user traffic for event planning
- Predicting database storage growth for backup planning
- Estimating API usage for rate limiting configuration
Inputs
- Required: Historical time series metrics (at least 3 months, ideally up to 12)
- Required: Metric type (CPU, memory, requests/sec, costs, etc.)
- Required: Forecast horizon (days, weeks, or months ahead)
- Optional: Known future events (deployments, marketing campaigns, holidays)
- Optional: Seasonality information (daily, weekly, yearly patterns)
- Optional: External regressors (e.g., marketing spend, user signups)
Procedure
Step 1: Set Up Environment and Load Data
Install forecasting libraries and prepare time series data.
# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install forecasting libraries
pip install prophet statsmodels pandas numpy
pip install plotly matplotlib seaborn
pip install prometheus-api-client influxdb-client
pip install grafana-api
Load and prepare data with MetricsLoader:
# forecasting/data_loader.py (abbreviated)
import pandas as pd
from datetime import datetime, timedelta


class MetricsLoader:
    def load_from_prometheus(self, query: str, lookback_days: int = 90, step: str = "1h"):
        """Load historical metrics from Prometheus."""
        # ... implementation (see EXAMPLES.md for complete code)

    def resample_and_aggregate(self, df: pd.DataFrame, freq: str = "1H"):
        """Resample time series to regular intervals."""
        # ... implementation (see EXAMPLES.md)


# Example usage
loader = MetricsLoader(prometheus_url="http://prometheus:9090")
df = loader.load_from_prometheus(
    query='avg(rate(container_cpu_usage_seconds_total[5m]))',
    lookback_days=90,
)
df_daily = loader.resample_and_aggregate(df, freq="1D")
See EXAMPLES.md Step 1 for the complete MetricsLoader implementation.
Expected: Time series data loaded at regular intervals, with missing values filled and ready for forecasting.
On failure: If the data has gaps, fill them with forward-fill or interpolation. Ensure the lookback period holds enough data (90+ days recommended), verify that timestamps use a consistent timezone, and check for outliers (>5 sigma) that may skew forecasts.
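The gap-filling and outlier checks described above can be sketched with the standard library alone. This is a minimal illustration, not the MetricsLoader implementation from EXAMPLES.md: `forward_fill`, `flag_outliers`, and the sample readings are hypothetical names and data introduced here.

```python
from statistics import mean, stdev


def forward_fill(values):
    """Replace None entries with the most recent observed value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None until the first observation
    return filled


def flag_outliers(values, n_sigma=5.0):
    """Return indices of points more than n_sigma standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > n_sigma * sigma]


# Hypothetical hourly CPU readings with one gap and one spike
raw = [10.0, 12.0] * 24 + [None, 1000.0]
clean = forward_fill(raw)
outliers = flag_outliers(clean)
```

Note that a mean/standard-deviation rule needs a reasonably long series: with only a handful of points, a single spike inflates the standard deviation enough that it can never exceed 5 sigma.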
Step 2: Implement Prophet Forecasting
Use Facebook Prophet for automatic seasonality detection and forecasting.
# forecasting/prophet_forecaster.py (abbreviated)
import pandas as pd
from prophet import Prophet


class ProphetForecaster:
    def __init__(self, growth: str = "linear", seasonality_mode: str = "multiplicative"):
        self.growth = growth
        self.prophet_params = {
            "growth": growth,
            "seasonality_mode": seasonality_mode,
            # ... additional parameters (see EXAMPLES.md)
        }

    def fit(self, df: pd.DataFrame, regressors=None, holidays=None):
        """Train Prophet model on historical data."""
        # ... implementation (see EXAMPLES.md)

    def forecast(self, periods: int, freq: str = "D"):
        """Generate forecast for future periods."""
        # ... implementation (see EXAMPLES.md)


# Example usage (df_daily comes from Step 1)
forecaster = ProphetForecaster(growth="linear", seasonality_mode="multiplicative")
forecaster.fit(df_daily)
forecast = forecaster.forecast(periods=30, freq="D")
forecaster.plot_forecast(forecast, save_path="results/cpu_forecast.png")
See EXAMPLES.md Step 2 for the complete ProphetForecaster implementation.
Expected: Forecast generated for 30+ days ahead with confidence intervals, seasonal patterns captured in the components plot, and cross-validation MAPE < 15%.
On failure: If the forecast looks unrealistic, try a different growth model (linear vs. logistic). If seasonality is missing, adjust seasonality_mode. If accuracy is poor (MAPE well above the 15% target), add more historical data or external regressors, and check for data quality issues.
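Because MAPE is an error measure, lower is better, and the 15% figure above is an upper bound. The sketch below shows the calculation on a holdout window. The `mape` helper and the sample numbers are illustrative assumptions, and skipping zero-valued actuals is one convention among several.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Skips points where actual == 0."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Hypothetical holdout values and the forecasts made for the same timestamps
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mape(actual, predicted)
meets_target = score < 15.0  # 15% target taken from the expected result above
```

Metrics that sit near zero for long stretches (idle CPU, overnight traffic) make MAPE unstable, so an absolute error measure may be a better fit for those.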
Step 3: Implement ARIMA/SARIMAX Forecasting (Alternative)
Use statsmodels for traditional time series forecasting.
# forecasting/arima_forecaster.py (abbreviated)
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX


class ARIMAForecaster:
    # The default seasonal period of 7 suits daily data with a weekly cycle
    def __init__(self, order: tuple = (1, 1, 1), seasonal_order: tuple = (1, 1, 1, 7)):
        self.order = order
        self.seasonal_order = seasonal_order

    def fit(self, df: pd.DataFrame, exog=None):
        """Train SARIMAX model."""
        series = df.set_index("timestamp")["value"]
        self.model = SARIMAX(series, exog=exog, order=self.order,
                             seasonal_order=self.seasonal_order)
        self.fitted_model = self.model.fit(disp=False)
        # ... implementation (see EXAMPLES.md)

    def forecast(self, steps: int, exog_future=None):
        """Generate forecast for future periods."""
        # ... implementation (see EXAMPLES.md)


# Auto-select parameters (auto_arima is the helper defined in EXAMPLES.md)
df_hourly = loader.resample_and_aggregate(df, freq="1H")  # loader and df from Step 1
series = df_hourly.set_index("timestamp")["value"]
best_order, best_seasonal = auto_arima(series, seasonal=True)
forecaster = ARIMAForecaster(order=best_order, seasonal_order=best_seasonal)
forecaster.fit(df_hourly)
forecast = forecaster.forecast(steps=168)  # 7 days of hourly steps
See EXAMPLES.md Step 3 for the complete ARIMAForecaster implementation and auto_arima function.
Expected: ARIMA model fitted with optimal parameters, forecast generated with confidence intervals, and diagnostic plots showing white noise residuals.
On failure: If the model does not converge, simplify the parameters (reduce p, q, P, Q). If the forecast has the wrong trend, check the differencing order (d, D). If the residuals are not white noise, add more AR/MA terms. Ensure the series is longer than 2x the seasonal period.
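One rough way to judge whether residuals look like white noise, without reading plots, is to check their sample autocorrelations against the approximate 95% band of ±1.96/√n. The sketch below is a crude stand-in for the diagnostic plots mentioned above, not a formal test (statsmodels provides one in `acorr_ljungbox`). The helper names and the synthetic residuals are assumptions made for illustration.

```python
import math


def autocorrelation(residuals, lag):
    """Sample autocorrelation of a residual series at the given lag."""
    n = len(residuals)
    mu = sum(residuals) / n
    denom = sum((r - mu) ** 2 for r in residuals)
    if denom == 0:
        return 0.0  # constant series: no variation to correlate
    num = sum((residuals[t] - mu) * (residuals[t - lag] - mu) for t in range(lag, n))
    return num / denom


def significant_lags(residuals, max_lag=10):
    """Lags whose autocorrelation falls outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(residuals))
    return [k for k in range(1, max_lag + 1) if abs(autocorrelation(residuals, k)) > bound]


# Hypothetical residuals with leftover weekly structure (a spike every 7th step)
residuals = [1.0 if t % 7 == 0 else -0.2 for t in range(140)]
lags = significant_lags(residuals, max_lag=14)
```

Significant lags at multiples of the seasonal period, as in this synthetic case, usually point to a missing or mis-specified seasonal term rather than to a need for more non-seasonal AR/MA terms.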
Step 4: Identify Capacity Thresholds and Alerts
Analyze forecast to predict when resources will be exhausted.
# forecasting/capacity_planning.py (abbreviated)
from datetime import datetime

import pandas as pd


class CapacityPlanner:
    def __init__(self, capacity_limit: float, warning_threshold: float = 0.8):
        self.capacity_limit = capacity_limit
        self.warning_threshold = warning_threshold

    def find_exhaustion_date(self, forecast: pd.DataFrame):
        """Find when forecast exceeds capacity limit."""
        exceeded = forecast[forecast["yhat"] >= self.capacity_limit]
        # ... implementation (see EXAMPLES.md)

    def generate_capacity_report(self, forecast: pd.DataFrame):
        """Generate comprehensive capacity planning report."""
        # ... implementation (see EXAMPLES.md)


# Example usage
planner = CapacityPlanner(capacity_limit=1000, warning_threshold=0.8)
report = planner.generate_capacity_report(forecast)
print(f"Warning Date: {report['warning_date']}")
print(f"Exhaustion Date: {report['exhaustion_date']}")
recommendation = planner.recommend_scaling_action(report)
See EXAMPLES.md Step 4 for the complete CapacityPlanner implementation.
Expected: Report shows when capacity limits will be reached, provides recommendations with urgency levels, and includes calculated growth rates.
On failure: If the exhaustion date is unrealistic, verify that capacity_limit is correct. If the growth rate is too high, check for outliers in the historical data. Consider non-linear growth models for mature systems.
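The threshold logic can be illustrated without pandas: scan the forecast in date order and report the first point that reaches the warning level (warning_threshold × capacity_limit) and the first that reaches the limit itself. This is a minimal sketch under that reading of the two parameters, not the CapacityPlanner code from EXAMPLES.md. `first_crossing`, `capacity_dates`, and the linear disk forecast are hypothetical.

```python
from datetime import date, timedelta


def first_crossing(forecast, level):
    """Return the first date whose forecast value reaches `level`, or None."""
    for day, yhat in forecast:
        if yhat >= level:
            return day
    return None


def capacity_dates(forecast, capacity_limit, warning_threshold=0.8):
    """Compute warning and exhaustion dates from (date, yhat) pairs sorted by date."""
    return {
        "warning_date": first_crossing(forecast, warning_threshold * capacity_limit),
        "exhaustion_date": first_crossing(forecast, capacity_limit),
    }


# Hypothetical disk forecast: 700 GB on day 0, growing 10 GB per day
start = date(2025, 1, 1)
forecast = [(start + timedelta(days=i), 700.0 + 10.0 * i) for i in range(60)]
report = capacity_dates(forecast, capacity_limit=1000, warning_threshold=0.8)
```

Running the same scan over Prophet's `yhat_upper` column instead of `yhat` gives earlier, more conservative dates, which is often the safer basis for procurement lead times.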
Step 5: Visualize Forecasts in Grafana
Push forecast data to Grafana for real-time monitoring.
# forecasting/grafana_integration.py (abbreviated)
from datetime import datetime

import pandas as pd
import requests


class GrafanaForecaster:
    def __init__(self, grafana_url: str, api_key: str, dashboard_uid: str = None):
        self.grafana_url = grafana_url.rstrip("/")
        self.api_key = api_key
        self.dashboard_uid = dashboard_uid

    def create_annotation(self, text: str, tags: list, time: datetime = None):
        """Create annotation in Grafana for forecast events."""
        # ... implementation (see EXAMPLES.md)

    def create_capacity_alert_annotation(self, capacity_report: dict):
        """Create Grafana annotation for capacity warnings."""
        # ... implementation (see EXAMPLES.md)


# Export to CSV for Grafana datasource
def export_forecast_to_csv(forecast: pd.DataFrame, output_path: str):
    """Export forecast in format compatible with Grafana CSV datasource."""
    # ... implementation (see EXAMPLES.md)


# Example usage
grafana = GrafanaForecaster(
    grafana_url="http://grafana:3000",
    api_key="YOUR_API_KEY",
    dashboard_uid="your-dashboard-uid",
)
grafana.create_capacity_alert_annotation(report)
export_forecast_to_csv(forecast, "grafana/forecasts/cpu_forecast.csv")
See EXAMPLES.md Step 5 for the complete GrafanaForecaster implementation.
Expected: Forecast annotations appear in Grafana dashboards, capacity warnings are visible as vertical markers, and forecast data is accessible via the CSV datasource.
On failure: Verify that the Grafana API key has the correct permissions, check that the dashboard UID is correct, ensure annotation timestamps are in milliseconds, and test the API with curl before integrating.
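The millisecond requirement is a common stumbling block, since Python's `datetime.timestamp()` returns seconds. The sketch below builds an annotation request body without sending it. The field names (`time`, `tags`, `text`, `dashboardUID`) follow the Grafana annotations HTTP API as I understand it and should be checked against the documentation for your Grafana version. `to_epoch_ms` and `build_annotation` are hypothetical helpers, not part of GrafanaForecaster.

```python
from datetime import datetime, timezone


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to integer epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp() * 1000)


def build_annotation(text, tags, moment, dashboard_uid=None):
    """Build the JSON body for an annotation request (nothing is sent here)."""
    payload = {"time": to_epoch_ms(moment), "tags": list(tags), "text": text}
    if dashboard_uid is not None:
        payload["dashboardUID"] = dashboard_uid
    return payload


payload = build_annotation(
    text="Forecast: capacity warning threshold reached",
    tags=["forecast", "capacity"],
    moment=datetime(2024, 1, 1, tzinfo=timezone.utc),
    dashboard_uid="your-dashboard-uid",
)
```

Rejecting naive datetimes is a deliberate choice here: it forces the caller to state the timezone, which also addresses the timezone-consistency check from Step 1.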
Step 6: Automate Forecast Generation
Set up scheduled jobs to generate forecasts regularly.
# forecasting/scheduler.py (abbreviated)
# MetricsLoader, ProphetForecaster, CapacityPlanner and export_forecast_to_csv
# come from the modules shown in Steps 1, 2, 4 and 5.
import logging
import time

import schedule

logger = logging.getLogger(__name__)


def generate_daily_forecast():
    """Generate forecast for all monitored metrics."""
    logger.info("Starting daily forecast generation")
    metrics_config = [
        {"name": "cpu_usage", "query": "...", "capacity_limit": 0.8, "forecast_days": 30},
        {"name": "memory_usage", "query": "...", "capacity_limit": 32, "forecast_days": 30},
        {"name": "disk_usage", "query": "...", "capacity_limit": 500, "forecast_days": 90},
    ]
    loader = MetricsLoader(prometheus_url="http://prometheus:9090")
    for metric_config in metrics_config:
        df = loader.load_from_prometheus(query=metric_config["query"], lookback_days=90)
        forecaster = ProphetForecaster()
        forecaster.fit(df)
        forecast = forecaster.forecast(periods=metric_config["forecast_days"])
        planner = CapacityPlanner(capacity_limit=metric_config["capacity_limit"])
        report = planner.generate_capacity_report(forecast)
        export_forecast_to_csv(forecast, f"grafana/forecasts/{metric_config['name']}_forecast.csv")
        # ... (see EXAMPLES.md for complete implementation)


# Schedule daily at 2 AM
schedule.every().day.at("02:00").do(generate_daily_forecast)
while True:
    schedule.run_pending()
    time.sleep(60)
See EXAMPLES.md Step 6 for the complete scheduler implementation.
Expected: Forecasts generated daily for all metrics, capacity reports logged, CSV files exported for Grafana, and alerts sent for critical capacity warnings.
On failure: Verify that the scheduler process runs continuously (use systemd or supervisor), check Prometheus connectivity, ensure there is enough disk space for forecast exports, implement retry logic for transient failures, and set up monitoring for the scheduler itself.
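Retry logic for transient failures can be as small as a wrapper with exponential backoff. The following is a sketch of one possible shape, not the scheduler implementation from EXAMPLES.md. `with_retries` and `flaky_query` are hypothetical, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time


def with_retries(task, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task(); on failure wait base_delay * 2**n seconds, then retry."""
    for attempt in range(attempts):
        try:
            return task()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Hypothetical flaky call standing in for a Prometheus query
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


delays = []  # records the waits instead of actually sleeping
result = with_retries(flaky_query, attempts=3, base_delay=1.0, sleep=delays.append)
```

In the scheduler, a wrapper like this would go around the per-metric work inside the loop, so that one unreachable query does not abort the forecasts for the remaining metrics.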
Validation
- Historical data loaded with 90+ days of continuous metrics
- Prophet forecast captures daily/weekly seasonality in components plot
- Forecast confidence intervals contain 85-95% of actual values in validation
- Capacity exhaustion dates calculated correctly for known scenarios
- ARIMA model residuals appear as white noise in diagnostic plots
- Grafana annotations appear at predicted warning/exhaustion dates
- Automated forecasting runs daily without manual intervention
- Forecast accuracy (MAPE) < 15% on validation set
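The interval check in the list above amounts to measuring empirical coverage: the share of held-out actual values that fall between the forecast's lower and upper bounds. A minimal sketch, with `interval_coverage` and the ten sample points as illustrative assumptions:

```python
def interval_coverage(actual, lower, upper):
    """Fraction of actual values that fall inside [lower, upper] at each step."""
    if not (len(actual) == len(lower) == len(upper)) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    hits = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return hits / len(actual)


# Hypothetical validation window of ten points; the ninth is a miss
actual = [50, 52, 55, 61, 58, 60, 64, 66, 90, 70]
lower = [45, 47, 50, 52, 53, 55, 58, 60, 62, 64]
upper = [55, 57, 60, 62, 63, 65, 68, 70, 72, 74]
coverage = interval_coverage(actual, lower, upper)
well_calibrated = 0.85 <= coverage <= 0.95
```

Coverage far below the target range suggests intervals that are too narrow, while coverage near 100% suggests intervals too wide to be useful for planning.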
Common Pitfalls
- Insufficient historical data: Need 3-12 months for reliable seasonality detection; avoid forecasting with <60 days
- Ignoring known events: Holidays, deployments, marketing campaigns skew forecasts; add as external regressors or holidays
- Overconfidence in long-term forecasts: Accuracy degrades beyond 30-90 days; use as directional guidance, not exact predictions
- Static capacity limits: Infrastructure changes over time; update capacity_limit when adding resources
- Forecasting anomalies: Outliers in training data propagate to forecast; clean data or use robust methods
- Not updating models: Forecasts stale after system changes; retrain weekly or after significant architecture changes
- Ignoring confidence intervals: Point forecasts misleading; always use lower/upper bounds for planning
- Wrong seasonality period: Match the period to the data frequency (a daily cycle for hourly data, a weekly cycle for daily data); a mismatch causes poor forecasts
Related Skills
- detect-anomalies-aiops: Anomaly detection complements forecasting for proactive monitoring
- plan-capacity: Infrastructure capacity planning workflows
- build-grafana-dashboards: Visualize forecasts and capacity trends