Claude-skill-registry data-visualizer
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/data-visualizer" ~/.claude/skills/majiayu000-claude-skill-registry-data-visualizer && rm -rf "$T"
manifest:
skills/data/data-visualizer/SKILL.md
Data Visualizer
Overview
Automated visualization generation for exploratory data analysis, model performance reporting, and stakeholder communication. Creates publication-quality plots, interactive dashboards, and business-friendly reports, all integrated with SpecWeave's increment workflow.
Visualization Categories
1. Exploratory Data Analysis (EDA)
Automated EDA Report:
    from specweave import EDAVisualizer

    visualizer = EDAVisualizer(increment="0042")

    # Generates comprehensive EDA report
    report = visualizer.generate_eda_report(df)

    # Creates:
    # - Dataset overview (rows, columns, memory, missing values)
    # - Numerical feature distributions (histograms + KDE)
    # - Categorical feature counts (bar charts)
    # - Correlation heatmap
    # - Missing value pattern
    # - Outlier detection plots
    # - Feature relationships (pairplot for top features)
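The "dataset overview" item in the report can be illustrated without any plotting library. The following is a minimal sketch using only the Python standard library; `dataset_overview` and the sample records are hypothetical and are not part of the SpecWeave API.

```python
def dataset_overview(rows):
    """Summarize a list of dict records: row count, column names, missing values."""
    columns = sorted({key for row in rows for key in row})
    missing = {
        col: sum(1 for row in rows if row.get(col) is None)
        for col in columns
    }
    return {"rows": len(rows), "columns": columns, "missing": missing}


# Illustrative records; None marks a missing value.
records = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 45, "income": None},
    {"age": 29, "income": 48000},
]
overview = dataset_overview(records)
print(overview)
```

A real report would presumably compute the same counts from a DataFrame; the dict-based version only shows what the overview contains.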
Individual EDA Plots:
    # Distribution plots
    visualizer.plot_distribution(
        data=df['age'],
        title="Age Distribution",
        bins=30
    )

    # Correlation heatmap
    visualizer.plot_correlation_heatmap(
        data=df[numerical_columns],
        method='pearson'  # or 'spearman', 'kendall'
    )

    # Missing value patterns
    visualizer.plot_missing_values(df)

    # Outlier detection (boxplots)
    visualizer.plot_outliers(df[numerical_columns])
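A correlation heatmap is a colored rendering of a correlation matrix. As a rough sketch of what `method='pearson'` implies, the matrix can be computed by hand with the standard library. The helper names and sample columns below are assumptions made for illustration, not SpecWeave functions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict of pairwise Pearson correlations, keyed by column name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Illustrative numeric columns (values are made up).
data = {
    "age": [25, 32, 47, 51, 62],
    "income": [30, 42, 58, 61, 75],
    "debt": [40, 33, 21, 18, 5],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

Each cell of the heatmap corresponds to one entry of `matrix`; the diagonal is always 1.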
2. Model Performance Visualizations
Classification Performance:
    from specweave import ClassificationVisualizer

    viz = ClassificationVisualizer(increment="0042")

    # Confusion matrix
    viz.plot_confusion_matrix(
        y_true=y_test,
        y_pred=y_pred,
        classes=['Negative', 'Positive']
    )

    # ROC curve
    viz.plot_roc_curve(
        y_true=y_test,
        y_proba=y_proba
    )

    # Precision-Recall curve
    viz.plot_precision_recall_curve(
        y_true=y_test,
        y_proba=y_proba
    )

    # Learning curves (train vs val)
    viz.plot_learning_curve(
        train_scores=train_scores,
        val_scores=val_scores
    )

    # Calibration curve (are probabilities well-calibrated?)
    viz.plot_calibration_curve(
        y_true=y_test,
        y_proba=y_proba
    )
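The confusion matrix and ROC curve are both derived from `y_true` and the predicted scores. The sketch below computes the underlying numbers with the standard library only; the function names and the toy labels are hypothetical, and the 0.5 decision threshold is an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }


def roc_points(y_true, y_proba):
    """Sweep thresholds from high to low and return (fpr, tpr) points."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_proba), reverse=True):
        tp = sum(1 for t, s in zip(y_true, y_proba) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_proba) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Toy data for illustration only.
y_test = [0, 0, 1, 1, 0, 1, 0, 1]
y_proba = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]
y_pred = [1 if p >= 0.5 else 0 for p in y_proba]

counts = confusion_counts(y_test, y_pred)
area = auc(roc_points(y_test, y_proba))
print(counts, area)
```

Plotting the `(fpr, tpr)` points in order gives the ROC curve; the four counts fill the 2x2 confusion matrix.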
Regression Performance:
    from specweave import RegressionVisualizer

    viz = RegressionVisualizer(increment="0042")

    # Predicted vs Actual
    viz.plot_predictions(
        y_true=y_test,
        y_pred=y_pred
    )

    # Residual plot
    viz.plot_residuals(
        y_true=y_test,
        y_pred=y_pred
    )

    # Residual distribution (should be normal)
    viz.plot_residual_distribution(
        residuals=y_test - y_pred
    )

    # Error by feature value
    viz.plot_error_analysis(
        y_true=y_test,
        y_pred=y_pred,
        features=X_test
    )
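The residual plots above all start from `y_test - y_pred`. A small sketch of the quantities usually read off those plots follows, assuming plain Python lists rather than arrays; `residual_summary` is a hypothetical helper, not a SpecWeave function.

```python
import math


def residual_summary(y_true, y_pred):
    """Residuals (actual minus predicted) with mean, MAE and RMSE."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    return {
        "residuals": residuals,
        "mean_residual": sum(residuals) / n,  # far from 0 suggests systematic bias
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(sum(r * r for r in residuals) / n),
    }


# Toy values for illustration only.
y_test = [3.0, 5.0, 7.5, 10.0]
y_pred = [2.5, 5.5, 7.0, 11.0]
summary = residual_summary(y_test, y_pred)
print(summary)
```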
3. Feature Analysis Visualizations
Feature Importance:
    from specweave import FeatureVisualizer

    viz = FeatureVisualizer(increment="0042")

    # Feature importance (bar chart)
    viz.plot_feature_importance(
        feature_names=feature_names,
        importances=model.feature_importances_,
        top_n=20
    )

    # SHAP summary plot
    viz.plot_shap_summary(
        shap_values=shap_values,
        features=X_test
    )

    # Partial dependence plots
    viz.plot_partial_dependence(
        model=model,
        features=['age', 'income'],
        X=X_train
    )

    # Feature interaction
    viz.plot_feature_interaction(
        model=model,
        features=('age', 'income'),
        X=X_train
    )
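The `top_n` argument implies ranking features by importance before drawing the bar chart. One way to sketch that step, with a text-only bar chart standing in for the plot, is shown here. Both helpers are hypothetical; the feature names and importances are taken from the living-docs example in this document.

```python
def top_features(feature_names, importances, top_n=20):
    """Return the top_n (name, importance) pairs, largest importance first."""
    ranked = sorted(zip(feature_names, importances),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


def text_bar_chart(ranked, width=40):
    """Render one line per feature, bar length scaled to the largest importance."""
    largest = ranked[0][1]
    lines = []
    for name, value in ranked:
        bar = "#" * round(width * value / largest)
        lines.append(f"{name:<30} {bar} {value:.2f}")
    return lines


feature_names = ["velocity_24h", "amount_vs_user_average",
                 "merchant_risk_score", "days_since_last_purchase"]
importances = [0.08, 0.18, 0.10, 0.12]

ranked = top_features(feature_names, importances, top_n=3)
chart = text_bar_chart(ranked)
print("\n".join(chart))
```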
4. Time Series Visualizations
Time Series Plots:
    from specweave import TimeSeriesVisualizer

    viz = TimeSeriesVisualizer(increment="0042")

    # Time series with trend
    viz.plot_timeseries(
        data=sales_data,
        show_trend=True
    )

    # Seasonal decomposition
    viz.plot_seasonal_decomposition(
        data=sales_data,
        period=12  # Monthly seasonality
    )

    # Autocorrelation (ACF, PACF)
    viz.plot_autocorrelation(data=sales_data)

    # Forecast with confidence intervals
    viz.plot_forecast(
        actual=test_data,
        forecast=forecast,
        confidence_intervals=(0.80, 0.95)
    )
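An ACF plot shows the sample autocorrelation at each lag. As a minimal sketch of the values behind such a plot, the standard estimator can be written in a few lines. The function name and the toy series (which alternates with period 2) are assumptions for illustration.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    acf = []
    for lag in range(max_lag + 1):
        num = sum((series[t] - mean) * (series[t + lag] - mean)
                  for t in range(n - lag))
        acf.append(num / denom)
    return acf


# Toy series that repeats every 2 steps.
sales = [10, 20, 10, 20, 10, 20, 10, 20]
acf = autocorrelation(sales, max_lag=2)
print(acf)
```

A strong positive value at a given lag, as at lag 2 here, is the kind of signal that would suggest a seasonal `period` for the decomposition.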
5. Model Comparison Visualizations
Compare Multiple Models:
    from specweave import ModelComparisonVisualizer

    viz = ModelComparisonVisualizer(increment="0042")

    # Compare metrics across models
    viz.plot_model_comparison(
        models=['Baseline', 'XGBoost', 'LightGBM', 'Neural Net'],
        metrics={
            'accuracy': [0.65, 0.87, 0.86, 0.85],
            'roc_auc': [0.70, 0.92, 0.91, 0.90],
            'training_time': [1, 45, 32, 320]
        }
    )

    # ROC curves for multiple models
    viz.plot_roc_curves_comparison(
        models_predictions={
            'XGBoost': (y_test, y_proba_xgb),
            'LightGBM': (y_test, y_proba_lgbm),
            'Neural Net': (y_test, y_proba_nn)
        }
    )
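When reading a comparison chart like this, note that not every metric points the same way: higher accuracy is better, but lower training time is better. The sketch below picks the best model per metric from the same numbers; `best_per_metric` and the `lower_is_better` set are hypothetical and only illustrate the idea.

```python
def best_per_metric(models, metrics, lower_is_better=()):
    """Map each metric name to the model with the best value for it."""
    best = {}
    for name, values in metrics.items():
        pick = min if name in lower_is_better else max
        index = pick(range(len(values)), key=values.__getitem__)
        best[name] = models[index]
    return best


models = ['Baseline', 'XGBoost', 'LightGBM', 'Neural Net']
metrics = {
    'accuracy': [0.65, 0.87, 0.86, 0.85],
    'roc_auc': [0.70, 0.92, 0.91, 0.90],
    'training_time': [1, 45, 32, 320],
}

best = best_per_metric(models, metrics, lower_is_better={'training_time'})
print(best)
```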
Interactive Visualizations
Plotly Integration:
    from specweave import InteractiveVisualizer

    viz = InteractiveVisualizer(increment="0042")

    # Interactive scatter plot (zoom, pan, hover)
    viz.plot_interactive_scatter(
        x=X_test[:, 0],
        y=X_test[:, 1],
        colors=y_pred,
        hover_data=df[['id', 'amount', 'merchant']]
    )

    # Interactive confusion matrix (click for details)
    viz.plot_interactive_confusion_matrix(
        y_true=y_test,
        y_pred=y_pred
    )

    # Interactive feature importance (sortable, filterable)
    viz.plot_interactive_feature_importance(
        feature_names=feature_names,
        importances=importances
    )
Business Reporting
Automated ML Report:
    from specweave import MLReportGenerator

    generator = MLReportGenerator(increment="0042")

    # Generate executive summary report
    report = generator.generate_report(
        model=model,
        test_data=(X_test, y_test),
        business_metrics={
            'false_positive_cost': 5,
            'false_negative_cost': 500
        }
    )

    # Creates:
    # - Executive summary (1 page, non-technical)
    # - Key metrics (accuracy, precision, recall)
    # - Business impact ($$ saved, ROI)
    # - Model performance visualizations
    # - Recommendations
    # - Technical appendix
Report Output (HTML/PDF):
    # Fraud Detection Model - Executive Summary

    ## Key Results
    - **Accuracy**: 87% (target: >85%) ✅
    - **Fraud Detection Rate**: 62% (catching 310 frauds/day)
    - **False Positive Rate**: 38% (190 false alarms/day)

    ## Business Impact
    - **Fraud Prevented**: $155,000/day
    - **Review Cost**: $950/day (190 transactions × $5)
    - **Net Benefit**: $154,050/day ✅
    - **Annual Savings**: $56.2M

    ## Model Performance
    [Confusion Matrix Visualization]
    [ROC Curve]
    [Feature Importance]

    ## Recommendations
    1. ✅ Deploy to production immediately
    2. Monitor fraud patterns weekly
    3. Retrain model monthly with new data
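The business-impact figures in the sample report can be reproduced with simple arithmetic. The sketch below assumes that each caught fraud avoids the `false_negative_cost` of $500 from the `business_metrics` example, and that the annual figure is the daily net benefit times 365. Both are assumptions that happen to match the report's numbers, not a documented formula.

```python
# Daily counts from the sample report.
frauds_caught_per_day = 310
false_alarms_per_day = 190

# Costs from the business_metrics example (assumed to apply per transaction).
false_negative_cost = 500  # loss avoided for each fraud caught
false_positive_cost = 5    # cost of reviewing one false alarm

fraud_prevented = frauds_caught_per_day * false_negative_cost
review_cost = false_alarms_per_day * false_positive_cost
net_benefit = fraud_prevented - review_cost
annual_savings = net_benefit * 365  # assumes the same volume every day

print(f"Fraud prevented: ${fraud_prevented:,}/day")
print(f"Review cost:     ${review_cost:,}/day")
print(f"Net benefit:     ${net_benefit:,}/day")
print(f"Annual savings:  ${annual_savings / 1e6:.1f}M")
```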
Dashboard Creation
Real-Time Dashboard:
    from specweave import DashboardCreator

    creator = DashboardCreator(increment="0042")

    # Create Grafana/Plotly dashboard
    dashboard = creator.create_dashboard(
        title="Model Performance Dashboard",
        panels=[
            {'type': 'metric', 'query': 'prediction_latency_p95'},
            {'type': 'metric', 'query': 'predictions_per_second'},
            {'type': 'timeseries', 'query': 'accuracy_over_time'},
            {'type': 'timeseries', 'query': 'error_rate'},
            {'type': 'heatmap', 'query': 'prediction_distribution'},
            {'type': 'table', 'query': 'recent_anomalies'}
        ]
    )

    # Exports to Grafana JSON or Plotly Dash app
    dashboard.export(format='grafana')
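Exporting a dashboard amounts to serializing the panel list into a JSON document. The following is a rough sketch of that step using only the standard library. The dict layout is an illustrative assumption and does not follow the actual Grafana dashboard JSON model; `build_dashboard` is a hypothetical helper.

```python
import json


def build_dashboard(title, panels):
    """Assemble a plain dict describing the dashboard; ids are assigned in order."""
    return {
        "title": title,
        "panels": [
            {"id": index, "type": panel["type"], "query": panel["query"]}
            for index, panel in enumerate(panels, start=1)
        ],
    }


panels = [
    {"type": "metric", "query": "prediction_latency_p95"},
    {"type": "timeseries", "query": "accuracy_over_time"},
    {"type": "table", "query": "recent_anomalies"},
]
dashboard = build_dashboard("Model Performance Dashboard", panels)
exported = json.dumps(dashboard, indent=2)
print(exported)
```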
Visualization Best Practices
1. Publication-Quality Plots
    # Set consistent styling
    visualizer.set_style(
        style='seaborn',     # Or 'ggplot', 'fivethirtyeight'
        context='paper',     # Or 'notebook', 'talk', 'poster'
        palette='colorblind' # Accessible colors
    )

    # High-resolution exports
    visualizer.save_figure(
        filename='model_performance.png',
        dpi=300,  # Publication quality
        bbox_inches='tight'
    )
2. Accessible Visualizations
    # Colorblind-friendly palettes
    visualizer.use_colorblind_palette()

    # Add alt text for accessibility
    visualizer.add_alt_text(
        plot=fig,
        description="Confusion matrix showing 87% accuracy"
    )

    # High contrast for presentations
    visualizer.set_high_contrast_mode()
3. Annotation and Context
    # Add reference lines
    viz.add_reference_line(
        y=0.85,  # Target accuracy
        label='Target',
        color='red',
        linestyle='--'
    )

    # Add annotations
    viz.annotate_point(
        x=optimal_threshold,
        y=optimal_f1,
        text='Optimal threshold: 0.47'
    )
Integration with SpecWeave
Automated Visualization in Increments
    # All visualizations auto-saved to increment folder
    visualizer = EDAVisualizer(increment="0042")

    # Creates:
    # .specweave/increments/0042-fraud-detection/
    # ├── visualizations/
    # │   ├── eda/
    # │   │   ├── distributions.png
    # │   │   ├── correlation_heatmap.png
    # │   │   └── missing_values.png
    # │   ├── model_performance/
    # │   │   ├── confusion_matrix.png
    # │   │   ├── roc_curve.png
    # │   │   ├── precision_recall.png
    # │   │   └── learning_curves.png
    # │   ├── feature_analysis/
    # │   │   ├── feature_importance.png
    # │   │   ├── shap_summary.png
    # │   │   └── partial_dependence/
    # │   └── reports/
    # │       ├── executive_summary.html
    # │       └── technical_report.pdf
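The folder tree above can be created with `pathlib`. This is only a sketch of the layout under a temporary root, not how SpecWeave itself creates the folders; `create_visualization_dirs` and `SUBFOLDERS` are hypothetical names.

```python
import tempfile
from pathlib import Path

# Subfolders shown under visualizations/ in the tree above.
SUBFOLDERS = ("eda", "model_performance", "feature_analysis", "reports")


def create_visualization_dirs(root, increment_name):
    """Create the visualizations/ subfolders for one increment and return the base path."""
    base = Path(root) / ".specweave" / "increments" / increment_name / "visualizations"
    for sub in SUBFOLDERS:
        (base / sub).mkdir(parents=True, exist_ok=True)
    return base


# Use a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    base = create_visualization_dirs(tmp, "0042-fraud-detection")
    created = sorted(p.name for p in base.iterdir())
    relative = base.relative_to(tmp).as_posix()

print(relative, created)
```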
Living Docs Integration
/sw:sync-docs update
Updates:
    <!-- .specweave/docs/internal/architecture/ml-model-performance.md -->

    ## Fraud Detection Model Performance (Increment 0042)

    ### Model Accuracy

    ### Key Metrics
    - Accuracy: 87%
    - Precision: 85%
    - Recall: 62%
    - ROC AUC: 0.92

    ### Feature Importance

    Top 5 features:
    1. amount_vs_user_average (0.18)
    2. days_since_last_purchase (0.12)
    3. merchant_risk_score (0.10)
    4. velocity_24h (0.08)
    5. location_distance_from_home (0.07)
Commands
    # Generate EDA report
    /ml:visualize-eda 0042

    # Generate model performance report
    /ml:visualize-performance 0042

    # Create interactive dashboard
    /ml:create-dashboard 0042

    # Export all visualizations
    /ml:export-visualizations 0042 --format png,pdf,html
Advanced Features
1. Automated Report Generation
    # Generate full increment report with all visualizations
    generator = IncrementReportGenerator(increment="0042")
    report = generator.generate_full_report()

    # Includes:
    # - EDA visualizations
    # - Experiment comparisons
    # - Best model performance
    # - Feature importance
    # - Business impact
    # - Deployment readiness
2. Custom Visualization Templates
    # Create reusable templates
    template = VisualizationTemplate(name="fraud_analysis")
    template.add_panel("confusion_matrix")
    template.add_panel("roc_curve")
    template.add_panel("top_fraud_features")
    template.add_panel("fraud_trends_over_time")

    # Apply to any increment
    template.apply(increment="0042")
3. Version Control for Visualizations
    # Track visualization changes across model versions
    viz_tracker = VisualizationTracker(increment="0042")

    # Compare model v1 vs v2 visualizations
    viz_tracker.compare_versions(
        version_1="model-v1",
        version_2="model-v2"
    )

    # Shows: Confusion matrix improved, ROC curve comparison, etc.
Summary
Data visualization is critical for:
- ✅ Exploratory data analysis (understand data before modeling)
- ✅ Model performance communication (stakeholder buy-in)
- ✅ Feature analysis (understand what drives predictions)
- ✅ Business reporting (translate metrics to impact)
- ✅ Model debugging (identify issues visually)
This skill automates visualization generation, ensuring all ML work is visual, accessible, and business-friendly within SpecWeave's increment workflow.