# claude-skill-registry-data · ml-project-lifecycle

Plan ML projects using CRISP-DM, TDSP, and MLOps methodologies with proper phase gates and deliverables.

## Install

Source · Clone the upstream repo:

```bash
git clone https://github.com/majiayu000/claude-skill-registry-data
```

Claude Code · Install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry-data "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/ml-project-lifecycle" ~/.claude/skills/majiayu000-claude-skill-registry-data-ml-project-lifecycle && rm -rf "$T"
```

Manifest: `data/ml-project-lifecycle/SKILL.md` — source content below.
# ML Project Lifecycle Planning

## When to Use This Skill

Use this skill when:

- ML project lifecycle tasks — planning ML projects using CRISP-DM, TDSP, and MLOps methodologies with proper phase gates and deliverables
- Planning or design — you need guidance on ML project lifecycle approaches
- Best practices — you want to follow established patterns and standards
## Overview
ML project lifecycle methodologies provide structured approaches for planning, executing, and deploying machine learning systems with appropriate governance and quality gates.
## CRISP-DM Methodology

### Six Phases

```
1. Business Understanding
          │
          ▼
2. Data Understanding ◄──────────────┐
          │                          │
          ▼                          │
3. Data Preparation                  │
          │                          │
          ▼                          │
4. Modeling                          │
          │                          │
          ▼                          │
5. Evaluation ◄── Go/No-Go Decision  │
          │                          │
          ├──── No-Go: iterate ──────┘
          ▼  Go
6. Deployment
```
### Phase Details
| Phase | Key Activities | Deliverables |
|---|---|---|
| Business Understanding | Define objectives, success criteria | Business requirements doc |
| Data Understanding | Explore, describe, verify data | Data quality report |
| Data Preparation | Clean, transform, feature engineer | Training datasets |
| Modeling | Select algorithms, train, tune | Model artifacts, metrics |
| Evaluation | Assess model, review process | Evaluation report |
| Deployment | Deploy, monitor, maintain | Production system |
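The phase-gate discipline in the table can be sketched in code. A minimal Python sketch, using the phase names and deliverables from the table above; the `ready_for` helper and its signature are illustrative, not part of CRISP-DM itself:

```python
# Minimal CRISP-DM phase-gate sketch: a phase may start only when every
# earlier phase's deliverable exists (i.e. its gate has been passed).
CRISP_DM_PHASES = [
    ("Business Understanding", "Business requirements doc"),
    ("Data Understanding", "Data quality report"),
    ("Data Preparation", "Training datasets"),
    ("Modeling", "Model artifacts, metrics"),
    ("Evaluation", "Evaluation report"),
    ("Deployment", "Production system"),
]

def ready_for(phase: str, completed_deliverables: set) -> bool:
    """True if every deliverable of every earlier phase is done."""
    for name, deliverable in CRISP_DM_PHASES:
        if name == phase:
            return True
        if deliverable not in completed_deliverables:
            return False
    raise ValueError(f"unknown phase: {phase}")

done = {"Business requirements doc", "Data quality report"}
print(ready_for("Data Preparation", done))  # True
print(ready_for("Modeling", done))          # False: no training datasets yet
```

In practice the "gate" also includes sign-off, but expressing deliverables as data makes the go/no-go checks automatable.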
## MLOps Maturity Levels

### Level Assessment
| Level | Description | Characteristics |
|---|---|---|
| 0 | Manual | No automation, ad-hoc experiments |
| 1 | ML Pipeline | Automated training, manual deployment |
| 2 | CI/CD Pipeline | Automated training and deployment |
| 3 | Full MLOps | Automated monitoring, retraining |
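The level assessment can be reduced to a few capability questions. A sketch in Python; the three boolean flags are illustrative names mapping to the characteristics column above, not a standard assessment API:

```python
# Map automation capabilities to the MLOps maturity levels in the table.
def mlops_maturity(automated_training: bool,
                   automated_deployment: bool,
                   automated_monitoring: bool) -> int:
    if automated_training and automated_deployment and automated_monitoring:
        return 3  # Full MLOps: monitoring can trigger retraining
    if automated_training and automated_deployment:
        return 2  # CI/CD pipeline for training and deployment
    if automated_training:
        return 1  # ML pipeline: automated training, manual deployment
    return 0      # Manual: no automation, ad-hoc experiments

print(mlops_maturity(True, False, False))  # 1
```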
### MLOps Components

```
┌────────────┐    ┌────────────┐    ┌────────────┐
│    Data    │    │  Feature   │    │   Model    │
│  Pipeline  │───►│   Store    │───►│  Training  │
└────────────┘    └────────────┘    └─────┬──────┘
                                          │
┌────────────┐    ┌────────────┐    ┌─────▼──────┐
│ Monitoring │    │   Model    │    │   Model    │
│  & Alerts  │◄───│  Serving   │◄───│  Registry  │
└────────────┘    └────────────┘    └────────────┘
┌────────────────────────────────────────────────┐
│        Experiment Tracking & Versioning        │
└────────────────────────────────────────────────┘
```
## Project Planning Template

```markdown
# ML Project Plan: [Project Name]

## 1. Business Understanding

### Objectives
- Primary goal: [What business problem are we solving?]
- Success metrics: [How will we measure success?]
- Stakeholders: [Who will use/be affected by this?]

### Constraints
- Timeline: [Project duration]
- Resources: [Team, compute, budget]
- Data availability: [What data do we have access to?]

## 2. Data Understanding

### Data Sources
| Source | Type | Volume | Refresh |
|--------|------|--------|---------|
| [Source 1] | [Type] | [Size] | [Frequency] |

### Data Quality Assessment
- Completeness: [% complete]
- Accuracy: [Validation approach]
- Timeliness: [Data freshness]

## 3. Data Preparation

### Feature Engineering Plan
| Feature | Source | Transformation | Rationale |
|---------|--------|----------------|-----------|
| [Feature 1] | [Column] | [Transform] | [Why] |

### Data Pipeline
- Extraction: [Method]
- Transformation: [Tools/approach]
- Loading: [Destination]

## 4. Modeling Approach

### Algorithm Selection
| Algorithm | Pros | Cons | Priority |
|-----------|------|------|----------|
| [Algorithm 1] | [Pros] | [Cons] | [1-3] |

### Experimentation Plan
- Baseline: [Simple model for comparison]
- Iterations: [Planned experiments]
- Hyperparameter strategy: [Grid/random/Bayesian]

## 5. Evaluation Criteria

### Metrics
| Metric | Target | Baseline | Importance |
|--------|--------|----------|------------|
| [Metric 1] | [Target] | [Current] | [High/Med/Low] |

### Go/No-Go Criteria
- Minimum performance: [Threshold]
- Business validation: [Process]

## 6. Deployment Plan

### Serving Architecture
- Inference type: [Real-time/Batch]
- Infrastructure: [Cloud/Edge]
- Scaling: [Strategy]

### Monitoring
- Metrics: [What to track]
- Alerts: [Thresholds]
- Retraining: [Trigger conditions]
```
## Experiment Tracking

### Tracking Requirements
| Category | Items to Track |
|---|---|
| Parameters | Hyperparameters, configs |
| Metrics | Loss, accuracy, custom |
| Artifacts | Models, plots, data |
| Environment | Dependencies, hardware |
| Code | Git commit, branch |
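The five tracking categories above can be captured as one record per run. A minimal Python sketch; the field names are illustrative mappings of the table rows, and tools such as MLflow or Weights & Biases provide equivalent concepts:

```python
from dataclasses import dataclass, field

# One record per experiment run, covering the five tracking categories.
@dataclass
class ExperimentRun:
    run_id: str
    parameters: dict = field(default_factory=dict)   # hyperparameters, configs
    metrics: dict = field(default_factory=dict)      # loss, accuracy, custom
    artifacts: list = field(default_factory=list)    # model files, plots, data
    environment: dict = field(default_factory=dict)  # dependencies, hardware
    code: dict = field(default_factory=dict)         # git commit, branch

run = ExperimentRun(run_id="run-001")
run.parameters["learning_rate"] = 0.01
run.metrics["accuracy"] = 0.91
run.code["git_commit"] = "abc1234"
print(run.run_id, run.metrics)
```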
### MLflow Integration

```csharp
// Semantic Kernel with experiment tracking.
// LogParameters/LogMetrics/LogMetric/LogArtifact are helpers that wrap the
// tracking backend (e.g. the MLflow REST API).
public class ExperimentTracker
{
    public async Task TrackExperiment(
        string experimentName,
        Func<Task<ExperimentResult>> experiment)
    {
        var runId = Guid.NewGuid().ToString();
        var startTime = DateTime.UtcNow;

        try
        {
            // Log parameters
            await LogParameters(runId, new Dictionary<string, object>
            {
                ["model"] = "gpt-4o",
                ["temperature"] = 0.7,
                ["max_tokens"] = 1000
            });

            // Run experiment
            var result = await experiment();

            // Log metrics
            await LogMetrics(runId, new Dictionary<string, double>
            {
                ["accuracy"] = result.Accuracy,
                ["latency_ms"] = result.LatencyMs,
                ["token_cost"] = result.TokenCost
            });

            // Log artifacts
            await LogArtifact(runId, "prompt.txt", result.Prompt);
            await LogArtifact(runId, "response.json", result.Response);
        }
        finally
        {
            // Always record the wall-clock duration, even on failure
            var duration = DateTime.UtcNow - startTime;
            await LogMetric(runId, "duration_seconds", duration.TotalSeconds);
        }
    }
}
```
## Model Registry

### Registry Structure

```markdown
# Model Registry Entry

## Model: customer-churn-predictor

### Versions
| Version | Stage | Created | Metrics | Notes |
|---------|-------|---------|---------|-------|
| v1.0.0 | Production | 2024-01-15 | AUC: 0.85 | Baseline |
| v1.1.0 | Staging | 2024-02-01 | AUC: 0.88 | New features |
| v1.2.0 | Development | 2024-02-15 | AUC: 0.89 | Tuned |

### Promotion Criteria
- [ ] Performance >= baseline + 2%
- [ ] No regression on fairness metrics
- [ ] A/B test shows positive lift
- [ ] Stakeholder approval
```
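The promotion criteria above are partly automatable. A hedged Python sketch, assuming "+2%" means two percentage points on the metric; the function name and parameters are illustrative, and the last two checks would come from the A/B system and a human reviewer:

```python
# Automatable slice of the promotion checklist for a registry entry.
def may_promote(candidate_auc: float,
                production_auc: float,
                fairness_regression: bool,
                ab_test_lift_positive: bool,
                stakeholder_approved: bool) -> bool:
    return (candidate_auc >= production_auc + 0.02  # baseline + 2 points
            and not fairness_regression
            and ab_test_lift_positive
            and stakeholder_approved)

# v1.1.0 (AUC 0.88) vs production v1.0.0 (AUC 0.85):
print(may_promote(0.88, 0.85, False, True, True))  # True
```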
## Validation Checklist

- [ ] Business objectives clearly defined
- [ ] Success metrics identified and measurable
- [ ] Data sources identified and accessible
- [ ] Data quality assessed
- [ ] Feature engineering strategy defined
- [ ] Modeling approach selected
- [ ] Evaluation criteria established
- [ ] Deployment architecture planned
- [ ] Monitoring strategy defined
- [ ] MLOps maturity level targeted
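The checklist can also be run mechanically against a plan document. A small Python sketch; the field names are illustrative mappings of the checklist items, not a defined schema:

```python
# Required plan fields, one per checklist item above.
REQUIRED_FIELDS = [
    "business_objectives", "success_metrics", "data_sources",
    "data_quality_assessment", "feature_engineering_strategy",
    "modeling_approach", "evaluation_criteria", "deployment_architecture",
    "monitoring_strategy", "target_mlops_maturity",
]

def missing_items(plan: dict) -> list:
    """Return checklist fields that are absent or empty in the plan."""
    return [f for f in REQUIRED_FIELDS if not plan.get(f)]

plan = {"business_objectives": "Reduce churn", "success_metrics": "AUC >= 0.85"}
print(missing_items(plan)[:2])  # ['data_sources', 'data_quality_assessment']
```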
## Integration Points

Inputs from:

- Business requirements → Success criteria
- Data architecture → Data sources
- Compliance planning → Regulatory requirements

Outputs to:

- model-selection skill → Algorithm choices
- ai-safety-planning skill → Safety requirements
- token-budgeting skill → Cost estimation