# claude-skill-registry-data · ml-project-lifecycle

Plan ML projects using CRISP-DM, TDSP, and MLOps methodologies with proper phase gates and deliverables.

## Install

Source · Clone the upstream repo:

```bash
git clone https://github.com/majiayu000/claude-skill-registry-data
```

Claude Code · Install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry-data "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/ml-project-lifecycle" ~/.claude/skills/majiayu000-claude-skill-registry-data-ml-project-lifecycle && rm -rf "$T"
```

Manifest: `data/ml-project-lifecycle/SKILL.md` — source content below.
# ML Project Lifecycle Planning

## When to Use This Skill

Use this skill when:

- ML project lifecycle tasks — planning ML projects using CRISP-DM, TDSP, and MLOps methodologies with proper phase gates and deliverables
- Planning or design — you need guidance on ML project lifecycle approaches
- Best practices — you want to follow established patterns and standards
## Overview
ML project lifecycle methodologies provide structured approaches for planning, executing, and deploying machine learning systems with appropriate governance and quality gates.
## CRISP-DM Methodology

### Six Phases

```
1. Business Understanding
          │
          ▼
2. Data Understanding ◄──────────────┐
          │                          │
          ▼                          │
3. Data Preparation                  │
          │                          │
          ▼                          │
4. Modeling                          │
          │                          │
          ▼                          │
5. Evaluation ◄── Go/No-Go Decision  │
          │                          │
          ├──── No-Go: iterate ──────┘
          ▼  Go
6. Deployment
```
### Phase Details
| Phase | Key Activities | Deliverables |
|---|---|---|
| Business Understanding | Define objectives, success criteria | Business requirements doc |
| Data Understanding | Explore, describe, verify data | Data quality report |
| Data Preparation | Clean, transform, feature engineer | Training datasets |
| Modeling | Select algorithms, train, tune | Model artifacts, metrics |
| Evaluation | Assess model, review process | Evaluation report |
| Deployment | Deploy, monitor, maintain | Production system |
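The phase-gate discipline in the table can be sketched in code. A minimal Python sketch, using the phase names and deliverables from the table above; the `ready_for` helper and its signature are illustrative, not part of CRISP-DM itself:

```python
# Minimal CRISP-DM phase-gate sketch: a phase may start only when every
# earlier phase's deliverable exists (i.e. its gate has been passed).
CRISP_DM_PHASES = [
    ("Business Understanding", "Business requirements doc"),
    ("Data Understanding", "Data quality report"),
    ("Data Preparation", "Training datasets"),
    ("Modeling", "Model artifacts, metrics"),
    ("Evaluation", "Evaluation report"),
    ("Deployment", "Production system"),
]

def ready_for(phase: str, completed_deliverables: set) -> bool:
    """True if every deliverable of every earlier phase is done."""
    for name, deliverable in CRISP_DM_PHASES:
        if name == phase:
            return True
        if deliverable not in completed_deliverables:
            return False
    raise ValueError(f"unknown phase: {phase}")

done = {"Business requirements doc", "Data quality report"}
print(ready_for("Data Preparation", done))  # True
print(ready_for("Modeling", done))          # False: no training datasets yet
```

In practice the "gate" also includes sign-off, but expressing deliverables as data makes the go/no-go checks automatable.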
## MLOps Maturity Levels

### Level Assessment
| Level | Description | Characteristics |
|---|---|---|
| 0 | Manual | No automation, ad-hoc experiments |
| 1 | ML Pipeline | Automated training, manual deployment |
| 2 | CI/CD Pipeline | Automated training and deployment |
| 3 | Full MLOps | Automated monitoring, retraining |
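The level assessment can be reduced to a few capability questions. A sketch in Python; the three boolean flags are illustrative names mapping to the characteristics column above, not a standard assessment API:

```python
# Map automation capabilities to the MLOps maturity levels in the table.
def mlops_maturity(automated_training: bool,
                   automated_deployment: bool,
                   automated_monitoring: bool) -> int:
    if automated_training and automated_deployment and automated_monitoring:
        return 3  # Full MLOps: monitoring can trigger retraining
    if automated_training and automated_deployment:
        return 2  # CI/CD pipeline for training and deployment
    if automated_training:
        return 1  # ML pipeline: automated training, manual deployment
    return 0      # Manual: no automation, ad-hoc experiments

print(mlops_maturity(True, False, False))  # 1
```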
### MLOps Components

```
┌────────────┐    ┌────────────┐    ┌────────────┐
│    Data    │    │  Feature   │    │   Model    │
│  Pipeline  │───►│   Store    │───►│  Training  │
└────────────┘    └────────────┘    └─────┬──────┘
                                          │
┌────────────┐    ┌────────────┐    ┌─────▼──────┐
│ Monitoring │    │   Model    │    │   Model    │
│  & Alerts  │◄───│  Serving   │◄───│  Registry  │
└────────────┘    └────────────┘    └────────────┘
┌────────────────────────────────────────────────┐
│        Experiment Tracking & Versioning        │
└────────────────────────────────────────────────┘
```
## Project Planning Template

```markdown
# ML Project Plan: [Project Name]

## 1. Business Understanding

### Objectives
- Primary goal: [What business problem are we solving?]
- Success metrics: [How will we measure success?]
- Stakeholders: [Who will use/be affected by this?]

### Constraints
- Timeline: [Project duration]
- Resources: [Team, compute, budget]
- Data availability: [What data do we have access to?]

## 2. Data Understanding

### Data Sources
| Source | Type | Volume | Refresh |
|--------|------|--------|---------|
| [Source 1] | [Type] | [Size] | [Frequency] |

### Data Quality Assessment
- Completeness: [% complete]
- Accuracy: [Validation approach]
- Timeliness: [Data freshness]

## 3. Data Preparation

### Feature Engineering Plan
| Feature | Source | Transformation | Rationale |
|---------|--------|----------------|-----------|
| [Feature 1] | [Column] | [Transform] | [Why] |

### Data Pipeline
- Extraction: [Method]
- Transformation: [Tools/approach]
- Loading: [Destination]

## 4. Modeling Approach

### Algorithm Selection
| Algorithm | Pros | Cons | Priority |
|-----------|------|------|----------|
| [Algorithm 1] | [Pros] | [Cons] | [1-3] |

### Experimentation Plan
- Baseline: [Simple model for comparison]
- Iterations: [Planned experiments]
- Hyperparameter strategy: [Grid/random/Bayesian]

## 5. Evaluation Criteria

### Metrics
| Metric | Target | Baseline | Importance |
|--------|--------|----------|------------|
| [Metric 1] | [Target] | [Current] | [High/Med/Low] |

### Go/No-Go Criteria
- Minimum performance: [Threshold]
- Business validation: [Process]

## 6. Deployment Plan

### Serving Architecture
- Inference type: [Real-time/Batch]
- Infrastructure: [Cloud/Edge]
- Scaling: [Strategy]

### Monitoring
- Metrics: [What to track]
- Alerts: [Thresholds]
- Retraining: [Trigger conditions]
```
## Experiment Tracking

### Tracking Requirements
| Category | Items to Track |
|---|---|
| Parameters | Hyperparameters, configs |
| Metrics | Loss, accuracy, custom |
| Artifacts | Models, plots, data |
| Environment | Dependencies, hardware |
| Code | Git commit, branch |
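The five tracking categories above can be captured as one record per run. A minimal Python sketch; the field names are illustrative mappings of the table rows, and tools such as MLflow or Weights & Biases provide equivalent concepts:

```python
from dataclasses import dataclass, field

# One record per experiment run, covering the five tracking categories.
@dataclass
class ExperimentRun:
    run_id: str
    parameters: dict = field(default_factory=dict)   # hyperparameters, configs
    metrics: dict = field(default_factory=dict)      # loss, accuracy, custom
    artifacts: list = field(default_factory=list)    # model files, plots, data
    environment: dict = field(default_factory=dict)  # dependencies, hardware
    code: dict = field(default_factory=dict)         # git commit, branch

run = ExperimentRun(run_id="run-001")
run.parameters["learning_rate"] = 0.01
run.metrics["accuracy"] = 0.91
run.code["git_commit"] = "abc1234"
print(run.run_id, run.metrics)
```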
### MLflow Integration

```csharp
// Semantic Kernel with experiment tracking.
// LogParameters/LogMetrics/LogMetric/LogArtifact are helpers that wrap the
// tracking backend (e.g. the MLflow REST API).
public class ExperimentTracker
{
    public async Task TrackExperiment(
        string experimentName,
        Func<Task<ExperimentResult>> experiment)
    {
        var runId = Guid.NewGuid().ToString();
        var startTime = DateTime.UtcNow;

        try
        {
            // Log parameters
            await LogParameters(runId, new Dictionary<string, object>
            {
                ["model"] = "gpt-4o",
                ["temperature"] = 0.7,
                ["max_tokens"] = 1000
            });

            // Run experiment
            var result = await experiment();

            // Log metrics
            await LogMetrics(runId, new Dictionary<string, double>
            {
                ["accuracy"] = result.Accuracy,
                ["latency_ms"] = result.LatencyMs,
                ["token_cost"] = result.TokenCost
            });

            // Log artifacts
            await LogArtifact(runId, "prompt.txt", result.Prompt);
            await LogArtifact(runId, "response.json", result.Response);
        }
        finally
        {
            // Always record the wall-clock duration, even on failure
            var duration = DateTime.UtcNow - startTime;
            await LogMetric(runId, "duration_seconds", duration.TotalSeconds);
        }
    }
}
```
## Model Registry

### Registry Structure

```markdown
# Model Registry Entry

## Model: customer-churn-predictor

### Versions
| Version | Stage | Created | Metrics | Notes |
|---------|-------|---------|---------|-------|
| v1.0.0 | Production | 2024-01-15 | AUC: 0.85 | Baseline |
| v1.1.0 | Staging | 2024-02-01 | AUC: 0.88 | New features |
| v1.2.0 | Development | 2024-02-15 | AUC: 0.89 | Tuned |

### Promotion Criteria
- [ ] Performance >= baseline + 2%
- [ ] No regression on fairness metrics
- [ ] A/B test shows positive lift
- [ ] Stakeholder approval
```
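The promotion criteria above are partly automatable. A hedged Python sketch, assuming "+2%" means two percentage points on the metric; the function name and parameters are illustrative, and the last two checks would come from the A/B system and a human reviewer:

```python
# Automatable slice of the promotion checklist for a registry entry.
def may_promote(candidate_auc: float,
                production_auc: float,
                fairness_regression: bool,
                ab_test_lift_positive: bool,
                stakeholder_approved: bool) -> bool:
    return (candidate_auc >= production_auc + 0.02  # baseline + 2 points
            and not fairness_regression
            and ab_test_lift_positive
            and stakeholder_approved)

# v1.1.0 (AUC 0.88) vs production v1.0.0 (AUC 0.85):
print(may_promote(0.88, 0.85, False, True, True))  # True
```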
## Validation Checklist

- [ ] Business objectives clearly defined
- [ ] Success metrics identified and measurable
- [ ] Data sources identified and accessible
- [ ] Data quality assessed
- [ ] Feature engineering strategy defined
- [ ] Modeling approach selected
- [ ] Evaluation criteria established
- [ ] Deployment architecture planned
- [ ] Monitoring strategy defined
- [ ] MLOps maturity level targeted
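The checklist can also be run mechanically against a plan document. A small Python sketch; the field names are illustrative mappings of the checklist items, not a defined schema:

```python
# Required plan fields, one per checklist item above.
REQUIRED_FIELDS = [
    "business_objectives", "success_metrics", "data_sources",
    "data_quality_assessment", "feature_engineering_strategy",
    "modeling_approach", "evaluation_criteria", "deployment_architecture",
    "monitoring_strategy", "target_mlops_maturity",
]

def missing_items(plan: dict) -> list:
    """Return checklist fields that are absent or empty in the plan."""
    return [f for f in REQUIRED_FIELDS if not plan.get(f)]

plan = {"business_objectives": "Reduce churn", "success_metrics": "AUC >= 0.85"}
print(missing_items(plan)[:2])  # ['data_sources', 'data_quality_assessment']
```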
## Integration Points

Inputs from:

- Business requirements → Success criteria
- Data architecture → Data sources
- Compliance planning → Regulatory requirements

Outputs to:

- model-selection skill → Algorithm choices
- ai-safety-planning skill → Safety requirements
- token-budgeting skill → Cost estimation