Babysitter aiml-validation-framework
AI/ML medical device validation skill implementing FDA's GMLP principles
install
source · Clone the upstream repo
git clone https://github.com/a5c-ai/babysitter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/domains/science/biomedical-engineering/skills/aiml-validation-framework" ~/.claude/skills/a5c-ai-babysitter-aiml-validation-framework && rm -rf "$T"
manifest:
library/specializations/domains/science/biomedical-engineering/skills/aiml-validation-framework/SKILL.mdsource content
AI/ML Validation Framework Skill
Purpose
The AI/ML Validation Framework Skill supports validation of AI/ML-enabled medical devices per FDA Good Machine Learning Practice (GMLP) principles, addressing data quality, model performance, and predetermined change control.
Capabilities
- Training data quality assessment
- Ground truth labeling validation
- Model performance metrics calculation (AUC, sensitivity, specificity)
- Subgroup performance analysis
- Bias and fairness evaluation
- Predetermined change control plan (PCCP) templates
- Clinical validation study design
- Locked algorithm vs. adaptive documentation
- Model explainability documentation
- Performance monitoring planning
- Real-world performance tracking
Usage Guidelines
When to Use
- Validating AI/ML algorithms
- Assessing training data quality
- Planning clinical validation studies
- Preparing FDA AI/ML submissions
Prerequisites
- Algorithm development complete
- Training/test datasets curated
- Ground truth established
- Intended use clearly defined
Best Practices
- Document data management practices
- Validate on diverse populations
- Plan for performance monitoring
- Consider predetermined change control
Process Integration
This skill integrates with the following processes:
- AI/ML Medical Device Development
- Software Verification and Validation
- Clinical Evaluation Report Development
- Post-Market Surveillance System Implementation
Dependencies
- FDA AI/ML guidance
- GMLP principles
- Fairness toolkits (AIF360, Fairlearn)
- Statistical analysis tools
- Clinical study resources
Configuration
aiml-validation-framework: algorithm-types: - locked - adaptive - continuously-learning performance-metrics: - AUC - sensitivity - specificity - PPV - NPV subgroup-categories: - age - sex - race - disease-severity
Output Artifacts
- Data management documentation
- Algorithm description documents
- Performance reports
- Bias/fairness assessments
- PCCP documents
- Clinical validation protocols
- Monitoring plans
- FDA submission sections
Quality Criteria
- Training data quality documented
- Ground truth methodology validated
- Performance meets clinical requirements
- Subgroup performance acceptable
- Bias assessments completed
- PCCP appropriate for algorithm type