Claude-skill-registry drift-detection-implementation
Implement drift detection features for data quality monitoring including baseline storage, history tracking, thresholds, and validation wrappers
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/drift-detection-implementation" ~/.claude/skills/majiayu000-claude-skill-registry-drift-detection-implementation && rm -rf "$T"
manifest:
skills/data/drift-detection-implementation/SKILL.mdsource content
<!-- BEGIN:compound:skill-managed -->
Purpose
Implement drift detection with baseline storage, history tracking, configurable thresholds, and validation check wrappers.
When To Use
- User asks to implement drift detection features
- Ticket requires KS/PSI tests, baseline storage, history tracking, or validation wrappers
Architecture
Implement in src/vibe_piper/validation/drift_detection.py:
Configuration Types
- DriftThresholds dataclass: warning, critical, psi_warning, psi_critical, ks_significance
- BaselineMetadata dataclass: baseline_id, created_at, sample_size, columns, description
- DriftHistoryEntry dataclass: timestamp, baseline_id, method, drift_score, max_drift_score, drifted_columns, alert_level
Result Types
- DriftResult dataclass: method, drift_score, drifted_columns, p_values, statistics, recommendations, timestamp
- ColumnDriftResult dataclass: column_name, drift_score, p_value, is_significant, baseline_distribution, new_distribution, recommendation
Storage Classes
- BaselineStore class:
- init(storage_dir)
- _baseline_path(baseline_id) -> Path
- add_baseline(baseline_id, data, description) -> BaselineMetadata
- get_baseline(baseline_id, schema=None) -> Sequence[DataRecord]
- get_metadata(baseline_id) -> BaselineMetadata
- list_baselines() -> list[BaselineMetadata]
- delete_baseline(baseline_id)
- JSON file storage with metadata and data list
History Tracking
- DriftHistory class:
- init(storage_dir)
- _history_path(baseline_id) -> Path
- add_entry(result, baseline_id, thresholds) -> DriftHistoryEntry
- get_entries(baseline_id, limit=None) -> list[DriftHistoryEntry]
- get_trend(baseline_id, window=10) -> dict[str, Any]
- clear_history(baseline_id)
- JSONL append-only storage (one line per entry)
Alerting
- check_drift_alert(result, thresholds) -> tuple[bool, str] (should_alert, alert_level)
Validation Check Wrappers
- check_drift_ks(column, baseline, thresholds=None) -> Callable[[Sequence[DataRecord]], ValidationResult]
- check_drift_psi(column, baseline, thresholds=None) -> Callable[[Sequence[DataRecord]], ValidationResult]
- Convert DriftResult to ValidationResult based on alert_level
- Map errors (critical drift), warnings (recommendations, drifted columns)
Dependencies
- scipy (optional) - Import inside functions with TYPE_CHECKING guard
- datetime.utcnow (deprecation warning - consider datetime.now(datetime.UTC))
- json, pathlib, dataclasses
Testing Pattern
- Create test fixtures with sample_schema
- Test BaselineStore: add, get, get_metadata, list, delete operations
- Test DriftHistory: add_entry, get_entries, get_trend, clear_history
- Test thresholds validation
- Test validation wrappers with stable/drifted data
- Test alerting logic
- Use tempfile for BaselineStore/DriftHistory storage
- Aim for 85%+ coverage
Exports
Update src/vibe_piper/validation/init.py to export new classes and functions.
<!-- END:compound:skill-managed -->Manual notes
This section is preserved when the skill is updated. Put human notes, caveats, and exceptions here.