Claude-skill-registry data-cleaning
Data cleaning, preprocessing, and quality assurance techniques
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/data-cleaning" ~/.claude/skills/majiayu000-claude-skill-registry-data-cleaning && rm -rf "$T"
manifest:
skills/data/data-cleaning/SKILL.md
source content
Data Cleaning Skill
Overview
Master data cleaning and preprocessing techniques essential for reliable analytics.
Topics Covered
- Missing value handling (imputation, deletion)
- Outlier detection and treatment
- Data type conversion and validation
- Duplicate identification and removal
- String cleaning and normalization
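The topics above can be sketched in one short pass over a few records. This is a minimal illustration using only the Python standard library, not the skill's own implementation: the helper names (`normalize_text`, `to_float`, `impute_median`, `iqr_outliers`, `drop_duplicates`), the sample records, median imputation, and the 1.5 × IQR rule are all assumptions chosen for the example.

```python
import math
import statistics
import unicodedata


def normalize_text(value):
    """Apply Unicode NFKC, collapse whitespace, and casefold a string."""
    text = unicodedata.normalize("NFKC", value)
    return " ".join(text.split()).casefold()


def to_float(value):
    """Convert a raw value to float; return None if missing or invalid."""
    if value is None:
        return None
    try:
        number = float(str(value).strip())
    except ValueError:
        return None
    # Treat "nan" and "inf" as missing rather than as usable numbers.
    return number if math.isfinite(number) else None


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # "inclusive" treats the data as the full population; on small samples
    # the choice of quantile method changes the fences noticeably.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]


def drop_duplicates(rows, key):
    """Keep the first row seen for each distinct key(row)."""
    seen = set()
    unique = []
    for row in rows:
        marker = key(row)
        if marker not in seen:
            seen.add(marker)
            unique.append(row)
    return unique


# Hypothetical sample records, invented for this example.
raw = [
    {"name": "  Alice ", "age": "34"},
    {"name": "alice", "age": "34"},
    {"name": "Bob", "age": "n/a"},
    {"name": "Carol", "age": "29"},
    {"name": "Dave", "age": "31"},
    {"name": "Eve", "age": "250"},
]

# String cleaning and type conversion first, so duplicates compare equal.
cleaned = [
    {"name": normalize_text(r["name"]), "age": to_float(r["age"])} for r in raw
]
cleaned = drop_duplicates(cleaned, key=lambda r: (r["name"], r["age"]))

# Missing value handling, then outlier detection on the completed column.
ages = impute_median([r["age"] for r in cleaned])
flags = iqr_outliers(ages)
```

Normalizing before deduplicating matters here: `"  Alice "` and `"alice"` only collapse into one record once whitespace and case are removed. Whether to impute, delete, or keep a flagged value depends on the dataset, so the sketch only flags outliers and leaves the treatment open.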
Learning Outcomes
- Clean messy datasets
- Handle missing data appropriately
- Detect and treat outliers
- Ensure data quality
Error Handling
| Error Type | Cause | Recovery |
|---|---|---|
| Memory error | Dataset too large | Use chunking or sampling |
| Type conversion failed | Invalid data format | Clean or coerce invalid values before converting |
| Encoding issues | Wrong character encoding | Detect and specify encoding |
| Validation failure | Data doesn't meet schema | Fix or quarantine the offending records, or adjust the validation rules if they are wrong |
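The recovery strategies in the table can be illustrated with a small standard-library sketch. Everything named here is an assumption for illustration: the helpers `read_text_with_fallback`, `iter_chunks`, and `validate_row`, the candidate encoding list, the `name`/`age` schema, and the sample bytes. A real pipeline would read from an open file rather than an in-memory string.

```python
import csv
import io


def read_text_with_fallback(raw_bytes, encodings=("utf-8", "cp1252")):
    """Try candidate encodings in order; return (text, encoding_used)."""
    for encoding in encodings:
        try:
            return raw_bytes.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort never
    # raises, although the result may not be the text the author intended.
    return raw_bytes.decode("latin-1"), "latin-1"


def iter_chunks(rows, size):
    """Yield lists of at most `size` rows so the full input never sits in memory."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def validate_row(row):
    """Return a list of problems for one row under a hypothetical schema."""
    problems = []
    if not (row.get("name") or "").strip():
        problems.append("name is empty")
    try:
        age = int(row.get("age") or "")
    except (TypeError, ValueError):
        problems.append("age is not an integer")
    else:
        if not 0 <= age <= 120:
            problems.append("age out of range")
    return problems


# Hypothetical input: a CSV export saved as cp1252 rather than UTF-8.
raw_bytes = "name,age\nAna,41\nJos\u00e9,abc\n,30\nLi,27\n".encode("cp1252")
text, used = read_text_with_fallback(raw_bytes)

valid, rejected = [], []
for chunk in iter_chunks(csv.DictReader(io.StringIO(text)), size=2):
    for row in chunk:
        problems = validate_row(row)
        if problems:
            rejected.append((row, problems))  # quarantine with the reasons
        else:
            valid.append(row)
```

Collecting rejected rows together with their reasons, instead of raising on the first failure, makes it easier to judge whether the data or the validation rule needs to change.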
Related Skills
- programming (for automation)
- foundations (for data quality concepts)
- databases-sql (for SQL-based cleaning)