Claude-skill-registry data-cleaning

Data cleaning, preprocessing, and quality assurance techniques

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/data-cleaning" ~/.claude/skills/majiayu000-claude-skill-registry-data-cleaning && rm -rf "$T"
manifest: skills/data/data-cleaning/SKILL.md

Data Cleaning Skill

Overview

Master data cleaning and preprocessing techniques essential for reliable analytics.

Topics Covered

  • Missing value handling (imputation, deletion)
  • Outlier detection and treatment
  • Data type conversion and validation
  • Duplicate identification and removal
  • String cleaning and normalization
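As an illustration of how several of these topics fit together, here is a minimal sketch using only the Python standard library. The record layout (`name`, `age`), the helper names, and the choice of median imputation are assumptions made for this example, not something the skill prescribes.

```python
import math
import statistics


def normalize_text(value):
    """Collapse runs of whitespace and lowercase; blanks become None."""
    if value is None:
        return None
    cleaned = " ".join(str(value).split()).lower()
    return cleaned or None


def to_float(value):
    """Convert to a finite float, or None when the value cannot be parsed."""
    try:
        number = float(str(value).strip())
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None


def clean_records(records):
    """Normalize strings, coerce types, drop duplicates, impute missing ages."""
    seen = set()
    cleaned = []
    for rec in records:
        name = normalize_text(rec.get("name"))
        age = to_float(rec.get("age"))
        key = (name, age)
        if key in seen:  # exact duplicate after normalization
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age})

    # Median imputation (an assumed strategy; deletion is the alternative).
    known = [r["age"] for r in cleaned if r["age"] is not None]
    fill = statistics.median(known) if known else None
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill
    return cleaned


# Hypothetical sample data for illustration only.
raw = [
    {"name": "  Alice ", "age": "30"},
    {"name": "alice", "age": "30.0"},  # duplicate once normalized
    {"name": "Bob", "age": "n/a"},     # unparseable, treated as missing
    {"name": "Carol", "age": "40"},
]
result = clean_records(raw)
```

Normalizing before deduplicating matters here: the two "Alice" rows only compare equal once whitespace, case, and numeric format have been made consistent.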

Learning Outcomes

  • Clean messy datasets
  • Handle missing data appropriately
  • Detect and treat outliers
  • Verify data quality with explicit validation checks
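One common way to detect and treat outliers is the interquartile range (IQR) rule. The sketch below assumes that rule with the conventional multiplier of 1.5; the function names and the sample values are hypothetical, and other methods (z-scores, domain thresholds) may suit a given dataset better.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Detection: list the values that fall outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if v < low or v > high]


def clip_outliers(values, k=1.5):
    """Treatment: pull values outside the fences back to the nearest fence."""
    low, high = iqr_bounds(values, k)
    return [min(max(v, low), high) for v in values]


# Hypothetical measurements with one obviously extreme reading.
readings = [10, 12, 11, 13, 12, 11, 100]
outliers = find_outliers(readings)
clipped = clip_outliers(readings)
```

Clipping keeps the row count unchanged, which is useful when other columns in the same record are still trustworthy; dropping the row is the alternative when the whole record is suspect.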

Error Handling

Error Type             | Cause                      | Recovery
Memory error           | Dataset too large          | Use chunking or sampling
Type conversion failed | Invalid data format        | Apply preprocessing first
Encoding issues        | Wrong character encoding   | Detect and specify encoding
Validation failure     | Data doesn't meet schema   | Review and adjust validation rules
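Two of the recoveries above, chunking and specifying an encoding, can be sketched with the standard library alone. This is an illustrative sketch, not the skill's own implementation: the helper names, the candidate encoding list, and the sample bytes are assumptions. Note that it tries encodings in order rather than performing true statistical detection, and that `latin-1` accepts any byte sequence, so it acts as a last resort that may decode to the wrong characters.

```python
import csv
import io


def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Try each candidate encoding in order; return (text, encoding_used)."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the input")


def iter_chunks(rows, size):
    """Yield lists of at most `size` rows so the full dataset is never held at once."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # final partial chunk
        yield chunk


# Hypothetical input: CSV bytes written as cp1252, which is not valid UTF-8.
raw_bytes = "name,city\nZoë,Köln\nAna,Lima\nLi,Xi'an\n".encode("cp1252")
text, used = decode_with_fallback(raw_bytes)

# Process the rows two at a time instead of loading them all.
reader = csv.DictReader(io.StringIO(text))
chunk_sizes = [len(chunk) for chunk in iter_chunks(reader, 2)]
```

Because `iter_chunks` is a generator over any iterable, the same pattern applies when the rows come from a file opened with an explicit `encoding=` argument rather than from an in-memory string.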

Related Skills

  • programming (for automation)
  • foundations (for data quality concepts)
  • databases-sql (for SQL-based cleaning)