MetaClaw data-validation-first
Use this skill before any data analysis, transformation, or modeling. Always inspect and validate the data before drawing conclusions or writing transformations.
install
source · Clone the upstream repo
git clone https://github.com/aiming-lab/MetaClaw
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/aiming-lab/MetaClaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/memory_data/skills/data-validation-first" ~/.claude/skills/aiming-lab-metaclaw-data-validation-first && rm -rf "$T"
manifest:
memory_data/skills/data-validation-first/SKILL.md
source content
Data Validation First
Before writing any analysis code, understand the data:
# Always run these first
df.shape           # rows x columns
df.dtypes          # column types
df.isnull().sum()  # missing values per column
df.describe()      # statistics for numeric columns
df.head()          # sample rows
Key questions:
- Are there nulls in columns you'll join or filter on?
- Are numeric or date columns stored as strings? (astype for numbers, parse_dates for dates)
- Are there unexpected duplicates (check primary key uniqueness)?
- Does the row count match your expectation from the source?
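The key questions above can be folded into one small report. The following is a minimal sketch, not part of the original skill: the helper name `validation_report`, its parameters, and the `orders` sample data are all hypothetical, and it assumes pandas is installed.

```python
import pandas as pd


def validation_report(df, key_cols, expected_rows=None):
    """Collect answers to the key questions (hypothetical helper)."""
    report = {}
    # Nulls in the columns you plan to join or filter on.
    report["null_keys"] = df[key_cols].isnull().sum().to_dict()
    # String columns whose non-null values all parse as numbers.
    numeric_as_text = []
    for col in df.columns:
        if not pd.api.types.is_string_dtype(df[col]):
            continue
        values = df[col].dropna()
        if len(values) and pd.to_numeric(values, errors="coerce").notna().all():
            numeric_as_text.append(col)
    report["numeric_as_text"] = numeric_as_text
    # Rows that repeat an already-seen key combination.
    report["duplicate_keys"] = int(df.duplicated(subset=key_cols).sum())
    # Row count versus the count reported by the source, if known.
    report["row_count_ok"] = expected_rows is None or len(df) == expected_rows
    return report


# Illustrative data, not from the original skill.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, None],
    "amount": ["10.5", "20", "20", "7"],
})
report = validation_report(orders, key_cols=["order_id"], expected_rows=4)
```

Here `report` flags one null key, one duplicated key, and `amount` as a numeric column stored as text, which is the kind of finding worth resolving before any transformation is written.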
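Anti-pattern: Running .groupby().sum() without first checking for nulls in the groupby key. By default pandas drops rows whose group key is null, so those rows silently vanish from the totals.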
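To make the anti-pattern concrete, here is a small sketch with made-up data (the `sales` frame and its column names are illustrative assumptions, not from the original skill). It counts null keys first, then compares the default groupby with `dropna=False`, which is available in pandas 1.1 and later.

```python
import pandas as pd

# Illustrative data, not from the original skill.
sales = pd.DataFrame({
    "region": ["east", "west", None, "east"],
    "revenue": [100, 200, 50, 25],
})

# Check the key first: how many rows would the default groupby drop?
missing_keys = int(sales["region"].isnull().sum())

# Default behaviour (dropna=True) silently excludes the null-key row.
default_total = int(sales.groupby("region")["revenue"].sum().sum())

# dropna=False keeps null keys as their own group.
kept_total = int(sales.groupby("region", dropna=False)["revenue"].sum().sum())
```

The gap between `default_total` and `kept_total` is exactly the revenue attached to the null key, which is why checking `missing_keys` before aggregating matters.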