install
source · Clone the upstream repo
git clone https://github.com/ai-analyst-lab/ai-analyst
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ai-analyst-lab/ai-analyst "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/semantic-validation" ~/.claude/skills/ai-analyst-lab-ai-analyst-semantic-validation && rm -rf "$T"
manifest: .claude/skills/semantic-validation/skill.md
source content
Skill: Semantic Validation
Purpose
Orchestrate the full 4-layer validation stack plus confidence scoring to produce a comprehensive data quality assessment for any analysis output.
When to Use
- After analysis agents produce findings (before Storytelling agent)
- When the Validation agent runs its enhanced checks (Step 5a-5e)
- When a user asks "how confident should I be in these results?"
Invocation
Applied automatically as part of the Validation agent workflow. Can also be invoked standalone: "Validate the quality of this analysis."
Instructions
Layer 1: Structural Validation
Use helpers/structural_validator.py:

    from helpers.structural_validator import (
        validate_schema,
        validate_primary_key,
        validate_referential_integrity,
        validate_completeness,
    )

    # Check schema matches expected structure
    schema_result = validate_schema(df, expected_columns, expected_types)

    # Check primary key uniqueness
    pk_result = validate_primary_key(df, key_columns)

    # Check FK references exist in parent table
    ri_result = validate_referential_integrity(child_df, parent_df, fk_column, pk_column)

    # Check column completeness (null rates)
    completeness_result = validate_completeness(df, thresholds={"warn": 0.05, "fail": 0.20})

Flag any FAIL results as BLOCKER — analysis built on broken data is invalid.
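The internals of the structural helper are not shown here. As a rough illustration of what a primary-key check and a completeness check might do, the sketch below works on plain lists of dicts instead of a DataFrame. The function names, the return shape, the PASS/WARN/FAIL strings, and the use of `>=` against the thresholds are all assumptions, not the actual `helpers.structural_validator` API.

```python
from collections import Counter

def sketch_primary_key(rows, key_columns):
    """Hypothetical check: FAIL if any key tuple contains a null or repeats."""
    keys = [tuple(row.get(c) for c in key_columns) for row in rows]
    has_null = any(None in k for k in keys)
    duplicates = [k for k, n in Counter(keys).items() if n > 1]
    status = "FAIL" if (has_null or duplicates) else "PASS"
    return {"status": status, "duplicates": duplicates, "has_null": has_null}

def sketch_completeness(rows, column, warn=0.05, fail=0.20):
    """Hypothetical check: compare a column's null rate to warn/fail thresholds."""
    if not rows:
        # Empty data is treated as a failure, matching the Edge Cases note.
        return {"status": "FAIL", "null_rate": None}
    null_rate = sum(1 for r in rows if r.get(column) is None) / len(rows)
    if null_rate >= fail:
        status = "FAIL"
    elif null_rate >= warn:
        status = "WARN"
    else:
        status = "PASS"
    return {"status": status, "null_rate": null_rate}
```

Under these assumptions, a FAIL from either function would be the kind of result that gets escalated to BLOCKER.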
Layer 2: Logical Validation
Use helpers/logical_validator.py:

    from helpers.logical_validator import (
        validate_aggregation_consistency,
        validate_trend_continuity,
        validate_segment_exhaustiveness,
        validate_temporal_consistency,
    )

    # Parts must sum to whole
    agg_result = validate_aggregation_consistency(parts_df, total_value, tolerance=0.01)

    # No discontinuities in time series
    trend_result = validate_trend_continuity(ts_df, date_col, value_col, max_gap_days=7)

    # Segments must cover the full population
    seg_result = validate_segment_exhaustiveness(segment_df, total_count)

    # Date ranges across tables must overlap
    temporal_result = validate_temporal_consistency(tables_dict, date_columns)

WARN on logical inconsistencies — they suggest calculation errors.
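To make the "parts must sum to whole" and "no discontinuities" checks concrete, here is a minimal sketch on plain Python values. It assumes `tolerance` is a relative tolerance and that a gap means two consecutive dates more than `max_gap_days` apart; the real helper may define both differently. All names and return shapes are hypothetical.

```python
from datetime import date

def sketch_aggregation_consistency(parts, total_value, tolerance=0.01):
    """Hypothetical check: parts must sum to the total within a relative tolerance."""
    parts_sum = sum(parts)
    if total_value == 0:
        rel_diff = 0.0 if parts_sum == 0 else float("inf")
    else:
        rel_diff = abs(parts_sum - total_value) / abs(total_value)
    status = "PASS" if rel_diff <= tolerance else "FAIL"
    return {"status": status, "relative_diff": rel_diff}

def sketch_trend_continuity(dates, max_gap_days=7):
    """Hypothetical check: list consecutive dates more than max_gap_days apart."""
    ordered = sorted(dates)
    gaps = [(a, b) for a, b in zip(ordered, ordered[1:])
            if (b - a).days > max_gap_days]
    return {"status": "FAIL" if gaps else "PASS", "gaps": gaps}
```

For example, segment revenues of 40, 35, and 20 against a reported total of 100 give a relative difference of 0.05, which exceeds the 0.01 tolerance.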
Layer 3: Business Rules Validation
Use helpers/business_rules.py:

    from helpers.business_rules import (
        validate_ranges,
        validate_rates,
        validate_yoy_change,
    )

    # Check values fall within plausible ranges
    range_result = validate_ranges(df, column, min_val, max_val)

    # Check rates are 0-100% and denominators > 0
    rate_result = validate_rates(numerator, denominator)

    # Check YoY changes are plausible (not 10000%)
    yoy_result = validate_yoy_change(current, previous, max_change_pct=500)

Flag implausible values as WARN — they may be correct but need explanation.
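The rate and year-over-year rules can be illustrated with scalar inputs. This is a sketch only: the function names, the result dicts, and the choice to return WARN when the previous value is zero are assumptions made for illustration, not the behavior of `helpers.business_rules`.

```python
def sketch_validate_rate(numerator, denominator):
    """Hypothetical check: denominator must be > 0 and the rate must lie in [0, 1]."""
    if denominator <= 0:
        return {"status": "FAIL", "reason": "denominator must be > 0", "rate": None}
    rate = numerator / denominator
    if not 0.0 <= rate <= 1.0:
        return {"status": "FAIL", "reason": "rate outside 0-100%", "rate": rate}
    return {"status": "PASS", "reason": None, "rate": rate}

def sketch_validate_yoy(current, previous, max_change_pct=500):
    """Hypothetical check: flag year-over-year changes beyond max_change_pct."""
    if previous == 0:
        # Percentage change is undefined; surface it instead of dividing by zero.
        return {"status": "WARN", "reason": "previous value is 0", "change_pct": None}
    change_pct = (current - previous) / abs(previous) * 100
    status = "WARN" if abs(change_pct) > max_change_pct else "PASS"
    return {"status": status, "reason": None, "change_pct": change_pct}
```

A jump from 100 to 10,100 is a 10,000% change, which this sketch flags for explanation instead of rejecting outright.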
Layer 4: Simpson's Paradox Check
Use helpers/simpsons_paradox.py:

    from helpers.simpsons_paradox import check_simpsons_paradox, scan_dimensions

    # Check a specific aggregate vs segment breakdown
    paradox = check_simpsons_paradox(df, metric_col, segment_col)

    # Scan multiple dimensions for paradox risk
    scan = scan_dimensions(df, metric_col, dimension_cols)

BLOCKER on confirmed paradox — the aggregate finding is misleading.
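A Simpson's paradox check compares the direction of an effect in the aggregate with its direction inside each segment. The sketch below assumes the data arrives as per-segment success and trial counts for two groups being compared; that input shape, the function name, and the return dict are hypothetical and may not match how `check_simpsons_paradox` takes its arguments.

```python
def sketch_simpsons_paradox(rows, group_a, group_b):
    """rows: dicts with 'segment', 'group', 'successes', 'trials' (assumed shape)."""
    def rate(subset, group):
        successes = sum(r["successes"] for r in subset if r["group"] == group)
        trials = sum(r["trials"] for r in subset if r["group"] == group)
        return successes / trials if trials else None

    def direction(subset):
        # +1 if group_a has the higher rate, -1 if group_b does, 0 if tied/unknown.
        a, b = rate(subset, group_a), rate(subset, group_b)
        if a is None or b is None or a == b:
            return 0
        return 1 if a > b else -1

    overall = direction(rows)
    segments = sorted({r["segment"] for r in rows})
    by_segment = {s: direction([r for r in rows if r["segment"] == s])
                  for s in segments}
    # Paradox: every segment agrees on one direction and the aggregate reverses it.
    paradox = overall != 0 and set(by_segment.values()) == {-overall}
    return {"paradox": paradox, "overall": overall, "by_segment": by_segment}
```

With illustrative counts where group A wins in both the "small" and "large" segments but loses in the aggregate because the groups are unevenly distributed across segments, the sketch reports `paradox=True`, which is the case the aggregate finding would misrepresent.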
Confidence Scoring
After all 4 layers complete, synthesize results into a confidence score:
    from helpers.confidence_scoring import score_confidence, format_confidence_badge

    # Collect all validation results
    validation_results = {
        "structural": [schema_result, pk_result, ri_result, completeness_result],
        "logical": [agg_result, trend_result, seg_result, temporal_result],
        "business_rules": [range_result, rate_result, yoy_result],
        "simpsons_paradox": [paradox],  # result of check_simpsons_paradox in Layer 4
        "sample_size": len(df),
    }

    score = score_confidence(validation_results)
    badge = format_confidence_badge(score)

    # score returns: {score: 0-100, grade: A-F, factors: {...}, flags: [...]}
    # badge returns: "A (92/100)" or "C (58/100) — 2 warnings"

Output Integration
Pass the confidence score and badge to downstream agents:
- Storytelling agent: Include badge in executive summary
- Deck Creator: Show badge on synthesis slide
- Validation report: Include the full factor breakdown
Severity Mapping
| Layer | FAIL → | WARN → |
|---|---|---|
| Structural | BLOCKER (halt analysis) | WARNING (proceed with caution) |
| Logical | WARNING (check calculations) | INFO (note in report) |
| Business Rules | WARNING (explain outliers) | INFO (note in report) |
| Simpson's | BLOCKER (disaggregate) | WARNING (check segments) |
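
The table above can be expressed as a lookup keyed by layer and result status. This is a sketch, assuming results carry a layer name matching the `validation_results` keys and a status string of "PASS", "WARN", or "FAIL"; the `SEVERITY`, `map_severity`, and `has_blocker` names are hypothetical.

```python
# (layer, status) -> severity, transcribed from the Severity Mapping table.
SEVERITY = {
    ("structural", "FAIL"): "BLOCKER",
    ("structural", "WARN"): "WARNING",
    ("logical", "FAIL"): "WARNING",
    ("logical", "WARN"): "INFO",
    ("business_rules", "FAIL"): "WARNING",
    ("business_rules", "WARN"): "INFO",
    ("simpsons_paradox", "FAIL"): "BLOCKER",
    ("simpsons_paradox", "WARN"): "WARNING",
}

def map_severity(layer, status):
    """Return the severity for a layer result, or None for PASS or unknown pairs."""
    return SEVERITY.get((layer, status))

def has_blocker(results):
    """results: iterable of (layer, status) pairs. True if any maps to BLOCKER."""
    return any(map_severity(layer, status) == "BLOCKER"
               for layer, status in results)
```

Keeping the mapping in one table-like structure makes it easy to see that the same FAIL status escalates differently depending on the layer that produced it.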
Edge Cases
- Missing validators: If a helper module is unavailable, skip that layer and cap confidence at grade C
- Empty data: Structural validation catches this — BLOCKER before other layers run
- Single-table analysis: Skip referential integrity and segment exhaustiveness checks
- No time dimension: Skip temporal consistency and trend continuity checks
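
The edge cases above amount to two small decisions: which checks to skip, and whether to cap the final grade. The sketch below assumes the grade scale is A, B, C, D, F and that checks are identified by the short names shown; both are assumptions, as are the function names.

```python
GRADE_ORDER = ["A", "B", "C", "D", "F"]  # best to worst (assumed scale)

def cap_grade(grade, missing_layers):
    """Cap the grade at C when any layer was skipped for a missing helper."""
    if not missing_layers:
        return grade
    # Pick whichever of the two grades is worse; A or B becomes C, D or F stays.
    return max(grade, "C", key=GRADE_ORDER.index)

def checks_to_skip(table_count, has_time_dimension):
    """Return the set of check names that do not apply to this analysis."""
    skip = set()
    if table_count == 1:
        skip |= {"referential_integrity", "segment_exhaustiveness"}
    if not has_time_dimension:
        skip |= {"temporal_consistency", "trend_continuity"}
    return skip
```

Note that the cap only lowers a grade: an analysis already graded D or F stays there even when a layer is missing.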