
Skill: Semantic Validation

Install

Source · Clone the upstream repo:

git clone https://github.com/ai-analyst-lab/ai-analyst

Claude Code · Install into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/ai-analyst-lab/ai-analyst "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/semantic-validation" ~/.claude/skills/ai-analyst-lab-ai-analyst-semantic-validation && rm -rf "$T"

Manifest: .claude/skills/semantic-validation/skill.md

Source content

Skill: Semantic Validation

Purpose

Orchestrate the full 4-layer validation stack plus confidence scoring to produce a comprehensive data quality assessment for any analysis output.

When to Use

  • After analysis agents produce findings (before Storytelling agent)
  • When the Validation agent runs its enhanced checks (Step 5a-5e)
  • When a user asks "how confident should I be in these results?"

Invocation

Applied automatically as part of the Validation agent workflow. Can also be invoked standalone: "Validate the quality of this analysis."

Instructions

Layer 1: Structural Validation

Use helpers/structural_validator.py:

from helpers.structural_validator import (
    validate_schema, validate_primary_key,
    validate_referential_integrity, validate_completeness
)

# Check schema matches expected structure
schema_result = validate_schema(df, expected_columns, expected_types)

# Check primary key uniqueness
pk_result = validate_primary_key(df, key_columns)

# Check FK references exist in parent table
ri_result = validate_referential_integrity(child_df, parent_df, fk_column, pk_column)

# Check column completeness (null rates)
completeness_result = validate_completeness(df, thresholds={"warn": 0.05, "fail": 0.20})

Flag any FAIL results as BLOCKER — analysis built on broken data is invalid.
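The FAIL-to-BLOCKER triage above can be sketched in plain Python. This is a hypothetical sketch, not the repo's implementation: it assumes each validator returns a dict with `"check"` and `"status"` keys (`"PASS"`/`"WARN"`/`"FAIL"`), which may differ from the actual return shape in helpers/structural_validator.py.

```python
# Assumed result shape: {"check": <name>, "status": "PASS" | "WARN" | "FAIL"}.
def triage_structural(results):
    """Map Layer 1 results to severities: FAIL -> BLOCKER, WARN -> WARNING."""
    severity = {"FAIL": "BLOCKER", "WARN": "WARNING", "PASS": None}
    flags = [(r["check"], severity[r["status"]])
             for r in results if severity[r["status"]]]
    halt = any(level == "BLOCKER" for _, level in flags)
    return flags, halt

results = [
    {"check": "schema", "status": "PASS"},
    {"check": "primary_key", "status": "FAIL"},
]
flags, halt = triage_structural(results)
# halt is True: analysis built on broken data should stop here
```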

Layer 2: Logical Validation

Use helpers/logical_validator.py:

from helpers.logical_validator import (
    validate_aggregation_consistency, validate_trend_continuity,
    validate_segment_exhaustiveness, validate_temporal_consistency
)

# Parts must sum to whole
agg_result = validate_aggregation_consistency(parts_df, total_value, tolerance=0.01)

# No discontinuities in time series
trend_result = validate_trend_continuity(ts_df, date_col, value_col, max_gap_days=7)

# Segments must cover the full population
seg_result = validate_segment_exhaustiveness(segment_df, total_count)

# Date ranges across tables must overlap
temporal_result = validate_temporal_consistency(tables_dict, date_columns)

WARN on logical inconsistencies — they suggest calculation errors.
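To make the `tolerance=0.01` parameter concrete, here is a one-function sketch of the parts-sum-to-whole check. It assumes the tolerance is relative to the total (an assumption — the real semantics are defined in helpers/logical_validator.py and could be absolute).

```python
# Illustrative only; assumes a *relative* tolerance, i.e. 0.01 means "within 1% of the total".
def parts_sum_to_whole(parts, total, tolerance=0.01):
    """Check that segment values sum to the reported total within a relative tolerance."""
    return abs(sum(parts) - total) <= tolerance * abs(total)

parts_sum_to_whole([120.0, 80.0, 99.5], 300.0)   # off by 0.5 on 300 -> within 1%
parts_sum_to_whole([120.0, 80.0, 90.0], 300.0)   # off by 10 on 300 -> inconsistent
```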

Layer 3: Business Rules Validation

Use helpers/business_rules.py:

from helpers.business_rules import (
    validate_ranges, validate_rates, validate_yoy_change
)

# Check values fall within plausible ranges
range_result = validate_ranges(df, column, min_val, max_val)

# Check rates are 0-100% and denominators > 0
rate_result = validate_rates(numerator, denominator)

# Check YoY changes are plausible (not 10000%)
yoy_result = validate_yoy_change(current, previous, max_change_pct=500)

Flag implausible values as WARN — they may be correct but need explanation.
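The YoY plausibility rule reduces to simple arithmetic. The sketch below assumes `max_change_pct` is an absolute percentage bound in either direction (an assumption — check the real `validate_yoy_change` in helpers/business_rules.py for the exact semantics and zero-denominator handling).

```python
def yoy_change_pct(current, previous):
    """Year-over-year change as a percentage (caller must ensure previous != 0)."""
    return (current - previous) / previous * 100.0

def yoy_plausible(current, previous, max_change_pct=500):
    """True when the absolute YoY change is within the plausibility bound."""
    return abs(yoy_change_pct(current, previous)) <= max_change_pct

yoy_plausible(150, 100)      # +50% -> plausible
yoy_plausible(10_100, 100)   # +10000% -> implausible, needs explanation
```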

Layer 4: Simpson's Paradox Check

Use helpers/simpsons_paradox.py:

from helpers.simpsons_paradox import check_simpsons_paradox, scan_dimensions

# Check a specific aggregate vs segment breakdown
paradox = check_simpsons_paradox(df, metric_col, segment_col)

# Scan multiple dimensions for paradox risk
scan = scan_dimensions(df, metric_col, dimension_cols)

BLOCKER on confirmed paradox — the aggregate finding is misleading.
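For intuition, here is a self-contained illustration of the pattern `check_simpsons_paradox` looks for: the aggregate comparison reverses once a confounding segment is controlled for. The numbers are invented for illustration (they follow the classic kidney-stone example) and do not come from the helpers.

```python
def rate(successes, trials):
    return successes / trials

# (successes, trials) per treatment, split by case difficulty (illustrative data).
easy = {"A": (81, 87), "B": (234, 270)}
hard = {"A": (192, 263), "B": (55, 80)}

# Within every segment, A beats B...
a_wins_each_segment = all(
    rate(*seg["A"]) > rate(*seg["B"]) for seg in (easy, hard)
)

# ...but in the aggregate, B beats A: the aggregate finding is misleading.
agg = {
    t: rate(easy[t][0] + hard[t][0], easy[t][1] + hard[t][1]) for t in ("A", "B")
}
paradox = a_wins_each_segment and agg["B"] > agg["A"]
```

This is exactly why a confirmed paradox is a BLOCKER: reporting only `agg` would recommend the wrong treatment.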

Confidence Scoring

After all 4 layers complete, synthesize results into a confidence score:

from helpers.confidence_scoring import score_confidence, format_confidence_badge

# Collect all validation results
validation_results = {
    "structural": [schema_result, pk_result, ri_result, completeness_result],
    "logical": [agg_result, trend_result, seg_result, temporal_result],
    "business_rules": [range_result, rate_result, yoy_result],
    "simpsons_paradox": [paradox],
    "sample_size": len(df)
}

score = score_confidence(validation_results)
badge = format_confidence_badge(score)

# score returns: {score: 0-100, grade: A-F, factors: {...}, flags: [...]}
# badge returns: "A (92/100)" or "C (58/100) — 2 warnings"
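A deduction-based rubric gives a feel for how flags might roll up into the 0-100 score and A-F grade. The weights below are assumptions for illustration only; the real rubric lives in helpers/confidence_scoring.py and may weight factors differently.

```python
# Assumed rubric: each blocker -40, warning -10, info -2, clamped to [0, 100].
def score_confidence_sketch(blockers, warnings, infos):
    score = max(0, min(100, 100 - 40 * blockers - 10 * warnings - 2 * infos))
    grade = ("A" if score >= 90 else "B" if score >= 80 else
             "C" if score >= 70 else "D" if score >= 60 else "F")
    return score, grade

score_confidence_sketch(0, 0, 1)   # (98, "A")
score_confidence_sketch(0, 2, 1)   # (78, "C") -- same shape as the "C ... 2 warnings" badge
```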

Output Integration

Pass the confidence score and badge to downstream agents:

  • Storytelling agent: Include badge in executive summary
  • Deck Creator: Show badge on synthesis slide
  • Validation report: Include the full factor breakdown

Severity Mapping

| Layer           | FAIL →                        | WARN →                          |
| --------------- | ----------------------------- | ------------------------------- |
| Structural      | BLOCKER (halt analysis)       | WARNING (proceed with caution)  |
| Logical         | WARNING (check calculations)  | INFO (note in report)           |
| Business Rules  | WARNING (explain outliers)    | INFO (note in report)           |
| Simpson's       | BLOCKER (disaggregate)        | WARNING (check segments)        |

Edge Cases

  • Missing validators: If a helper module is unavailable, skip that layer and cap confidence at grade C
  • Empty data: Structural validation catches this — BLOCKER before other layers run
  • Single-table analysis: Skip referential integrity and segment exhaustiveness checks
  • No time dimension: Skip temporal consistency and trend continuity checks
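The "cap confidence at grade C" rule for missing validators can be sketched as a pure function. This is an assumed encoding of the rule stated above (the real behavior belongs to helpers/confidence_scoring.py); note the cap only lowers a grade, never raises one.

```python
GRADE_ORDER = ["A", "B", "C", "D", "F"]

def cap_grade(grade, skipped_layers):
    """If any validation layer was skipped, the grade can be no better than C."""
    if skipped_layers and GRADE_ORDER.index(grade) < GRADE_ORDER.index("C"):
        return "C"
    return grade

cap_grade("A", ["business_rules"])   # "C"
cap_grade("D", ["business_rules"])   # "D" -- never raised, only capped
cap_grade("A", [])                   # "A"
```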