Claude-skill-registry data-quality
Data quality testing with dbt tests, Great Expectations, and monitoring.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/data-quality" ~/.claude/skills/majiayu000-claude-skill-registry-data-quality && rm -rf "$T"
manifest:
skills/data/data-quality/SKILL.md
Data Quality
Quality Dimensions
| Dimension | Description | Test |
|---|---|---|
| Completeness | No missing values | NOT NULL, count checks |
| Uniqueness | No duplicates | UNIQUE, distinct counts |
| Validity | Values in range | Range checks, regex |
| Consistency | Matches across sources | Cross-table checks |
| Timeliness | Data is fresh | Freshness checks |
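The first three dimensions in the table can be expressed as small row-level checks. The following is a minimal sketch in plain Python, assuming rows are dictionaries with `order_id` and `amount` fields like the `fct_orders` model used elsewhere in this document. The function names are illustrative and are not part of dbt or Great Expectations.

```python
from collections import Counter


def completeness(rows, column):
    """Return the fraction of rows where `column` is present and not None."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(column) is not None)
    return filled / len(rows)


def duplicates(rows, column):
    """Return the sorted non-null values of `column` that occur more than once."""
    counts = Counter(r.get(column) for r in rows if r.get(column) is not None)
    return sorted(v for v, n in counts.items() if n > 1)


def out_of_range(rows, column, min_value, max_value):
    """Return rows whose non-null `column` value lies outside [min_value, max_value]."""
    return [
        r for r in rows
        if r.get(column) is not None
        and not (min_value <= r[column] <= max_value)
    ]


# Hypothetical sample data for illustration only
rows = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": -5.0},
    {"order_id": 3, "amount": 10.0},
]

print(completeness(rows, "amount"))                # 0.75
print(duplicates(rows, "order_id"))                # [2]
print(out_of_range(rows, "amount", 0, 1_000_000))  # [{'order_id': 2, 'amount': -5.0}]
```

In practice these checks would normally run in the warehouse (as dbt tests) or through a validation library. The sketch only shows what each dimension measures.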
dbt Tests
Schema Tests
    models:
      - name: fct_orders
        columns:
          - name: order_id
            tests:
              - unique
              - not_null
          - name: status
            tests:
              - accepted_values:
                  values: ['pending', 'completed', 'cancelled']
          - name: amount
            tests:
              - not_null
              - dbt_utils.accepted_range:
                  min_value: 0
                  max_value: 1000000
Custom Tests
    -- tests/assert_positive_revenue.sql
    select *
    from {{ ref('fct_orders') }}
    where amount < 0
Relationship Tests
    - name: customer_id
      tests:
        - relationships:
            to: ref('dim_customer')
            field: customer_id
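A relationships test is a referential-integrity check: every non-null key in the child table should exist in the parent table. The sketch below shows that logic in plain Python, assuming both tables are lists of dictionaries. The helper `orphaned_keys` and the sample rows are hypothetical and only illustrate the idea; they are not how dbt implements the test.

```python
def orphaned_keys(child_rows, parent_rows, field):
    """Return the sorted child values of `field` that have no matching parent row.

    Null child keys are skipped, so a missing key is treated as a
    completeness problem rather than a broken relationship.
    """
    parent_keys = {r.get(field) for r in parent_rows}
    return sorted({
        r[field] for r in child_rows
        if r.get(field) is not None and r[field] not in parent_keys
    })


# Hypothetical sample data for illustration only
fct_orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 99},    # no such customer
    {"order_id": 3, "customer_id": None},  # null key, ignored here
]
dim_customer = [{"customer_id": 10}, {"customer_id": 11}]

print(orphaned_keys(fct_orders, dim_customer, "customer_id"))  # [99]
```

An empty result corresponds to a passing test; any returned key identifies rows that would fail it.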
Great Expectations
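    import great_expectations as gx

    # This follows the fluent API of Great Expectations releases before 1.0
    # (context.sources); newer releases expose the entry point differently.
    context = gx.get_context()
    validator = context.sources.pandas_default.read_csv("data.csv")

    # Declare expectations against the loaded batch
    validator.expect_column_values_to_not_be_null("order_id")
    validator.expect_column_values_to_be_unique("order_id")
    validator.expect_column_values_to_be_between("amount", min_value=0, max_value=1000000)

    # Evaluate every expectation declared above
    results = validator.validate()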
Monitoring
- Row count trends
- Null percentage trends
- Schema drift detection
- Freshness SLAs
- Anomaly detection
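The monitoring signals listed above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not a production monitor: the function names, the z-score threshold of 3, and the sample values are all hypothetical, and the anomaly check is a simple z-score on daily row counts rather than any particular tool's method.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev


def null_percentage(rows, column):
    """Return the percentage of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    nulls = sum(1 for r in rows if r.get(column) is None)
    return 100.0 * nulls / len(rows)


def is_row_count_anomalous(history, today, z_threshold=3.0):
    """Flag today's count if it is more than `z_threshold` sample standard
    deviations away from the mean of the historical counts."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold


def schema_drift(expected_columns, actual_columns):
    """Return the columns added to and removed from the expected schema."""
    expected, actual = set(expected_columns), set(actual_columns)
    return {"added": sorted(actual - expected), "removed": sorted(expected - actual)}


def is_fresh(last_loaded_at, max_age, now=None):
    """Return True if the last load falls within the freshness SLA `max_age`."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at <= max_age


# Hypothetical values for illustration only
history = [1000, 1020, 980, 1010, 990]
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc)

print(is_row_count_anomalous(history, 400))    # True
print(is_row_count_anomalous(history, 1005))   # False
print(schema_drift(["order_id", "amount"], ["order_id", "amount", "currency"]))
print(is_fresh(last_load, timedelta(hours=24), now=now))  # True
print(is_fresh(last_load, timedelta(hours=6), now=now))   # False
```

Tracking these values per run, instead of only asserting on them, is what makes trends in row counts and null percentages visible over time.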