Data-engineering-skills testing-dbt-models
install
source · Clone the upstream repo
git clone https://github.com/AltimateAI/data-engineering-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/AltimateAI/data-engineering-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/dbt/testing-dbt-models" ~/.claude/skills/altimateai-data-engineering-skills-testing-dbt-models && rm -rf "$T"
manifest:
skills/dbt/testing-dbt-models/SKILL.mdsource content
dbt Testing
Every model deserves at least one test. Primary keys need unique + not_null.
Workflow
1. Study Existing Test Patterns
CRITICAL: Match the project's existing testing style before adding new tests.
# Find all schema.yml files with tests find . -name "schema.yml" -exec grep -l "tests:" {} \; # Read existing tests to learn patterns cat models/staging/schema.yml | head -100 cat models/marts/schema.yml | head -100 # Check for custom tests or dbt packages ls tests/ cat packages.yml 2>/dev/null
Extract from existing tests:
- YAML formatting style (indentation, spacing)
- Test coverage depth (all columns vs key columns only)
- Use of custom tests (dbt_utils, dbt_expectations, custom macros)
- Description style (brief vs detailed)
- Severity levels used (warn vs error)
2. Read Model SQL
cat models/<path>/<model_name>.sql
Identify: primary keys, foreign keys, categorical columns, date columns, business-critical fields.
3. Check Existing Tests for This Model
cat models/<path>/schema.yml | grep -A 50 "<model_name>" # or find . -name "schema.yml" -exec grep -l "<model_name>" {} \;
4. Identify Testable Columns
| Column Type | Recommended Tests |
|---|---|
| Primary key | , |
| Foreign key | , |
| Categorical | (ask user for valid values) |
| Required field | |
| Date/timestamp | |
| Boolean | |
5. Write Tests in schema.yml
Match the existing style from step 1. Example format (adapt to project):
version: 2 models: - name: model_name description: "Brief description of what this model contains" columns: - name: primary_key_column description: "Unique identifier for this record" tests: - unique - not_null - name: foreign_key_column description: "Reference to related_model" tests: - not_null - relationships: to: ref('related_model') field: related_key_column - name: status description: "Current status of the record" tests: - not_null - accepted_values: values: ['pending', 'active', 'completed', 'cancelled'] - name: created_at description: "Timestamp when record was created" tests: - not_null
6. Run Tests
# Test specific model dbt test --select <model_name> # Test with upstream dbt test --select +<model_name>
7. Fix Failing Tests
Common failures and fixes:
| Failure | Likely Cause | Fix |
|---|---|---|
fails | Duplicate records | Add deduplication in model |
fails | NULL values in source | Add COALESCE or filter |
fails | Orphan records | Add WHERE clause or fix upstream |
fails | New/unexpected values | Update accepted values list |
Test Types Reference
Generic Tests (built-in)
tests: - unique - not_null - accepted_values: values: ['a', 'b', 'c'] - relationships: to: ref('other_model') field: id
Custom Generic Tests
tests: - dbt_utils.expression_is_true: expression: "amount >= 0" - dbt_utils.recency: datepart: day field: created_at interval: 1
Singular Tests
Create
tests/<test_name>.sql:
-- tests/assert_positive_revenue.sql select * from {{ ref('orders') }} where revenue < 0
Anti-Patterns
- Adding tests without checking existing project patterns first
- Using different YAML formatting style than existing tests
- Models without any tests
- Primary keys without both unique AND not_null
- Testing only obvious columns, ignoring business-critical ones
- Hardcoding accepted_values without confirming with stakeholders
- Adding dbt_utils tests when project doesn't use that package