Claude-skill-registry dbt-model-builder

Create dbt models following FF Analytics Kimball patterns and 2×2 stat model. This skill should be used when creating staging models, core facts/dimensions, or analytical marts. Guides through model creation with proper grain, tests, External Parquet configuration, and per-model YAML documentation using dbt 1.10+ syntax.

Install

Source · Clone the upstream repo:

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/dbt-model-builder" ~/.claude/skills/majiayu000-claude-skill-registry-dbt-model-builder && rm -rf "$T"

Manifest: skills/data/dbt-model-builder/SKILL.md

Source content

dbt Model Builder

Create complete dbt models for the Fantasy Football Analytics project following Kimball dimensional modeling and the 2×2 stat model (actuals/projections × real-world/fantasy).

When to Use This Skill

Use this skill proactively when:

  • Creating new dbt models (staging, core, marts)
  • User asks to "create a model for {entity/process}"
  • User mentions dbt modeling, dimensional modeling, or Kimball patterns
  • Adding analytics-ready views or transformations
  • Implementing 2×2 model quadrants (actuals/projections, real-world/fantasy scoring)

dbt Modeling Approach

The FF Analytics project follows:

  1. Kimball Dimensional Modeling - Facts, dimensions, conformed entities
  2. 2×2 Stat Model - Separate facts for actuals vs projections
  3. Per-Model YAML - One _<model>.yml file per model
  4. External Parquet - Large models use external=true with partitioning
  5. dbt 1.10+ syntax - Test arguments wrapped in an arguments: block
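Taken together, the per-model YAML and 1.10+ test syntax conventions produce files like the following minimal sketch (the model and column names are illustrative assumptions, not taken from the project):

```yaml
# _stg_sleeper__players.yml (illustrative sketch)
version: 2

models:
  - name: stg_sleeper__players
    description: "Grain: one row per player_id."
    columns:
      - name: player_id
        data_tests:
          - not_null
          - unique
```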

Model Building Workflows

Workflow 1: Create Staging Model

Staging models normalize raw provider data.

Steps:

  1. Identify source: Determine provider and dataset

  2. Design grain: Define one row per...

  3. Create SQL using assets/staging_template.sql:

    • Name: stg_{provider}__{dataset}.sql
    • Materialize as a view
    • Select from {{ source('{provider}', '{dataset}') }}
    • Rename columns to standard names

  4. Create YAML using assets/staging_yaml_template.yml:

    • Name: _stg_{provider}__{dataset}.yml
    • Document grain and source
    • Add not_null and unique tests for PKs
    • Add accepted_values for enums

  5. Run and test:

    make dbt-run --select stg_{provider}__{dataset}
    make dbt-test --select stg_{provider}__{dataset}

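Under these conventions, a staging model might look like the following minimal sketch (the sleeper provider and its column names are illustrative assumptions, not from the project):

```sql
-- stg_sleeper__players.sql (illustrative sketch; provider and columns are assumptions)
-- Grain: one row per player_id

{{ config(materialized='view') }}

select
    player_id,
    full_name as player_name,   -- rename to standard column names
    position,
    team      as team_abbr
from {{ source('sleeper', 'players') }}
```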

Workflow 2: Create Fact Table

Fact tables capture measurable events/processes.

Steps:

  1. Design grain: Define composite primary key (e.g., player_id + game_id + stat_name)

  2. Map foreign keys: Join to conformed dimensions (dim_player, dim_team, etc.)

  3. Create SQL using assets/fact_template.sql:

    • Name: fact_{process}.sql
    • Config: materialized='table', external=true, partition_by=['season','week']
    • Join staging to dimensions for FK resolution
    • Select grain keys + measures

  4. Create YAML using assets/fact_yaml_template.yml:

    • Document grain explicitly
    • Add dbt_utils.unique_combination_of_columns as the grain test
    • Add relationship tests for all FKs
    • Add not_null for required measures

  5. Run and test:

    make dbt-run --select fact_{process}
    make dbt-test --select fact_{process}

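A fact model following these steps might be sketched as below; the staging model name and join keys are illustrative assumptions:

```sql
-- fact_player_stats.sql (illustrative sketch; staging name and keys are assumptions)
-- Grain: one row per player_key + game_id + stat_name

{{ config(
    materialized='table',
    external=true,
    partition_by=['season', 'week']
) }}

select
    p.player_key,   -- FK resolved against the conformed dimension
    s.game_id,
    s.season,
    s.week,
    s.stat_name,
    s.stat_value
from {{ ref('stg_provider__player_stats') }} s
join {{ ref('dim_player') }} p
    on s.player_id = p.player_id
```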

Critical: Fact tables MUST have grain uniqueness test with dbt 1.10+ syntax:

data_tests:
  - dbt_utils.unique_combination_of_columns:
      arguments:
        combination_of_columns:
          - column1
          - column2
      config:
        severity: error

Workflow 3: Create Dimension Table

Dimensions provide descriptive context for facts.

Steps:

  1. Determine SCD type: Type 1 (replace) or Type 2 (historical tracking)
  2. Design natural key: Business key for the entity
  3. Create SQL using assets/dim_template.sql:
    • Name: dim_{entity}.sql
    • Generate a surrogate key with dbt_utils.generate_surrogate_key()
    • For SCD Type 2: add valid_from, valid_to, is_current
  4. Create YAML:
    • Document grain: "one row per {entity}"
    • Add a unique test on the surrogate key
    • Add not_null on the natural key
  5. Run and test

SCD Type 2 pattern:

  • Track changes over time with validity dates
  • Include version_number for multiple versions
  • Set is_current flag for latest version
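The SCD Type 2 pattern can be sketched as follows; the source model and attribute columns are illustrative assumptions:

```sql
-- dim_player.sql (illustrative SCD Type 2 sketch; columns are assumptions)
-- Grain: one row per player per validity window

select
    {{ dbt_utils.generate_surrogate_key(['player_id', 'valid_from']) }} as player_key,
    player_id,                  -- natural key
    player_name,
    position,
    valid_from,
    valid_to,
    valid_to is null as is_current   -- latest version has an open validity window
from {{ ref('stg_provider__players') }}
```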

Workflow 4: Create Analytical Mart (2×2 Model)

Marts provide wide-format, analytics-ready data.

2×2 Model Quadrants:

  • mart_real_world_actuals - NFL stats (actuals)
  • mart_real_world_projections - Projected NFL stats
  • mart_fantasy_actuals - Fantasy points (actuals, with scoring rules applied)
  • mart_fantasy_projections - Projected fantasy points

Steps:

  1. Select quadrant: Determine actuals vs projections, real-world vs fantasy
  2. Pivot fact table: Convert long-form stats to wide columns
  3. Join dimensions: Enrich with descriptive attributes
  4. Apply scoring (fantasy quadrants only):
    • Join dim_scoring_rule
    • Calculate points: {stat} * {points_per_stat}
  5. Create SQL using assets/mart_template.sql:
    • Partition by season
    • Wide format with one column per stat
  6. Run and test

Example pivot:

SUM(CASE WHEN stat_name = 'passing_yards' THEN stat_value END) AS passing_yards,
SUM(CASE WHEN stat_name = 'passing_tds' THEN stat_value END) AS passing_tds
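For fantasy quadrants, the scoring join from step 4 combines with this pivot; the sketch below assumes dim_scoring_rule carries stat_name and points_per_stat columns, and the join keys are illustrative:

```sql
-- mart_fantasy_actuals.sql (illustrative sketch; join keys are assumptions)
{{ config(
    materialized='table',
    external=true,
    partition_by=['season']
) }}

select
    f.player_key,
    f.season,
    f.week,
    sum(f.stat_value * r.points_per_stat) as fantasy_points
from {{ ref('fact_player_stats') }} f
join {{ ref('dim_scoring_rule') }} r
    on f.stat_name = r.stat_name
group by 1, 2, 3
```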

Resources Provided

references/

Real models from the codebase:

  • example_staging_model.sql - stg_ktc_assets
  • example_staging_yaml.yml - YAML with tests
  • example_fact_model.sql - Fact table example
  • example_dim_model.sql - Dimension example
  • example_mart_model.sql - Mart example

assets/

Templates for creating models:

  • staging_template.sql - Staging model SQL
  • staging_yaml_template.yml - Staging YAML with tests
  • fact_template.sql - Fact table SQL with FK joins
  • fact_yaml_template.yml - Fact YAML with grain test
  • dim_template.sql - Dimension SQL with SCD Type 2
  • mart_template.sql - Mart SQL with pivot pattern

Best Practices

Grain Declaration

CRITICAL: Every model must explicitly declare grain:

  • In SQL comments: -- Grain: one row per...
  • In the YAML description
  • In the grain uniqueness test (facts)

Testing Strategy

Staging models:

  • not_null on all PKs
  • unique on single-column PKs
  • accepted_values on enums

Fact tables:

  • dbt_utils.unique_combination_of_columns (grain test)
  • relationships to all dimensions
  • not_null on FKs and measures

Dimensions:

  • unique on surrogate key
  • not_null on natural key

dbt 1.10+ Test Syntax

CRITICAL: Follow these two rules to avoid deprecation warnings:

  1. Use the data_tests: key (not tests:): dbt 1.8+ introduced data_tests: to distinguish data tests from unit_tests:
  2. Wrap test parameters in an arguments: block: dbt 1.10+ requires this for all generic tests that take parameters
# CORRECT - Column-level tests
columns:
  - name: position
    data_tests:  # Use data_tests:, not tests:
      - not_null
      - accepted_values:
          arguments:  # Arguments must be nested
            values: ['QB', 'RB', 'WR', 'TE']

  - name: player_id
    data_tests:
      - not_null
      - relationships:
          arguments:  # Wrap to, field in arguments:
            to: ref('dim_player')
            field: player_id
          config:  # config: is sibling to arguments:
            where: "player_id > 0"

# CORRECT - Model-level tests
data_tests:  # Use data_tests:, not tests:
  - dbt_utils.unique_combination_of_columns:
      arguments:
        combination_of_columns:
          - player_key
          - game_id

# WRONG - Deprecated syntax (will cause warnings)
columns:
  - name: position
    tests:  # WRONG - should be data_tests:
      - accepted_values:
          values: ['QB', 'RB']  # WRONG - should be under arguments:

tests:  # WRONG - should be data_tests:
  - relationships:
      to: ref('dim_player')  # WRONG - should be under arguments:
      field: player_id

Key Points:

  • Always use data_tests: (not tests:)
  • arguments: wraps test parameters (to, field, values, combination_of_columns)
  • config: is a sibling of arguments:, not nested inside it
  • not_null and unique take no arguments; use them directly

External Parquet Configuration

Large models use External Parquet:

{{ config(
    materialized='table',
    external=true,
    partition_by=['season', 'week']
) }}

Naming Conventions

  • Staging: stg_{provider}__{dataset}
  • Facts: fact_{process} (e.g., fact_player_stats)
  • Dimensions: dim_{entity} (e.g., dim_player)
  • Marts: mart_{purpose} (e.g., mart_fantasy_actuals_weekly)
  • YAML: _<model_name>.yml

Integration with Other Skills

  • data-ingestion-builder - Create staging models after adding providers
  • data-quality-test-generator - Enhance testing beyond basics