git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/dbt-architecture" ~/.claude/skills/majiayu000-claude-skill-registry-dbt-architecture && rm -rf "$T"
skills/data/dbt-architecture/SKILL.mddbt Architecture
Purpose
Transform AI agents into experts on dbt project architecture and medallion layer patterns, providing guidance on structuring production-grade dbt projects with proper layer separation, naming conventions, and configuration strategies.
When to Use This Skill
Activate this skill when users ask about:
- Planning dbt project structure and folder organization
- Implementing medallion architecture (bronze/silver/gold)
- Establishing naming conventions for models and columns
- Configuring folder-level settings in dbt_project.yml
- Ensuring proper model dependencies and data flow
- Understanding layer separation and architectural patterns
- Setting up tag inheritance strategies
Core Philosophy: Medallion Architecture + Best Practices Integration
Medallion architecture demonstrates how dbt best practices seamlessly integrate with a layered data approach:
- Bronze Layer = Staging Models (
) - One-to-one source relationshipsstg_ - Silver Layer = Intermediate Models (
) - Business logic transformationsint_ - Gold Layer = Marts (
,dim_
) - Business-ready data productsfct_
Every recommendation follows both architectural principles and dbt best practices simultaneously.
Medallion Architecture Quick Reference
Three Layers
Bronze (Staging):
- Naming:
stg_{source}__{table} - Materialization:
ephemeral - Purpose: One-to-one source cleaning
- Rules: No joins, no business logic
Silver (Intermediate):
- Naming:
int_{entity}__{description} - Materialization:
orephemeraltable - Purpose: Business logic, enrichment
- Rules: No direct source references
Gold (Marts):
- Naming:
ordim_{entity}fct_{process} - Materialization:
ortableincremental - Purpose: Business-ready data products
- Rules: Fully tested, documented, optimized
Critical Architectural Rules
Always enforce these patterns:
- ✅ No Direct Joins to Source - Models reference staging (
), neverref('stg_*')
directlysource() - ✅ One-to-One Staging - Each source table has exactly ONE staging model
- ✅ Proper Layering - Clear flow: staging → intermediate → marts
- ✅ Standardized Naming - Consistent
,stg_
,int_
,dim_
prefixesfct_ - ✅ Use ref() and source() - No hard-coded table references
- ✅ Folder-Level Configuration - Set common settings in dbt_project.yml
Official dbt Documentation: How we structure our dbt projects
Bronze Layer: Staging Models
Purpose: One-to-one relationship with source tables. Light cleaning and standardization only.
Materialization:
ephemeral (compiled as CTEs)
Naming:
stg_{source}__{table}.sql
Bronze Template
-- models/bronze/stg_tpc_h__customers.sql {{ config(materialized='ephemeral') }} select -- Primary key (renamed) c_custkey as customer_id, -- Attributes (cast and renamed) c_name as customer_name, c_address as customer_address, c_phone as phone_number, c_acctbal as account_balance, -- Metadata current_timestamp() as dbt_loaded_at from {{ source('tpc_h', 'customer') }}
Bronze Rules
✅ DO:
- One source table → One staging model
- Reference sources using
{{ source() }} - Rename columns to standard naming
- Cast data types
- Basic cleaning (trim, upper/lower)
❌ DON'T:
- Join between sources
- Add business logic
- Aggregate data
- Hard-code table names
Silver Layer: Intermediate Models
Purpose: Reusable business logic and complex transformations. Sits between staging and marts.
Materialization:
ephemeral (reusable logic) or table (complex computations)
Naming:
int_{entity}__{description}.sql
Silver Template
-- models/silver/int_customers__with_orders.sql {{ config(materialized='ephemeral') }} with customers as ( select * from {{ ref('stg_tpc_h__customers') }} ), orders as ( select * from {{ ref('stg_tpc_h__orders') }} ), customer_metrics as ( select customer_id, count(*) as total_orders, sum(order_total) as lifetime_value, min(order_date) as first_order_date from orders group by customer_id ) select c.customer_id, c.customer_name, coalesce(m.total_orders, 0) as total_orders, coalesce(m.lifetime_value, 0) as lifetime_value, m.first_order_date from customers c left join customer_metrics m on c.customer_id = m.customer_id
Silver Rules
✅ DO:
- Reference staging + other intermediate models
- Add business logic and aggregations
- Create reusable components
- Use CTEs for clarity
❌ DON'T:
- Reference sources directly
- Add final presentation logic
- Create one-time-use models
Gold Layer: Marts Models
Purpose: Business-ready data products optimized for BI tools and end users.
Materialization:
table (dimensions) or incremental (large facts)
Naming:
dim_{entity} (dimensions), fct_{process} (facts)
Dimension Template
-- models/gold/dim_customers.sql {{ config(materialized='table') }} with customers as ( select * from {{ ref('int_customers__with_orders') }} ) select -- Primary key customer_id, -- Attributes customer_name, customer_email, -- Metrics total_orders, lifetime_value, first_order_date, -- Business classification case when lifetime_value >= 5000 then 'gold' when lifetime_value >= 1000 then 'silver' else 'bronze' end as customer_tier, -- Metadata current_timestamp() as dbt_updated_at from customers
Fact Template
-- models/gold/fct_orders.sql {{ config( materialized='incremental', unique_key='order_id', cluster_by=['order_date', 'customer_id'] ) }} select order_id, customer_id, order_date, order_status, order_total, current_timestamp() as dbt_updated_at from {{ ref('stg_tpc_h__orders') }} {% if is_incremental() %} where order_date > (select max(order_date) from {{ this }}) {% endif %}
Gold Rules
✅ DO:
- Reference staging, intermediate, and other marts
- Add final business logic
- Optimize for query performance (clustering)
- Test comprehensively
- Document for business users
❌ DON'T:
- Reference sources directly
- Create unnecessary complexity
Naming Conventions
Model Naming
| Layer | Prefix | Example | Purpose |
|---|---|---|---|
| Bronze/Staging | | | Clean source data |
| Silver/Intermediate | | | Business logic |
| Gold/Dimensions | | | Business entities |
| Gold/Facts | | | Business events |
Column Naming Standards
Primary & Foreign Keys:
- customer_id, order_id, product_id{entity}_id- Foreign keys use same naming as primary key in related table
Boolean Flags:
- is_active, is_deleted, is_first_orderis_{condition}
- has_orders, has_discounthas_{attribute}
Dates & Timestamps:
- order_date, created_date{event}_date
- created_at, updated_at, deleted_at{event}_at- Always use UTC timezone suffix if needed - created_at_utc
Metrics & Aggregates:
- order_count, customer_count{metric}_count
- total_amount, discount_amount{metric}_amount- Include currency suffix if applicable - amount_usd, price_eur
Row Numbers & Sequences:
- order_row_number{entity}_row_number
- sequence_number{entity}_seq_number
Consistency Rules
✅ DO:
- Use snake_case for all column names
- Use consistent entity names across models
- Include currency/units in column names when relevant
- Keep names concise but descriptive
❌ DON'T:
- Mix naming styles (camelCase vs snake_case)
- Use abbreviations inconsistently
- Create ambiguous names without context
- Use reserved SQL keywords
Folder Structure
models/ ├── bronze/ # Staging layer - one-to-one with sources │ ├── stg_tpc_h__customers.sql │ ├── stg_tpc_h__orders.sql │ └── stg_tpc_h__lineitem.sql ├── silver/ # Intermediate layer - business logic │ ├── int_customers__with_orders.sql │ ├── int_fx_rates__daily.sql │ └── customer_segments.sql └── gold/ # Marts layer - business-ready analytics ├── dim_customers.sql ├── dim_products.sql ├── fct_orders.sql └── fct_order_lines.sql
Configuration in dbt_project.yml
Folder-Level Configuration (Reduces Repetition)
Configure common settings at the folder level to minimize model-level overrides:
models: your_project: bronze: +materialized: ephemeral +tags: ["bronze", "staging"] +schema: bronze silver: +materialized: ephemeral +tags: ["silver"] +schema: silver gold: +materialized: table +tags: ["gold", "marts"] +schema: gold
Model-Level Configuration: Override folder defaults only for unique requirements (incremental settings, clustering, etc.)
Tag Inheritance Strategy
✅ LEVERAGE: dbt's additive tag inheritance
Tags accumulate hierarchically per the dbt documentation. Child folders inherit all parent tags automatically.
# ✅ GOOD: Avoid duplicate tags bronze: +tags: ["bronze", "staging"] subfolder: +tags: ["subfolder"] # Inherits: bronze, staging, subfolder # ❌ BAD: Redundant parent tags bronze: +tags: ["bronze", "staging"] subfolder: +tags: ["bronze", "staging", "subfolder"] # Duplicates parent tags
Common Selection Patterns:
dbt run --select tag:bronze # All bronze models dbt run --select tag:gold # All gold models dbt run --select tag:staging # Alternative to bronze
Helping Users with Architecture
Strategy for Assisting Users
When users ask for architectural guidance:
- Identify the layer: Which medallion layer (bronze/silver/gold)?
- Clarify purpose: What transformation or business logic is needed?
- Apply naming conventions: Follow
,stg_
,int_
,dim_
patternsfct_ - Recommend materialization: Based on layer and reusability
- Provide working examples: Show complete, tested code patterns
- Validate dependencies: Ensure proper layer flow (staging → intermediate → marts)
Common User Questions
"How should I structure my project?"
- Explain medallion architecture layers
- Show folder organization by layer
- Demonstrate model dependencies flow
- Provide naming convention standards
- Show configuration strategy (folder-level first)
"Where does this model belong?"
- Ask: Is it cleaning source data? → Bronze
- Ask: Does it add business logic? → Silver
- Ask: Is it for end-user consumption? → Gold
"What should I name this model?"
- Bronze:
stg_{source}__{table} - Silver:
int_{entity}__{description} - Gold dimensions:
dim_{entity} - Gold facts:
fct_{process}
Related Official Documentation
- dbt Best Practices: How We Structure Our dbt Projects
- dbt Best Practices: Structuring Project
- dbt Resource Configurations: Tags
Goal: Transform AI agents into expert dbt architects who guide users through project structure with confidence, clarity, and production-ready patterns.