
Database to dbt Migration Workflow

Install

Source · Clone the upstream repo:

```sh
git clone https://github.com/majiayu000/claude-skill-registry
```

Claude Code · Install into ~/.claude/skills/:

```sh
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/dbt-migration" ~/.claude/skills/majiayu000-claude-skill-registry-dbt-migration && rm -rf "$T"
```

Manifest: skills/data/dbt-migration/SKILL.md
Source Content

Database to dbt Migration Workflow

Purpose and When to Use

Guide AI agents through the complete migration lifecycle from Snowflake or legacy database systems (SQL Server, Oracle, Teradata, etc.) to production-quality dbt projects on Snowflake. This skill defines a structured, repeatable process while delegating platform-specific syntax translation to dedicated source-specific skills.

Activate this skill when users ask about:

  • Planning a database migration to dbt
  • Organizing legacy scripts for migration
  • Converting views and stored procedures to dbt models
  • Testing migration results against source systems
  • Deploying migrated dbt projects to production

Snowflake Migration Tools

Recommended Two-Step Approach

  1. Convert to Snowflake first: Use SnowConvert AI and its AI-powered code conversion to convert source database objects (from SQL Server, Oracle, Teradata, etc.) to Snowflake tables, views, and stored procedures.
  2. Then convert to dbt: Use the $dbt-migration-snowflake skill to migrate Snowflake objects to dbt models.

SnowConvert AI (Recommended for Supported Platforms)

SnowConvert AI converts source DDL, views, stored procedures, functions, and additional objects (triggers, sequences, indexes) to Snowflake-compatible SQL.

Download SnowConvert AI

Supported Platforms

  • Full support (tables, views, procedures, functions): SQL Server, Oracle, Teradata, Redshift, Azure Synapse, IBM DB2
  • Partial support (tables, views only): Sybase IQ, BigQuery, PostgreSQL, Spark SQL/Databricks, Hive, Vertica, Greenplum/Netezza

Platform-Specific Features

  • SQL Server: Direct DB connection, data migration, SSIS replatform
  • Oracle, Azure Synapse, Sybase IQ, BigQuery: DDL Extraction script
  • Teradata: BTEQ/MLOAD/TPUMP support
  • Redshift: Direct DB connection, data migration

Additional Snowflake Migration Tools

| Tool | Purpose |
| --- | --- |
| AI Code Conversion | AI-powered validation and repair of converted code |
| Migration Assistant | VS Code extension for resolving conversion issues (EWIs) |
| Data Migration | Transfer data to Snowflake (SQL Server, Redshift) |
| Data Validation | GUI-based validation (SQL Server) |
| Data Validation CLI | CLI validation (SQL Server, Teradata, Redshift) |
| ETL Replatform | Convert SSIS packages to dbt projects |
| Power BI Repointing | Redirect Power BI reports to Snowflake |

Migration Workflow Overview

The migration process follows seven sequential phases. Each phase has entry criteria, deliverables, and validation gates that must pass before advancing.

1-Discovery → 2-Planning → 3-Placeholders → 4-Views → 5-Table Logic → 6-Testing → 7-Deployment

Phase 1: Discovery and Assessment

Create a complete inventory of source database objects and understand dependencies, volumes, and complexity to inform migration planning.

SnowConvert AI Option: If your platform is supported, SnowConvert AI provides extraction scripts that automate object inventory, dependency mapping, and initial code conversion.

Phase 1 Activities

  1. Inventory source objects: Query system catalogs for tables, views, procedures, functions (see the query sketch after this list)
  2. Document dependencies: Map object dependencies to determine migration order
  3. Document volumes: Record row counts and data sizes
  4. Assess complexity: Categorize objects as Low/Medium/High/Custom complexity
  5. Create migration tracker: Document objects in spreadsheet or issue tracker
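
To illustrate step 1, a minimal inventory query, assuming the source system exposes a standard `information_schema` (catalog names and filters vary by platform):

```sql
-- Hypothetical inventory pull; run against the SOURCE system.
-- Most platforms (SQL Server, PostgreSQL, Snowflake) expose this view;
-- procedures and functions live in information_schema.routines.
select
    table_schema,
    table_name,
    table_type          -- 'BASE TABLE' or 'VIEW'
from information_schema.tables
where table_schema not in ('information_schema')
order by table_schema, table_name;
```

Exporting the result to a file such as `migration_inventory.csv` (the Phase 1 deliverable in the phase summary below) gives you a base to enrich with row counts and complexity ratings.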

Complexity Assessment

| Complexity | Criteria | Examples |
| --- | --- | --- |
| Low | Simple SELECT, no/minimal joins | Lookup tables, simple views |
| Medium | Multiple joins, aggregations, CASE | Summary views, report queries |
| High | Procedural logic, cursors, temp tables | SCD procedures, bulk loads |
| Custom | Platform-specific features | Wrapped code, CLR functions |

Phase 1 Checklist

  • All tables, views, procedures inventoried
  • Row counts documented
  • Object dependencies mapped
  • Complexity assessment complete
  • Migration tracker created
  • Refresh frequencies identified

Phase 2: Planning and Organization

Organize legacy scripts, map objects to the dbt medallion architecture, and establish naming conventions before any conversion begins.

Phase 2 Activities

  1. Organize legacy scripts: Create a folder structure (tables/, views/, stored_procedures/, functions/); see the layout sketch after this list
  2. Map to medallion layers: Assign objects to Bronze/Silver/Gold with appropriate prefixes
  3. Define naming conventions: Follow $dbt-architecture skill patterns
  4. Create dependency graph: Visualize migration order
  5. Establish validation criteria: Define success metrics per object
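
For step 1, one possible layout (the folder names come from the activity above; the `legacy/` root and the per-file placement are assumptions):

```
legacy/
├── tables/              (DDL extracted from the source system)
├── views/               (one .sql file per source view)
├── stored_procedures/   (one file per procedure)
└── functions/           (scalar and table-valued functions)
```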

Layer Mapping Reference

| Source Object Type | Target Layer | dbt Prefix | Materialization |
| --- | --- | --- | --- |
| Source tables (raw) | Bronze | `stg_` | ephemeral |
| Simple views | Bronze | `stg_` | ephemeral |
| Complex views | Silver | `int_` | ephemeral/table |
| Dimension procedures | Gold | `dim_` | table |
| Fact procedures | Gold | `fct_` | incremental |

Phase 2 Checklist

  • Legacy scripts organized in folders
  • All objects mapped to medallion layers
  • Naming conventions documented
  • Dependency graph created
  • Migration order established
  • Validation criteria defined

Phase 3: Create Placeholder Models

Create empty dbt models with correct column names, data types, and schema documentation before adding any transformation logic. This establishes the contract for downstream consumers.

Phase 3 Activities

  1. Generate placeholder models: Create SQL files with the `null::datatype as column_name` pattern and `where false`
  2. Map datatypes: Use the platform-specific skill for datatype conversion to Snowflake types
  3. Create schema documentation: Generate `_models.yml` with column descriptions and tests
  4. Validate compilation: Run `dbt compile --select tag:placeholder`
  5. Track status: Add a `placeholder` tag to the config for tracking

Placeholder Model Pattern

```sql
{{ config(materialized='ephemeral', tags=['placeholder', 'bronze']) }}

select
    null::integer as column_id,
    null::varchar(100) as column_name
    -- ... additional columns above, each with an explicit type cast
where false
```

Phase 3 Checklist

  • Placeholder model created for each target table
  • All columns have explicit datatype casts
  • Column names follow naming conventions
  • `_models.yml` created with descriptions and tests
  • All placeholder models compile successfully
  • Placeholder tag applied for tracking

Phase 4: Convert Views

Convert source database views to dbt models, starting with simple views before tackling complex ones. Views are typically easier than stored procedures as they contain declarative SQL.

Phase 4 Activities

  1. Prioritize by complexity: Simple views (no joins) → Join views → Aggregate views → Complex views
  2. Apply syntax translation: Delegate to platform-specific skill (see Related Skills)
  3. Structure with CTEs: Use standard CTE pattern from $dbt-modeling skill
  4. Add tests: Define tests in `_models.yml` using $dbt-testing skill patterns
  5. Replace placeholder logic: Update the placeholder SELECT with the converted logic (see the sketch after this list)
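
A sketch of steps 3 and 5 combined: the placeholder body is replaced with converted logic structured as import CTEs. All model and column names here are hypothetical:

```sql
{{ config(materialized='ephemeral', tags=['bronze']) }}

-- Hypothetically converted from a source view such as dbo.vw_order_summary.
with orders as (
    select * from {{ ref('stg_orders') }}
),

customers as (
    select * from {{ ref('stg_customers') }}
),

final as (
    select
        o.order_id,
        c.customer_name,
        o.order_total
    from orders o
    inner join customers c
        on o.customer_id = c.customer_id
)

select * from final
```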

Phase 4 Checklist

  • Views prioritized by complexity
  • Platform-specific syntax translated (delegate to source skills)
  • CTE pattern applied consistently
  • dbt tests added for each view
  • Converted views compile successfully
  • Inline comments document syntax changes

Phase 5: Convert Table Logic from Stored Procedures

Transform procedural stored procedure logic into declarative dbt models, selecting appropriate materializations for different ETL patterns.

Phase 5 Activities

  1. Analyze ETL patterns: Identify Full Refresh, SCD Type 1/2, Append, Delete+Insert patterns
  2. Map to materializations: Use pattern-to-materialization mapping from $dbt-materializations skill
  3. Break complex procedures: Split single procedures into multiple intermediate/final models
  4. Convert procedural constructs: Replace cursors, temp tables, variables with declarative SQL
  5. Document decisions: Add header comments explaining conversion approach

Pattern Mapping Reference

| Source Pattern | dbt Approach |
| --- | --- |
| TRUNCATE + INSERT | `materialized='table'` |
| UPDATE + INSERT (SCD1) | `materialized='incremental'` with `merge` |
| SCD Type 2 | dbt snapshot or custom incremental |
| INSERT only | `materialized='incremental'` with `append` |
| DELETE range + INSERT | `incremental` with `delete+insert` strategy |
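
For the UPDATE + INSERT (SCD1) row above, a minimal incremental sketch; the model, key, and timestamp names are hypothetical:

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='customer_id'
) }}

select
    customer_id,
    customer_name,
    updated_at
from {{ ref('stg_customers') }}

{% if is_incremental() %}
-- On incremental runs, only pull rows changed since the last build;
-- the merge strategy updates matching keys and inserts new ones.
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```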

Procedural to Declarative Conversion

| Procedural Pattern | dbt Equivalent |
| --- | --- |
| CURSOR loop | Window function or recursive CTE |
| Temp tables | CTEs or intermediate models |
| Variables | Jinja variables or macros |
| IF/ELSE branches | CASE expressions or `{% if %}` |
| TRY/CATCH | Pre-validation tests |
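
As an example of the first row, a cursor that walked rows to accumulate a per-account running balance usually collapses to a single window function. Table and column names here are hypothetical:

```sql
-- Declarative replacement for a row-by-row cursor computing running totals.
select
    account_id,
    txn_date,
    amount,
    sum(amount) over (
        partition by account_id
        order by txn_date
        rows between unbounded preceding and current row
    ) as running_balance
from {{ ref('stg_transactions') }}
```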

Phase 5 Checklist

  • All stored procedures analyzed for patterns
  • ETL patterns mapped to dbt materializations
  • Complex procedures broken into multiple models
  • Procedural logic converted to declarative SQL
  • Conversion decisions documented in model headers
  • All converted models compile successfully

Phase 6: End-to-End Testing and Validation

Verify that migrated dbt models produce results identical to the source system, using multiple validation techniques to ensure data integrity.

Snowflake Data Validation CLI: For SQL Server, Teradata, or Redshift migrations, the Data Validation CLI provides automated schema validation (columns, data types, row counts) and metrics validation (MIN, MAX, AVG, NULL count, DISTINCT count).

Phase 6 Activities

  1. Row count validation: Compare total counts between source and target
  2. Column checksum validation: Compare row-level hashes to identify differences
  3. Business rule validation: Verify calculated fields match source logic (see the singular test sketch after this list)
  4. Aggregate validation: Compare summary metrics (sums, counts, averages)
  5. Mock data testing: Create seed fixtures for complex transformation testing
  6. Incremental validation: Test both full-refresh and incremental runs
  7. Document results: Create validation report for each migrated object
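
For step 3, a singular test sketch: saved under `tests/`, it returns rows only when the rule is violated, so dbt passes the test when the query returns nothing. Model, column, and file names are hypothetical:

```sql
-- tests/assert_order_totals_match_lines.sql (hypothetical path)
-- Returns any order whose header total disagrees with the sum of its lines.
select
    o.order_id,
    o.order_total,
    sum(l.line_amount) as computed_total
from {{ ref('fct_orders') }} as o
inner join {{ ref('fct_order_lines') }} as l
    on o.order_id = l.order_id
group by o.order_id, o.order_total
having o.order_total <> sum(l.line_amount)
```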

Validation Techniques

| Technique | Purpose | Implementation |
| --- | --- | --- |
| Row counts | Detect missing/extra rows | Compare COUNT(*) |
| Checksums | Detect value differences | SHA2 hash comparison |
| Business rules | Verify logic accuracy | Singular tests |
| Aggregates | Validate totals | SUM/AVG comparisons |
| Mock data | Test transformations | Seed files + expected outputs |
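
For the first two techniques, one hedged approach is to run the same hash expression on both systems and diff the outputs. The hash function name varies by platform (SHA2 on Snowflake, HASHBYTES on SQL Server), and all table and column names below are hypothetical:

```sql
-- Snowflake side: one hash per row; run the equivalent query on the source
-- system and diff. Cast formats must match on both sides for hashes to agree.
select
    customer_id,
    sha2(concat_ws('|',
        coalesce(cast(customer_name as varchar), ''),
        coalesce(cast(email as varchar), ''),
        coalesce(cast(updated_at as varchar), '')
    )) as row_hash
from analytics.gold.dim_customer
order by customer_id;
```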

Phase 6 Checklist

  • Row count validation queries created
  • Checksum comparison implemented
  • Business rule tests written
  • Aggregate metrics compared
  • Incremental models tested (full refresh + incremental)
  • All validation queries pass
  • Discrepancies documented and resolved
  • Validation report completed

Phase 7: Deployment and Cutover

Deploy validated dbt models to production with a clear cutover plan and monitoring strategy.

Phase 7 Activities

  1. Deploy to Development: Run `dbt build --target dev` and validate
  2. Deploy to Test/UAT: Run the full validation suite with `--store-failures`
  3. Create cutover plan: Document pre-cutover, cutover, post-cutover, and rollback steps
  4. Deploy to Production: Execute deployment with production data
  5. Configure scheduled runs: Set up Snowflake tasks or dbt Cloud scheduling (see the task sketch after this list)
  6. Monitor post-deployment: Track run duration, row counts, test failures, performance
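
For step 5, a minimal Snowflake task sketch, assuming the dbt build has been wrapped in a callable stored procedure. All object names are hypothetical, and dbt Cloud or an external orchestrator is the more common choice:

```sql
-- Hypothetical nightly schedule; analytics.ops.run_dbt_build() is assumed to exist.
create or replace task analytics.ops.nightly_dbt_refresh
    warehouse = transform_wh
    schedule = 'USING CRON 0 2 * * * UTC'
as
    call analytics.ops.run_dbt_build();

-- Tasks are created suspended; resume to activate the schedule.
alter task analytics.ops.nightly_dbt_refresh resume;
```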

Cutover Plan Template

| Phase | Activities |
| --- | --- |
| Pre-Cutover (T-1) | Final validation, stakeholder sign-off, rollback docs, user communication |
| Cutover (T-0) | Disable source ETL, final sync, deploy, build, validate, update BI connections |
| Post-Cutover (T+1) | Monitor performance, verify schedules, confirm access, close tickets |
| Rollback | Re-enable source ETL, revert BI connections, document issues |

Phase 7 Checklist

  • Development deployment successful
  • Test/UAT deployment successful
  • Cutover plan documented
  • Rollback procedure documented
  • Stakeholder sign-off obtained
  • Production deployment successful
  • Scheduled runs configured
  • Monitoring set up
  • Migration marked complete

Related Skills

Platform-Specific Translation Skills

For syntax translation, delegate to the appropriate source-specific skill:

| Source Platform | Skill | Key Considerations |
| --- | --- | --- |
| Snowflake | $dbt-migration-snowflake | Convert Snowflake objects to dbt |
| SQL Server / Azure Synapse | $dbt-migration-ms-sql-server | T-SQL, IDENTITY, TOP, #temp tables |
| Oracle | $dbt-migration-oracle | PL/SQL, ROWNUM, CONNECT BY, packages |
| Teradata | $dbt-migration-teradata | QUALIFY, BTEQ, volatile tables |
| BigQuery | $dbt-migration-bigquery | UNNEST, STRUCT/ARRAY, backticks |
| Redshift | $dbt-migration-redshift | DISTKEY/SORTKEY, COPY/UNLOAD |
| PostgreSQL / Greenplum / Netezza | $dbt-migration-postgres | Array expressions, psql commands |
| IBM DB2 | $dbt-migration-db2 | SQL PL, FETCH FIRST, handlers |
| Hive / Spark / Databricks | $dbt-migration-hive | External tables, PARTITIONED BY |
| Vertica | $dbt-migration-vertica | Projections, flex tables |
| Sybase IQ | $dbt-migration-sybase | T-SQL variant, SELECT differences |

Quick Reference: Phase Summary

<!-- AGENT_WORKFLOW_METADATA: Machine-parseable phase definitions -->
| Phase | Key Deliverable | Exit Criteria | Primary Skill | Validation Focus | Validation Command |
| --- | --- | --- | --- | --- | --- |
| 1. Discovery | `migration_inventory.csv`, dependency graph | Inventory complete, dependencies mapped | This skill | Object counts, dependency completeness | Manual review |
| 2. Planning | Folder structure, `_naming_conventions.md` | Folder structure created, naming defined | $dbt-architecture | Folder hierarchy, naming conventions | `ls -la models/` |
| 3. Placeholders | `.sql` files, `_models.yml` | All models compile with `where false` | This skill | YAML structure, column definitions, naming | `dbt compile --select tag:placeholder` |
| 4. Views | Converted view models | All views converted and compiled | $dbt-migration-{source}, $dbt-modeling | Syntax translation, CTE patterns, ref() usage | `dbt build --select tag:view` |
| 5. Table Logic | Converted procedure models | All procedures converted | $dbt-materializations | Incremental configs, materialization patterns | `dbt build --select tag:procedure` |
| 6. Testing | Validation queries, test results | All validation queries pass | $dbt-testing, $dbt-performance | Test coverage, constraint definitions | `dbt test --store-failures` |
| 7. Deployment | Production models, monitoring | Production deployment successful | $dbt-commands, $snowflake-cli | Run success, schedule configuration | `dbt build --target prod` |

General Skills

  • $dbt-core: Local installation, configuration, package management
  • $snowflake-connections: Connection setup for Snowflake CLI, Streamlit, dbt

Validation Requirements

CRITICAL: Agents must not advance to the next phase until all validations pass.

Before advancing past each phase, verify that:

  1. `dbt compile` succeeds
  2. `dbt test` passes
  3. Validation hooks report no errors

Hook configuration is defined in `.claude/settings.local.json`.