
Database to dbt Migration Workflow

Install

Source · Clone the upstream repo:

```sh
git clone https://github.com/majiayu000/claude-skill-registry
```

Claude Code · Install into ~/.claude/skills/:

```sh
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/dbt-migration" ~/.claude/skills/majiayu000-claude-skill-registry-dbt-migration && rm -rf "$T"
```

Manifest: skills/data/dbt-migration/SKILL.md
Source Content

Database to dbt Migration Workflow

Purpose and When to Use

Guide AI agents through the complete migration lifecycle from Snowflake or legacy database systems (SQL Server, Oracle, Teradata, etc.) to production-quality dbt projects on Snowflake. This skill defines a structured, repeatable process while delegating platform-specific syntax translation to dedicated source-specific skills.

Activate this skill when users ask about:

  • Planning a database migration to dbt
  • Organizing legacy scripts for migration
  • Converting views and stored procedures to dbt models
  • Testing migration results against source systems
  • Deploying migrated dbt projects to production

Snowflake Migration Tools

Recommended Two-Step Approach

  1. Convert to Snowflake first: Use SnowConvert AI and its AI-powered code conversion to convert source database objects (from SQL Server, Oracle, Teradata, etc.) to Snowflake tables, views, and stored procedures.
  2. Then convert to dbt: Use the $dbt-migration-snowflake skill to migrate Snowflake objects to dbt models.

SnowConvert AI (Recommended for Supported Platforms)

SnowConvert AI converts source DDL, views, stored procedures, functions, and additional objects (triggers, sequences, indexes) to Snowflake-compatible SQL.

Download SnowConvert AI

Supported Platforms

  • Full support (tables, views, procedures, functions): SQL Server, Oracle, Teradata, Redshift, Azure Synapse, IBM DB2
  • Partial support (tables, views only): Sybase IQ, BigQuery, PostgreSQL, Spark SQL/Databricks, Hive, Vertica, Greenplum/Netezza

Platform-Specific Features

  • SQL Server: Direct DB connection, data migration, SSIS replatform
  • Oracle, Azure Synapse, Sybase IQ, BigQuery: DDL Extraction script
  • Teradata: BTEQ/MLOAD/TPUMP support
  • Redshift: Direct DB connection, data migration

Additional Snowflake Migration Tools

| Tool | Purpose |
| --- | --- |
| AI Code Conversion | AI-powered validation and repair of converted code |
| Migration Assistant | VS Code extension for resolving conversion issues (EWIs) |
| Data Migration | Transfer data to Snowflake (SQL Server, Redshift) |
| Data Validation | GUI-based validation (SQL Server) |
| Data Validation CLI | CLI validation (SQL Server, Teradata, Redshift) |
| ETL Replatform | Convert SSIS packages to dbt projects |
| Power BI Repointing | Redirect Power BI reports to Snowflake |

Migration Workflow Overview

The migration process follows seven sequential phases. Each phase has entry criteria, deliverables, and validation gates that must pass before advancing.

1-Discovery → 2-Planning → 3-Placeholders → 4-Views → 5-Table Logic → 6-Testing → 7-Deployment

Phase 1: Discovery and Assessment

Create a complete inventory of source database objects and understand dependencies, volumes, and complexity to inform migration planning.

SnowConvert AI Option: If your platform is supported, SnowConvert AI provides extraction scripts that automate object inventory, dependency mapping, and initial code conversion.

Phase 1 Activities

  1. Inventory source objects: Query system catalogs for tables, views, procedures, functions (see the query sketch after this list)
  2. Document dependencies: Map object dependencies to determine migration order
  3. Document volumes: Record row counts and data sizes
  4. Assess complexity: Categorize objects as Low/Medium/High/Custom complexity
  5. Create migration tracker: Document objects in spreadsheet or issue tracker
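
To illustrate step 1, a minimal inventory query, assuming the source system exposes a standard `information_schema` (catalog names and filters vary by platform):

```sql
-- Hypothetical inventory pull; run against the SOURCE system.
-- Most platforms (SQL Server, PostgreSQL, Snowflake) expose this view;
-- procedures and functions live in information_schema.routines.
select
    table_schema,
    table_name,
    table_type          -- 'BASE TABLE' or 'VIEW'
from information_schema.tables
where table_schema not in ('information_schema')
order by table_schema, table_name;
```

Exporting the result to a file such as `migration_inventory.csv` (the Phase 1 deliverable in the phase summary below) gives you a base to enrich with row counts and complexity ratings.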

Complexity Assessment

| Complexity | Criteria | Examples |
| --- | --- | --- |
| Low | Simple SELECT, no/minimal joins | Lookup tables, simple views |
| Medium | Multiple joins, aggregations, CASE | Summary views, report queries |
| High | Procedural logic, cursors, temp tables | SCD procedures, bulk loads |
| Custom | Platform-specific features | Wrapped code, CLR functions |

Phase 1 Checklist

  • All tables, views, procedures inventoried
  • Row counts documented
  • Object dependencies mapped
  • Complexity assessment complete
  • Migration tracker created
  • Refresh frequencies identified

Phase 2: Planning and Organization

Organize legacy scripts, map objects to the dbt medallion architecture, and establish naming conventions before any conversion begins.

Phase 2 Activities

  1. Organize legacy scripts: Create a folder structure (tables/, views/, stored_procedures/, functions/); see the layout sketch after this list
  2. Map to medallion layers: Assign objects to Bronze/Silver/Gold with appropriate prefixes
  3. Define naming conventions: Follow $dbt-architecture skill patterns
  4. Create dependency graph: Visualize migration order
  5. Establish validation criteria: Define success metrics per object
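
For step 1, one possible layout (the folder names come from the activity above; the `legacy/` root and the per-file placement are assumptions):

```
legacy/
├── tables/              (DDL extracted from the source system)
├── views/               (one .sql file per source view)
├── stored_procedures/   (one file per procedure)
└── functions/           (scalar and table-valued functions)
```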

Layer Mapping Reference

| Source Object Type | Target Layer | dbt Prefix | Materialization |
| --- | --- | --- | --- |
| Source tables (raw) | Bronze | `stg_` | ephemeral |
| Simple views | Bronze | `stg_` | ephemeral |
| Complex views | Silver | `int_` | ephemeral/table |
| Dimension procedures | Gold | `dim_` | table |
| Fact procedures | Gold | `fct_` | incremental |

Phase 2 Checklist

  • Legacy scripts organized in folders
  • All objects mapped to medallion layers
  • Naming conventions documented
  • Dependency graph created
  • Migration order established
  • Validation criteria defined

Phase 3: Create Placeholder Models

Create empty dbt models with correct column names, data types, and schema documentation before adding any transformation logic. This establishes the contract for downstream consumers.

Phase 3 Activities

  1. Generate placeholder models: Create SQL files with the `null::datatype as column_name` pattern and `where false`
  2. Map datatypes: Use the platform-specific skill for datatype conversion to Snowflake types
  3. Create schema documentation: Generate `_models.yml` with column descriptions and tests
  4. Validate compilation: Run `dbt compile --select tag:placeholder`
  5. Track status: Add a `placeholder` tag to the config for tracking

Placeholder Model Pattern

```sql
{{ config(materialized='ephemeral', tags=['placeholder', 'bronze']) }}

select
    null::integer as column_id,
    null::varchar(100) as column_name
    -- ... additional columns above, each with an explicit type cast
where false
```

Phase 3 Checklist

  • Placeholder model created for each target table
  • All columns have explicit datatype casts
  • Column names follow naming conventions
  • `_models.yml` created with descriptions and tests
  • All placeholder models compile successfully
  • Placeholder tag applied for tracking

Phase 4: Convert Views

Convert source database views to dbt models, starting with simple views before tackling complex ones. Views are typically easier than stored procedures as they contain declarative SQL.

Phase 4 Activities

  1. Prioritize by complexity: Simple views (no joins) → Join views → Aggregate views → Complex views
  2. Apply syntax translation: Delegate to platform-specific skill (see Related Skills)
  3. Structure with CTEs: Use standard CTE pattern from $dbt-modeling skill
  4. Add tests: Define tests in `_models.yml` using $dbt-testing skill patterns
  5. Replace placeholder logic: Update the placeholder SELECT with the converted logic (see the sketch after this list)
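
A sketch of steps 3 and 5 combined: the placeholder body is replaced with converted logic structured as import CTEs. All model and column names here are hypothetical:

```sql
{{ config(materialized='ephemeral', tags=['bronze']) }}

-- Hypothetically converted from a source view such as dbo.vw_order_summary.
with orders as (
    select * from {{ ref('stg_orders') }}
),

customers as (
    select * from {{ ref('stg_customers') }}
),

final as (
    select
        o.order_id,
        c.customer_name,
        o.order_total
    from orders o
    inner join customers c
        on o.customer_id = c.customer_id
)

select * from final
```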

Phase 4 Checklist

  • Views prioritized by complexity
  • Platform-specific syntax translated (delegate to source skills)
  • CTE pattern applied consistently
  • dbt tests added for each view
  • Converted views compile successfully
  • Inline comments document syntax changes

Phase 5: Convert Table Logic from Stored Procedures

Transform procedural stored procedure logic into declarative dbt models, selecting appropriate materializations for different ETL patterns.

Phase 5 Activities

  1. Analyze ETL patterns: Identify Full Refresh, SCD Type 1/2, Append, Delete+Insert patterns
  2. Map to materializations: Use pattern-to-materialization mapping from $dbt-materializations skill
  3. Break complex procedures: Split single procedures into multiple intermediate/final models
  4. Convert procedural constructs: Replace cursors, temp tables, variables with declarative SQL
  5. Document decisions: Add header comments explaining conversion approach

Pattern Mapping Reference

| Source Pattern | dbt Approach |
| --- | --- |
| TRUNCATE + INSERT | `materialized='table'` |
| UPDATE + INSERT (SCD1) | `materialized='incremental'` with `merge` |
| SCD Type 2 | dbt snapshot or custom incremental |
| INSERT only | `materialized='incremental'` with `append` |
| DELETE range + INSERT | `incremental` with `delete+insert` strategy |
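
For the UPDATE + INSERT (SCD1) row above, a minimal incremental sketch; the model, key, and timestamp names are hypothetical:

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='customer_id'
) }}

select
    customer_id,
    customer_name,
    updated_at
from {{ ref('stg_customers') }}

{% if is_incremental() %}
-- On incremental runs, only pull rows changed since the last build;
-- the merge strategy updates matching keys and inserts new ones.
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```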

Procedural to Declarative Conversion

| Procedural Pattern | dbt Equivalent |
| --- | --- |
| CURSOR loop | Window function or recursive CTE |
| Temp tables | CTEs or intermediate models |
| Variables | Jinja variables or macros |
| IF/ELSE branches | CASE expressions or `{% if %}` |
| TRY/CATCH | Pre-validation tests |
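
As an example of the first row, a cursor that walked rows to accumulate a per-account running balance usually collapses to a single window function. Table and column names here are hypothetical:

```sql
-- Declarative replacement for a row-by-row cursor computing running totals.
select
    account_id,
    txn_date,
    amount,
    sum(amount) over (
        partition by account_id
        order by txn_date
        rows between unbounded preceding and current row
    ) as running_balance
from {{ ref('stg_transactions') }}
```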

Phase 5 Checklist

  • All stored procedures analyzed for patterns
  • ETL patterns mapped to dbt materializations
  • Complex procedures broken into multiple models
  • Procedural logic converted to declarative SQL
  • Conversion decisions documented in model headers
  • All converted models compile successfully

Phase 6: End-to-End Testing and Validation

Verify that migrated dbt models produce results identical to the source system, using multiple validation techniques to ensure data integrity.

Snowflake Data Validation CLI: For SQL Server, Teradata, or Redshift migrations, the Data Validation CLI provides automated schema validation (columns, data types, row counts) and metrics validation (MIN, MAX, AVG, NULL count, DISTINCT count).

Phase 6 Activities

  1. Row count validation: Compare total counts between source and target
  2. Column checksum validation: Compare row-level hashes to identify differences
  3. Business rule validation: Verify calculated fields match source logic (see the singular test sketch after this list)
  4. Aggregate validation: Compare summary metrics (sums, counts, averages)
  5. Mock data testing: Create seed fixtures for complex transformation testing
  6. Incremental validation: Test both full-refresh and incremental runs
  7. Document results: Create validation report for each migrated object
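
For step 3, a singular test sketch: saved under `tests/`, it returns rows only when the rule is violated, so dbt passes the test when the query returns nothing. Model, column, and file names are hypothetical:

```sql
-- tests/assert_order_totals_match_lines.sql (hypothetical path)
-- Returns any order whose header total disagrees with the sum of its lines.
select
    o.order_id,
    o.order_total,
    sum(l.line_amount) as computed_total
from {{ ref('fct_orders') }} as o
inner join {{ ref('fct_order_lines') }} as l
    on o.order_id = l.order_id
group by o.order_id, o.order_total
having o.order_total <> sum(l.line_amount)
```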

Validation Techniques

| Technique | Purpose | Implementation |
| --- | --- | --- |
| Row counts | Detect missing/extra rows | Compare COUNT(*) |
| Checksums | Detect value differences | SHA2 hash comparison |
| Business rules | Verify logic accuracy | Singular tests |
| Aggregates | Validate totals | SUM/AVG comparisons |
| Mock data | Test transformations | Seed files + expected outputs |
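
For the first two techniques, one hedged approach is to run the same hash expression on both systems and diff the outputs. The hash function name varies by platform (SHA2 on Snowflake, HASHBYTES on SQL Server), and all table and column names below are hypothetical:

```sql
-- Snowflake side: one hash per row; run the equivalent query on the source
-- system and diff. Cast formats must match on both sides for hashes to agree.
select
    customer_id,
    sha2(concat_ws('|',
        coalesce(cast(customer_name as varchar), ''),
        coalesce(cast(email as varchar), ''),
        coalesce(cast(updated_at as varchar), '')
    )) as row_hash
from analytics.gold.dim_customer
order by customer_id;
```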

Phase 6 Checklist

  • Row count validation queries created
  • Checksum comparison implemented
  • Business rule tests written
  • Aggregate metrics compared
  • Incremental models tested (full refresh + incremental)
  • All validation queries pass
  • Discrepancies documented and resolved
  • Validation report completed

Phase 7: Deployment and Cutover

Deploy validated dbt models to production with a clear cutover plan and monitoring strategy.

Phase 7 Activities

  1. Deploy to Development: Run `dbt build --target dev` and validate
  2. Deploy to Test/UAT: Run the full validation suite with `--store-failures`
  3. Create cutover plan: Document pre-cutover, cutover, post-cutover, and rollback steps
  4. Deploy to Production: Execute deployment with production data
  5. Configure scheduled runs: Set up Snowflake tasks or dbt Cloud scheduling (see the task sketch after this list)
  6. Monitor post-deployment: Track run duration, row counts, test failures, performance
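
For step 5, a minimal Snowflake task sketch, assuming the dbt build has been wrapped in a callable stored procedure. All object names are hypothetical, and dbt Cloud or an external orchestrator is the more common choice:

```sql
-- Hypothetical nightly schedule; analytics.ops.run_dbt_build() is assumed to exist.
create or replace task analytics.ops.nightly_dbt_refresh
    warehouse = transform_wh
    schedule = 'USING CRON 0 2 * * * UTC'
as
    call analytics.ops.run_dbt_build();

-- Tasks are created suspended; resume to activate the schedule.
alter task analytics.ops.nightly_dbt_refresh resume;
```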

Cutover Plan Template

| Phase | Activities |
| --- | --- |
| Pre-Cutover (T-1) | Final validation, stakeholder sign-off, rollback docs, user communication |
| Cutover (T-0) | Disable source ETL, final sync, deploy, build, validate, update BI connections |
| Post-Cutover (T+1) | Monitor performance, verify schedules, confirm access, close tickets |
| Rollback | Re-enable source ETL, revert BI connections, document issues |

Phase 7 Checklist

  • Development deployment successful
  • Test/UAT deployment successful
  • Cutover plan documented
  • Rollback procedure documented
  • Stakeholder sign-off obtained
  • Production deployment successful
  • Scheduled runs configured
  • Monitoring set up
  • Migration marked complete

Related Skills

Platform-Specific Translation Skills

For syntax translation, delegate to the appropriate source-specific skill:

| Source Platform | Skill | Key Considerations |
| --- | --- | --- |
| Snowflake | $dbt-migration-snowflake | Convert Snowflake objects to dbt |
| SQL Server / Azure Synapse | $dbt-migration-ms-sql-server | T-SQL, IDENTITY, TOP, #temp tables |
| Oracle | $dbt-migration-oracle | PL/SQL, ROWNUM, CONNECT BY, packages |
| Teradata | $dbt-migration-teradata | QUALIFY, BTEQ, volatile tables |
| BigQuery | $dbt-migration-bigquery | UNNEST, STRUCT/ARRAY, backticks |
| Redshift | $dbt-migration-redshift | DISTKEY/SORTKEY, COPY/UNLOAD |
| PostgreSQL / Greenplum / Netezza | $dbt-migration-postgres | Array expressions, psql commands |
| IBM DB2 | $dbt-migration-db2 | SQL PL, FETCH FIRST, handlers |
| Hive / Spark / Databricks | $dbt-migration-hive | External tables, PARTITIONED BY |
| Vertica | $dbt-migration-vertica | Projections, flex tables |
| Sybase IQ | $dbt-migration-sybase | T-SQL variant, SELECT differences |

Quick Reference: Phase Summary

<!-- AGENT_WORKFLOW_METADATA: Machine-parseable phase definitions -->
| Phase | Key Deliverable | Exit Criteria | Primary Skill | Validation Focus | Validation Command |
| --- | --- | --- | --- | --- | --- |
| 1. Discovery | `migration_inventory.csv`, dependency graph | Inventory complete, dependencies mapped | This skill | Object counts, dependency completeness | Manual review |
| 2. Planning | Folder structure, `_naming_conventions.md` | Folder structure created, naming defined | $dbt-architecture | Folder hierarchy, naming conventions | `ls -la models/` |
| 3. Placeholders | `.sql` files, `_models.yml` | All models compile with `where false` | This skill | YAML structure, column definitions, naming | `dbt compile --select tag:placeholder` |
| 4. Views | Converted view models | All views converted and compiled | $dbt-migration-{source}, $dbt-modeling | Syntax translation, CTE patterns, ref() usage | `dbt build --select tag:view` |
| 5. Table Logic | Converted procedure models | All procedures converted | $dbt-materializations | Incremental configs, materialization patterns | `dbt build --select tag:procedure` |
| 6. Testing | Validation queries, test results | All validation queries pass | $dbt-testing, $dbt-performance | Test coverage, constraint definitions | `dbt test --store-failures` |
| 7. Deployment | Production models, monitoring | Production deployment successful | $dbt-commands, $snowflake-cli | Run success, schedule configuration | `dbt build --target prod` |

General Skills

  • $dbt-core: Local installation, configuration, package management
  • $snowflake-connections: Connection setup for Snowflake CLI, Streamlit, dbt

Validation Requirements

CRITICAL: Agents must not advance to the next phase until all validations pass.

Before advancing past each phase, verify that:

  1. `dbt compile` succeeds
  2. `dbt test` passes
  3. Validation hooks report no errors

Hook configuration is defined in `.claude/settings.local.json`.