Claude-skill-registry data-contract
Create, validate, test, and manage data contracts using the Open Data Contract Specification (ODCS) and the datacontract CLI. Use when working with data contracts, ODCS specifications, data quality rules, or when the user mentions datacontract CLI or data contract workflows.
```shell
git clone https://github.com/majiayu000/claude-skill-registry
```
```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/data-contract" ~/.claude/skills/majiayu000-claude-skill-registry-data-contract && rm -rf "$T"
```
skills/data/data-contract/SKILL.md
Data Contract Management
Overview
This skill helps you work with data contracts following the Open Data Contract Specification (ODCS). You can execute the datacontract CLI directly to create, lint, test, and export data contracts.
When to Use This Skill
- Creating new data contracts
- Validating contracts against ODCS specification
- Testing data quality rules
- Generating platform-specific artifacts (dbt, SQL, etc.)
- Working with data contract YAML files
- Discussing data quality patterns and SLAs
Available CLI Commands
Core Commands
- Initialize a new data contract: `datacontract init [--template PLATFORM]`
- Validate against the ODCS spec: `datacontract lint {description}-contract.yaml`
- Run data quality tests: `datacontract test {description}-contract.yaml`
- Export to other formats: `datacontract export {description}-contract.yaml --format FORMAT`
- Check for best practices: `datacontract lint {description}-contract.yaml`
- Check for breaking changes: `datacontract breaking {description}-contract.yaml {description}-contract-v2.yaml`
Supported Platforms
Templates available: snowflake, bigquery, redshift, databricks, postgres, s3, local
Export Formats
Available formats: dbt, dbt-sources, dbt-staging-sql, odcs, jsonschema, sql, sqlalchemy, avro, protobuf, great-expectations, terraform, rdf
Workflow Patterns
Creating a New Data Contract
Step 1: Gather Requirements Ask the user for:
- Data platform (Snowflake, BigQuery, Databricks, etc.)
- Schema details:
- Database/dataset name
- Table/model name
- Column names, types, and descriptions
- Quality requirements:
- Freshness expectations (how often data updates)
- Completeness rules (which fields must not be null)
- Uniqueness constraints (primary keys, unique fields)
- Validity rules (patterns, ranges, allowed values)
- SLAs and policies:
- Availability commitments
- Retention policies
- Access/privacy considerations
Step 2: Initialize Contract
```shell
datacontract init --template <platform>
```
This creates a basic contract template for the specified platform.
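The exact output varies by CLI version and platform, but the generated skeleton typically resembles the following sketch (the id, owner, and server values here are placeholders, not actual CLI output):

```yaml
dataContractSpecification: 0.9.3   # spec version pinned by the CLI
id: my-data-contract               # placeholder identifier
info:
  title: My Data Contract
  version: 0.0.1
  owner: my-team                   # placeholder owner
servers:
  production:
    type: snowflake                # matches the --template platform
schema: {}                         # filled in during customization
```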
Step 3: Customize the Contract Edit the generated YAML to include:
- Specific schema details
- Column definitions with types and descriptions
- Quality rules based on requirements
- SLA specifications
Step 4: Lint
```shell
datacontract lint {description}-contract.yaml
```
Always validate after creation or any modifications.
Step 5: Iterate Based on Validation If validation fails:
- Parse error messages carefully
- Fix structural issues first (missing required fields, incorrect format)
- Then address semantic issues (invalid values, incorrect references)
- Re-validate after each fix
- Maximum 3 iteration cycles before asking user for guidance
Validation Error Handling
When validation fails, follow this pattern:
- Read the error output - Identify specific issues (missing fields, type mismatches, invalid values)
- Prioritize fixes:
- Required ODCS fields first (dataset, schema, columns)
- Format/structure issues
- Type and value constraints
- Fix one category at a time - Don't try to fix everything at once
- Validate after each fix - Ensures you're making progress
- Explain what you fixed - Help the user understand the changes
Testing Data Quality
```shell
datacontract test {description}-contract.yaml
```
Tests validate that actual data meets the quality rules defined in the contract. This requires:
- Access to the data platform
- Appropriate connection credentials
- Data actually existing at the specified location
If tests fail, help the user understand:
- Which quality rules failed
- Whether the contract needs adjustment or the data needs fixing
ODCS Structure Essentials
Required Top-Level Fields
Every data contract must include:
- `dataContractSpecification` - Version of ODCS (e.g., "0.9.3")
- `id` - Unique identifier for the contract
- `info` - Metadata (title, version, description, owner, contact)
- `servers` - Data platform connection details
- `schema` - The data model definition
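Putting these together, a minimal contract might look like the following sketch (the id, table, column, and connection values are illustrative, and exact field names can differ between spec versions):

```yaml
dataContractSpecification: 0.9.3
id: sales-orders-contract          # illustrative identifier
info:
  title: Customer Orders
  version: 1.0.0
  description: Orders placed through the web shop
  owner: sales-data-team           # illustrative owner
servers:
  production:
    type: snowflake
    database: SALES                # illustrative connection details
    schema: PUBLIC
schema:
  orders:
    type: table
    columns:
      order_id:
        type: VARCHAR
        required: true
        primary: true
        description: Unique order identifier
```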
Schema Definition
The schema section defines your data model:
```yaml
schema:
  type: dbt | table | view | ...
  specification: dbt | bigquery | snowflake | ...
  <table_name>:
    type: table
    columns:
      <column_name>:
        type: <data_type>
        required: true|false
        description: "..."
        unique: true|false
        primary: true|false
```
Quality Rules
Define quality expectations:
```yaml
quality:
  type: SodaCL | great-expectations | ...
  specification:
    checks for <table_name>:
      - freshness(<column>) < 24h
      - missing_count(<column>) = 0
      - duplicate_count(<column>) = 0
      - values in (<column>) must be in [list]
```
Common Quality Patterns
Freshness
How recently data was updated:
- `freshness(updated_at) < 1h` - Data updated within the last hour
- `freshness(load_date) < 1d` - Daily updates
Completeness
Ensuring required data exists:
- `missing_count(customer_id) = 0` - No nulls in a required field
- `missing_percent(email) < 5%` - Allow some missing values
Uniqueness
Preventing duplicates:
- `duplicate_count(order_id) = 0` - Primary keys must be unique
- `duplicate_count(user_id, timestamp) = 0` - Composite uniqueness
Validity
Data meets expected patterns:
- `values in (status) must be in ['pending', 'completed', 'cancelled']` - Allowed values
- `invalid_percent(email) < 1%` - Email format validation
- `min(price) >= 0` - Range constraints
- `max_length(postal_code) = 5` - Length constraints
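The patterns above can be combined into a single quality block using SodaCL-style checks (the table and column names are illustrative):

```yaml
quality:
  type: SodaCL
  specification:
    checks for orders:
      - freshness(updated_at) < 1d     # loaded within the last day
      - missing_count(order_id) = 0    # required key, no nulls
      - duplicate_count(order_id) = 0  # primary-key uniqueness
      - missing_percent(email) < 5%    # tolerate some missing emails
      - invalid_percent(email) < 1%    # email format validation
      - min(price) >= 0                # range constraint
      - values in (status) must be in ['pending', 'completed', 'cancelled']
```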
Best Practices
Contract Creation
- Start with required ODCS fields to ensure valid structure
- Use platform-appropriate data types (Snowflake: VARCHAR, BigQuery: STRING, etc.)
- Include clear descriptions for all columns - aids documentation
- Mark primary keys and required fields explicitly
- Consider downstream dependencies when defining schema
Quality Rules
- Add quality rules incrementally - start with critical rules
- Be realistic with thresholds - 100% quality isn't always achievable
- Match rules to business requirements, not technical ideals
- Test rules on actual data before finalizing
- Document why specific thresholds were chosen
Validation Workflow
- Validate early and often
- Fix structural issues before semantic ones
- Keep validation output for reference
- Re-validate after any change
- Use lint command for additional best practice checks
Platform-Specific Considerations
- Snowflake: Use fully qualified names (database.schema.table)
- BigQuery: Use dataset.table naming
- Databricks: Consider Unity Catalog structure
- S3: Include bucket and path information
- Local: Specify file paths clearly
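As a sketch, the servers section for two of these platforms might look like this (account, project, database, and dataset names are placeholders):

```yaml
servers:
  snowflake_prod:
    type: snowflake
    account: my-account        # placeholder account locator
    database: SALES            # fully qualified as database.schema.table
    schema: PUBLIC
  bigquery_prod:
    type: bigquery
    project: my-gcp-project    # placeholder GCP project
    dataset: sales             # dataset.table naming
```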
Exporting Contracts
Generate platform-specific artifacts:
```shell
datacontract export {description}-contract.yaml --format dbt
datacontract export {description}-contract.yaml --format sql
datacontract export {description}-contract.yaml --format great-expectations
```
Common use cases:
- dbt: Generate dbt model YAML and staging SQL
- sql: Create DDL statements for database setup
- great-expectations: Generate expectations for data validation
- terraform: Infrastructure as code for data resources
Tips for Effective Use
- Be conversational - Ask clarifying questions rather than guessing requirements
- Show your work - Explain what commands you're running and why
- Parse errors carefully - CLI output can be verbose; extract key issues
- Iterate transparently - Show validation results and explain fixes
- Suggest improvements - Recommend quality rules based on data types
- Reference documentation - When unsure, mention you can look up ODCS spec details
Example Interaction Flow
User: "Create a data contract for our customer orders table in Snowflake"
You should:
- Ask about schema (columns, types), quality needs, and SLAs
- Run `datacontract init --template snowflake`
- Explain the generated structure
- Customize based on user's answers
- Run `datacontract lint {description}-contract.yaml`
- If errors, fix and re-validate
- Suggest quality rules based on the schema
- Offer to test or export as needed
Resources
When you need more details:
- ODCS Specification: https://datacontract.com
- CLI Documentation: https://cli.datacontract.com
- Example contracts in the datacontract repository
Remember: The datacontract CLI is your primary tool. Execute commands directly, parse output carefully, and guide the user through the process with clear explanations.