Babysitter CDC Pattern Implementer

Implements Change Data Capture patterns for real-time data integration

install
source · Clone the upstream repo
git clone https://github.com/a5c-ai/babysitter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/data-engineering-analytics/skills/cdc-pattern-implementer" ~/.claude/skills/a5c-ai-babysitter-cdc-pattern-implementer && rm -rf "$T"
manifest: library/specializations/data-engineering-analytics/skills/cdc-pattern-implementer/SKILL.md
source content

CDC Pattern Implementer

Overview

Implements Change Data Capture (CDC) patterns for real-time data integration. The skill covers CDC configuration and implementation for relational source databases (PostgreSQL, MySQL, Oracle, SQL Server) and streaming targets (Kafka, Kinesis, Pub/Sub).

Capabilities

  • Debezium connector configuration
  • CDC pattern selection (log-based, trigger-based, timestamp-based)
  • Initial snapshot strategy
  • Schema change handling
  • Exactly-once delivery configuration
  • Sink connector setup
  • Tombstone handling
  • CDC monitoring setup
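
As a rough illustration of the connector, snapshot, and tombstone capabilities above, the sketch below builds a Kafka Connect registration payload for a Debezium PostgreSQL source. The function name and every argument value are hypothetical. The property names follow the Debezium 2.x PostgreSQL connector (`topic.prefix` replaced `database.server.name` in 2.0), so check them against the version you actually deploy.

```python
import json


def build_postgres_connector(name, host, port, user, password, dbname, tables, topic_prefix):
    """Return a Kafka Connect registration payload for a Debezium PostgreSQL source.

    Hypothetical helper: the skill itself does not define this function.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            # Log-based capture through PostgreSQL's built-in logical decoding plugin.
            "plugin.name": "pgoutput",
            "database.hostname": host,
            "database.port": str(port),
            "database.user": user,
            "database.password": password,
            "database.dbname": dbname,
            # Prefix for the topics the connector writes change events to.
            "topic.prefix": topic_prefix,
            # Schema-qualified tables to capture, comma separated.
            "table.include.list": ",".join(tables),
            # Snapshot existing rows once, then stream changes from the log.
            "snapshot.mode": "initial",
            # Emit a null-value record after each delete so compacted topics can drop the key.
            "tombstones.on.delete": "true",
        },
    }


if __name__ == "__main__":
    payload = build_postgres_connector(
        "inventory-cdc", "db.example.internal", 5432, "cdc_user", "change-me",
        "inventory", ["public.orders", "public.customers"], "inventory",
    )
    print(json.dumps(payload, indent=2))
```

The printed JSON is the shape accepted by the Kafka Connect REST API when creating a connector (`POST /connectors`). In practice the password would come from a secret store rather than a literal.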

Input Schema

{
  "sourceDatabase": {
    "type": "postgres|mysql|oracle|sqlserver",
    "connection": "object"
  },
  "tables": ["string"],
  "targetSystem": "kafka|kinesis|pubsub",
  "requirements": {
    "latencyMs": "number",
    "exactlyOnce": "boolean"
  }
}
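
The schema above can be checked before any configuration is generated. The following is a minimal sketch of such a check. `validate_request` is a hypothetical name, and two rules in it are assumptions not stated by the schema: that every top-level field is required, and that `tables` must be non-empty.

```python
SUPPORTED_SOURCES = {"postgres", "mysql", "oracle", "sqlserver"}
SUPPORTED_TARGETS = {"kafka", "kinesis", "pubsub"}


def validate_request(request):
    """Return a list of problems; an empty list means the request matches the schema."""
    errors = []

    source = request.get("sourceDatabase")
    if not isinstance(source, dict):
        errors.append("sourceDatabase must be an object")
    else:
        source_type = source.get("type")
        if not (isinstance(source_type, str) and source_type in SUPPORTED_SOURCES):
            errors.append("sourceDatabase.type must be one of: " + ", ".join(sorted(SUPPORTED_SOURCES)))
        if not isinstance(source.get("connection"), dict):
            errors.append("sourceDatabase.connection must be an object")

    tables = request.get("tables")
    # Assumption: an empty table list is treated as an error.
    if not isinstance(tables, list) or not tables or not all(isinstance(t, str) for t in tables):
        errors.append("tables must be a non-empty list of strings")

    target = request.get("targetSystem")
    if not (isinstance(target, str) and target in SUPPORTED_TARGETS):
        errors.append("targetSystem must be one of: " + ", ".join(sorted(SUPPORTED_TARGETS)))

    requirements = request.get("requirements")
    if not isinstance(requirements, dict):
        errors.append("requirements must be an object")
    else:
        latency = requirements.get("latencyMs")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(latency, bool) or not isinstance(latency, (int, float)):
            errors.append("requirements.latencyMs must be a number")
        if not isinstance(requirements.get("exactlyOnce"), bool):
            errors.append("requirements.exactlyOnce must be a boolean")

    return errors
```

A request such as `{"sourceDatabase": {"type": "postgres", "connection": {}}, "tables": ["public.orders"], "targetSystem": "kafka", "requirements": {"latencyMs": 1000, "exactlyOnce": true}}` passes with no errors.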

Output Schema

{
  "connectorConfig": "object",
  "snapshotStrategy": "object",
  "schemaConfig": "object",
  "monitoringConfig": "object",
  "documentation": "string"
}

Target Processes

  • ETL/ELT Pipeline
  • Streaming Pipeline
  • Data Warehouse Setup

Usage Guidelines

  1. Identify the source database and the tables to capture
  2. Choose the target streaming system (Kafka, Kinesis, or Pub/Sub)
  3. Specify the latency budget in milliseconds and whether exactly-once delivery is required
  4. Choose a snapshot strategy for the initial load
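
Step 3 can be translated into connector properties. The sketch below is one possible mapping for a Kafka target, not the skill's own logic: `delivery_overrides` is a hypothetical helper, and polling at half the latency budget is an arbitrary heuristic. `poll.interval.ms` is a Debezium connector property (500 ms by default), and `exactly.once.support` is a Kafka Connect source connector property that only takes effect when the Connect workers run with `exactly.once.source.support=enabled` and the connector supports it.

```python
def delivery_overrides(requirements, default_poll_ms=500):
    """Map the `requirements` object to connector property overrides (Kafka target assumed)."""
    overrides = {}

    latency_ms = requirements.get("latencyMs")
    if latency_ms is not None:
        # Heuristic (assumption): poll at most every half latency budget,
        # never slower than the default and never below 1 ms.
        poll_ms = max(1, min(default_poll_ms, int(latency_ms) // 2))
        overrides["poll.interval.ms"] = str(poll_ms)

    if requirements.get("exactlyOnce"):
        # Fail fast at startup if the connector cannot provide exactly-once delivery.
        overrides["exactly.once.support"] = "required"

    return overrides
```

For example, `delivery_overrides({"latencyMs": 200, "exactlyOnce": True})` yields a 100 ms poll interval plus the exactly-once flag, and the result can be merged into the connector's `config` object.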

Best Practices

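  • Prefer log-based CDC where the source supports it; reading the transaction log puts less load on the source than triggers or timestamp polling
  • Plan the initial snapshot carefully for large tables, since it reads every existing row before streaming begins
  • Handle errors explicitly and route records that cannot be processed to a dead letter queue
  • Monitor replication lag and connector health
  • Test schema evolution handling before going to production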
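
Two of these practices can be sketched briefly, assuming a Kafka Connect deployment. The `errors.*` properties are standard Kafka Connect settings, and the dead letter queue ones apply to sink connectors only. Both function names, the topic name, and the replication factor are hypothetical. The lag value is assumed to come from Debezium's `MilliSecondsBehindSource` streaming metric, collected by whatever monitoring stack is in place.

```python
def with_dead_letter_queue(sink_config, dlq_topic, replication_factor=3):
    """Return a copy of a sink connector config that routes failed records to a DLQ topic."""
    config = dict(sink_config)  # leave the caller's dict untouched
    config.update({
        # Keep running when a record fails conversion or transformation.
        "errors.tolerance": "all",
        "errors.deadletterqueue.topic.name": dlq_topic,
        "errors.deadletterqueue.topic.replication.factor": str(replication_factor),
        # Attach failure details (exception, source topic, offset) as record headers.
        "errors.deadletterqueue.context.headers.enable": "true",
        "errors.log.enable": "true",
    })
    return config


def lag_exceeded(milliseconds_behind_source, latency_budget_ms):
    """True when the reported replication lag is above the configured latency budget."""
    return milliseconds_behind_source > latency_budget_ms
```

With `errors.tolerance` set to `all` the connector no longer stops on bad records, so the dead letter topic itself needs an alert or a consumer; otherwise failures go unnoticed.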