Ai-design-components managing-dns
Manage DNS records, TTL strategies, and DNS-as-code automation for infrastructure. Use when configuring domain resolution, automating DNS from Kubernetes with external-dns, setting up DNS-based load balancing, or troubleshooting propagation issues across cloud providers (Route53, Cloud DNS, Azure DNS, Cloudflare).
git clone https://github.com/ancoleman/ai-design-components
T=$(mktemp -d) && git clone --depth=1 https://github.com/ancoleman/ai-design-components "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/managing-dns" ~/.claude/skills/ancoleman-ai-design-components-managing-dns && rm -rf "$T"
skills/managing-dns/SKILL.md
DNS Management
Configure and automate DNS records with proper TTL strategies, DNS-as-code patterns, and troubleshooting techniques.
Purpose
Guide DNS configuration for applications, infrastructure, and services with focus on:
- Record type selection (A, AAAA, CNAME, MX, TXT, SRV, CAA)
- TTL strategies for propagation and caching
- DNS-as-code automation (external-dns, OctoDNS, DNSControl)
- Cloud DNS services comparison and selection
- DNS-based load balancing patterns
- Troubleshooting tools and techniques
When to Use This Skill
Apply DNS management patterns when:
- Setting up DNS for new applications or services
- Automating DNS updates from Kubernetes workloads
- Configuring DNS-based failover or load balancing
- Troubleshooting DNS propagation or resolution issues
- Migrating DNS between providers
- Planning DNS changes with minimal downtime
- Implementing GeoDNS for global users
Record Type Selection
Quick Reference
Address Resolution:
- A Record: Map hostname to IPv4 address (example.com → 192.0.2.1)
- AAAA Record: Map hostname to IPv6 address (example.com → 2001:db8::1)
- CNAME Record: Alias to another domain (www.example.com → example.com)
  - Cannot use at zone apex (@)
  - Cannot coexist with other records at same name
Email Configuration:
- MX Record: Direct email to mail servers with priority
- TXT Record: Email authentication (SPF, DKIM, DMARC) and verification
Service Discovery:
- SRV Record: Specify service location (protocol, priority, weight, port, target)
Delegation and Security:
- NS Record: Delegate subdomain to different nameservers
- CAA Record: Restrict which Certificate Authorities can issue certificates
Cloud-Specific:
- ALIAS Record: Like CNAME but works at zone apex (Route53 ALIAS; Cloudflare offers the equivalent as CNAME flattening)
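The two CNAME restrictions above are easy to check mechanically before pushing a zone. A minimal sketch, assuming records are available as `(name, type)` tuples; the function name `find_cname_violations` is hypothetical, and the check is simplified (it ignores DNSSEC record types, which are allowed next to a CNAME):

```python
from collections import defaultdict

def find_cname_violations(zone, records):
    """Return human-readable CNAME rule violations for one zone.

    zone    -- zone apex, e.g. "example.com"
    records -- iterable of (fully_qualified_name, record_type) tuples
    """
    # Group record types by normalized owner name.
    types_by_name = defaultdict(set)
    for name, rtype in records:
        types_by_name[name.rstrip(".").lower()].add(rtype.upper())

    apex = zone.rstrip(".").lower()
    problems = []
    for name, types in sorted(types_by_name.items()):
        if "CNAME" not in types:
            continue
        # Rule 1: no CNAME at the zone apex.
        if name == apex:
            problems.append(f"{name}: CNAME not allowed at zone apex")
        # Rule 2: a CNAME cannot share its name with other record types.
        others = sorted(types - {"CNAME"})
        if others:
            problems.append(f"{name}: CNAME coexists with {', '.join(others)}")
    return problems
```

Calling it with `[("www.example.com", "CNAME"), ("example.com", "A")]` returns an empty list; putting a CNAME on `example.com` itself reports the apex violation.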
Decision Tree
    Need to point domain to:
    ├─ IPv4 Address? → A record
    ├─ IPv6 Address? → AAAA record
    ├─ Another Domain?
    │  ├─ Zone apex (@) → ALIAS/ANAME or A record
    │  └─ Subdomain → CNAME
    ├─ Mail Server? → MX record (with priority)
    ├─ Email Authentication? → TXT record (SPF/DKIM/DMARC)
    ├─ Service Discovery? → SRV record
    ├─ Domain Verification? → TXT record
    ├─ Certificate Control? → CAA record
    └─ Subdomain Delegation? → NS record

For detailed record type examples and patterns, see references/record-types.md.
TTL Strategy
Standard TTL Values
By Change Frequency:
- Stable records: 3600-86400s (1-24 hours) - NS, stable A/AAAA
- Normal operation: 3600s (1 hour) - Standard websites, MX
- Moderate changes: 300-1800s (5-30 min) - Development, A/B testing
- Failover scenarios: 60-300s (1-5 min) - Critical records needing fast updates
Key Principle: Lower TTL = faster propagation but higher DNS query load
Pre-Change Process
When planning DNS changes:
    T-48h: Lower TTL to 300s
    T-24h: Verify TTL propagated globally
    T-0h:  Make DNS change
    T+1h:  Verify new records propagating
    T+6h:  Confirm global propagation
    T+24h: Raise TTL back to normal (3600s)
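To turn the timeline into concrete calendar times, the offsets can be applied to the planned change time. A small sketch; `STEPS` and `change_schedule` are illustrative names, not part of this skill's scripts:

```python
from datetime import datetime, timedelta

# Offsets in hours relative to the change, copied from the timeline above.
STEPS = [
    (-48, "Lower TTL to 300s"),
    (-24, "Verify TTL propagated globally"),
    (0, "Make DNS change"),
    (1, "Verify new records propagating"),
    (6, "Confirm global propagation"),
    (24, "Raise TTL back to normal (3600s)"),
]

def change_schedule(change_time):
    """Return (when, action) pairs for a change planned at change_time."""
    return [(change_time + timedelta(hours=h), action) for h, action in STEPS]

# Example: a change planned for 10 Jan 2024, 12:00.
for when, action in change_schedule(datetime(2024, 1, 10, 12, 0)):
    print(when.isoformat(timespec="minutes"), action)
```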
Propagation Formula:
Max Time = Old TTL + New TTL + Query Time
Example: Changing a record that has a 3600s TTL, and publishing the new value with the same 3600s TTL, can take up to 2 hours (3600s + 3600s) to fully propagate.
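The formula can be evaluated directly to compare a change made at the normal TTL with one made after lowering the TTL first. A sketch that only restates the formula above; the function name is hypothetical:

```python
def max_propagation_seconds(old_ttl, new_ttl, query_time=0):
    """Worst case per the formula: Old TTL + New TTL + Query Time."""
    return old_ttl + new_ttl + query_time

# Change made at the normal TTL: up to 2 hours.
unprepared = max_propagation_seconds(3600, 3600)

# TTL lowered to 300s beforehand and the new record also published at 300s.
prepared = max_propagation_seconds(300, 300)

print(unprepared / 3600, "hours vs", prepared / 60, "minutes")
```

This is why the pre-change process lowers the TTL well ahead of the change: the worst case drops from hours to minutes.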
TTL by Use Case
| Use Case | TTL | Rationale |
|---|---|---|
| Production (stable) | 3600s | Balance speed and load |
| Before planned change | 300s | Fast propagation |
| Development/staging | 300-600s | Frequent changes |
| DNS-based failover | 60-300s | Fast recovery |
| Mail servers | 3600s | Rarely change |
| NS records | 86400s | Very stable |
For detailed TTL scenarios and calculations, see references/ttl-strategies.md.
DNS-as-Code Tools
Tool Selection by Use Case
Kubernetes DNS Automation → external-dns
- Annotation-based configuration on Services/Ingresses
- Automatic sync to DNS providers (20+ supported)
- No manual DNS updates required
- See examples/external-dns/
Multi-Provider DNS Management → OctoDNS or DNSControl
- Version control for DNS records
- Sync configuration across multiple providers
- Preview changes before applying
- OctoDNS (Python/YAML) - See examples/octodns/
- DNSControl (JavaScript) - See examples/dnscontrol/
Infrastructure-as-Code → Terraform
- Manage DNS alongside cloud resources
- Provider-specific resources (aws_route53_record, etc.)
- See examples/terraform/
Tool Comparison
| Tool | Language | Best For | Kubernetes | Multi-Provider |
|---|---|---|---|---|
| external-dns | Go | K8s automation | ★★★★★ | ★★★★ |
| OctoDNS | Python/YAML | Version control | ★★★ | ★★★★★ |
| DNSControl | JavaScript | Complex logic | ★★ | ★★★★★ |
| Terraform | HCL | IaC integration | ★★★ | ★★★★ |
Quick Start: external-dns
    # Kubernetes Service with DNS annotation
    apiVersion: v1
    kind: Service
    metadata:
      name: app
      annotations:
        external-dns.alpha.kubernetes.io/hostname: app.example.com
        external-dns.alpha.kubernetes.io/ttl: "300"
    spec:
      type: LoadBalancer
      ports:
        - port: 80
Deploy external-dns controller once, then all annotated Services/Ingresses automatically create DNS records.
For complete examples, see examples/external-dns/ and references/dns-as-code-comparison.md.
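When many Services need the same annotations, generating the manifest avoids spelling mistakes in the annotation keys. A minimal sketch using only the standard library; `service_manifest` is a hypothetical helper, and the output mirrors the quick-start Service (no selector is set, as in the original). `kubectl apply -f` accepts JSON as well as YAML:

```python
import json

HOSTNAME_KEY = "external-dns.alpha.kubernetes.io/hostname"
TTL_KEY = "external-dns.alpha.kubernetes.io/ttl"

def service_manifest(name, hostname, ttl=300, port=80):
    """Build a LoadBalancer Service dict carrying external-dns annotations."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "annotations": {
                HOSTNAME_KEY: hostname,
                # Annotation values must be strings, hence str(ttl).
                TTL_KEY: str(ttl),
            },
        },
        "spec": {"type": "LoadBalancer", "ports": [{"port": port}]},
    }

print(json.dumps(service_manifest("app", "app.example.com"), indent=2))
```

Redirect the output to a file and apply it, or feed the dict to whatever templating the cluster tooling already uses.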
Cloud DNS Provider Selection
Provider Characteristics
AWS Route53
- Best for AWS-heavy infrastructure
- Advanced routing policies (weighted, latency, geolocation, failover)
- Health checks with automatic failover
- ALIAS records for AWS resources (ELB, CloudFront, S3)
- Pricing: $0.50/month per zone + $0.40 per million queries
Google Cloud DNS
- Best for GCP-native applications
- Strong DNSSEC support with automatic key rotation
- Private zones for VPC internal DNS
- Split-horizon DNS (different internal/external records)
- Pricing: $0.20/month per zone + $0.40 per million queries
Azure DNS
- Best for Azure-native applications
- Integration with Azure Traffic Manager
- Azure Private DNS zones
- Azure RBAC for access control
- Pricing: $0.50/month per zone + $0.40 per million queries
Cloudflare
- Best for multi-cloud or cloud-agnostic
- Fastest DNS query times globally
- Built-in DDoS protection
- Free tier with unlimited queries
- CDN integration
- Pricing: Free tier, $20/month Pro, $200/month Business
Selection Decision Tree
    Choose based on:
    ├─ AWS-heavy? → Route53
    ├─ GCP-native? → Cloud DNS
    ├─ Azure-native? → Azure DNS
    ├─ Multi-cloud? → Cloudflare or OctoDNS/DNSControl
    ├─ Need fastest global DNS? → Cloudflare
    ├─ Need DDoS protection? → Cloudflare
    └─ Budget-conscious? → Cloudflare (free tier) or Cloud DNS (lowest zone cost)

For detailed provider comparisons and examples, see references/cloud-providers.md.
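For the budget branch, a rough monthly estimate can be computed from the per-zone and per-query prices listed above. A sketch only: it uses the list prices quoted in this document, ignores volume tiers and any free allowances, and leaves out Cloudflare because its pricing is plan-based rather than usage-based. `PRICING` and `monthly_cost` are illustrative names:

```python
# (USD per zone per month, USD per million queries), from the provider lists above.
PRICING = {
    "route53": (0.50, 0.40),
    "cloud_dns": (0.20, 0.40),
    "azure_dns": (0.50, 0.40),
}

def monthly_cost(provider, zones, queries):
    """Estimate monthly cost in USD for a zone count and monthly query volume."""
    zone_fee, per_million = PRICING[provider]
    return zones * zone_fee + (queries / 1_000_000) * per_million

# Example: 10 zones serving 50 million queries per month.
for provider in PRICING:
    print(f"{provider}: ${monthly_cost(provider, 10, 50_000_000):.2f}")
```

At these prices the query charge is identical across the three providers, so the difference comes entirely from the per-zone fee.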
DNS-Based Load Balancing
GeoDNS (Geographic Routing)
Return different IP addresses based on client location to:
- Reduce latency (route to nearest data center)
- Comply with data residency requirements
- Distribute load across regions
Example Pattern:
    Client Location → DNS Response
    ├─ North America → 192.0.2.1 (US data center)
    ├─ Europe → 192.0.2.10 (EU data center)
    └─ Default → CloudFront edge (global CDN)
Weighted Routing
Distribute traffic by percentage for:
- Blue-green deployments
- Canary releases (10% to new version)
- A/B testing
Example Pattern:
    DNS Responses:
    ├─ 90% → 192.0.2.1 (stable version)
    └─ 10% → 192.0.2.2 (canary version)
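The effect of a 90/10 weighting can be previewed with a quick simulation. This is a model of the behavior, not any provider's implementation: it assumes each DNS response is drawn independently in proportion to the weights, and in practice resolver caching makes the real traffic split coarser. The names below are illustrative:

```python
import random
from collections import Counter

def pick_endpoint(weighted_endpoints, rng=random):
    """Choose one IP with probability proportional to its weight."""
    ips = [ip for ip, _ in weighted_endpoints]
    weights = [weight for _, weight in weighted_endpoints]
    return rng.choices(ips, weights=weights, k=1)[0]

ENDPOINTS = [("192.0.2.1", 90), ("192.0.2.2", 10)]  # stable, canary

# Seeded generator so repeated runs give the same counts.
rng = random.Random(42)
counts = Counter(pick_endpoint(ENDPOINTS, rng) for _ in range(10_000))

for ip, n in counts.most_common():
    print(f"{ip}: {n / 10_000:.1%}")
```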
Health Check-Based Failover
Automatically route traffic away from unhealthy endpoints.
Pattern:
    Primary: 192.0.2.1 (health checked every 30s)
    ├─ Healthy → Return primary IP
    └─ Unhealthy → Return secondary IP (192.0.2.2)

    Failover time: ~2-3 minutes = Health check failures (90s) + TTL expiration (60s)

For complete load balancing examples, see examples/load-balancing/.
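The failover estimate above is the detection time (check interval times the number of failed checks) plus the record TTL. A sketch of that arithmetic, with hypothetical parameter names that mirror the health check settings used in this document:

```python
def failover_seconds(request_interval, failure_threshold, ttl):
    """Worst-case failover: time to detect the failure plus cached-record expiry."""
    detection = request_interval * failure_threshold
    return detection + ttl

# 30s checks, 3 consecutive failures, 60s TTL (the values used in this section).
total = failover_seconds(request_interval=30, failure_threshold=3, ttl=60)
print(total, "seconds =", total / 60, "minutes")
```

Shortening either the check interval or the TTL reduces recovery time, at the cost of more health check traffic or more DNS queries.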
Troubleshooting
Essential Commands
Check DNS Resolution:
    # Basic query
    dig example.com

    # Clean output (just IP)
    dig example.com +short

    # Query specific DNS server
    dig @8.8.8.8 example.com
    dig @1.1.1.1 example.com

    # Trace resolution path
    dig +trace example.com
Check TTL:
    dig example.com | grep -A1 "ANSWER SECTION"
    # Look for TTL value (number before IN A)
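When the TTL needs to be read by a script rather than by eye, the answer line can be split into its fields. A minimal sketch, assuming the usual `name TTL class type data` layout of a line in dig's ANSWER SECTION; `parse_answer_line` is a hypothetical helper:

```python
def parse_answer_line(line):
    """Split one ANSWER SECTION line into its five fields."""
    # maxsplit=4 keeps record data containing spaces (e.g. TXT, MX) in one piece.
    name, ttl, rclass, rtype, rdata = line.split(None, 4)
    return {"name": name, "ttl": int(ttl), "class": rclass, "type": rtype, "data": rdata}

record = parse_answer_line("example.com.\t\t300\tIN\tA\t192.0.2.1")
print(record["ttl"])
```

Note that a recursive resolver reports the remaining cache lifetime, so the value counts down between queries; query the authoritative nameserver to see the configured TTL.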
Check Propagation:
    # Multiple resolvers
    dig @8.8.8.8 example.com +short          # Google
    dig @1.1.1.1 example.com +short          # Cloudflare
    dig @208.67.222.222 example.com +short   # OpenDNS
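The same comparison can be scripted so that disagreement between resolvers is flagged automatically. A sketch, assuming `dig` is installed and on the PATH; the function names are illustrative and this is not the repository's scripts/check-dns-propagation.sh:

```python
import subprocess

RESOLVERS = {"Google": "8.8.8.8", "Cloudflare": "1.1.1.1", "OpenDNS": "208.67.222.222"}

def query(resolver_ip, name):
    """Return the sorted answers that `dig +short` reports from one resolver."""
    result = subprocess.run(
        ["dig", f"@{resolver_ip}", name, "+short"],
        capture_output=True, text=True, timeout=10, check=False,
    )
    return sorted(result.stdout.split())

def is_consistent(answers_by_resolver):
    """True when every resolver returned the same non-empty answer set."""
    distinct = {tuple(answers) for answers in answers_by_resolver.values()}
    return len(distinct) == 1 and distinct != {()}
```

Typical use is `answers = {label: query(ip, "example.com") for label, ip in RESOLVERS.items()}` followed by `is_consistent(answers)`; a `False` result during a change usually means some resolvers still hold the old record in cache.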
Flush Local DNS Cache:
    # macOS
    sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder

    # Windows
    ipconfig /flushdns

    # Linux with systemd-resolved (older releases: sudo systemd-resolve --flush-caches)
    sudo resolvectl flush-caches
Common Problems
Slow Propagation:
- Check current TTL (old TTL must expire first)
- Lower TTL 24-48 hours before changes
- Use propagation checkers: whatsmydns.net, dnschecker.org
CNAME at Zone Apex:
- Error: Cannot use CNAME at @ (zone apex)
- Solution: Use an ALIAS record (Route53), CNAME flattening (Cloudflare), or an A record
external-dns Not Creating Records:
- Verify annotation spelling: external-dns.alpha.kubernetes.io/hostname
- Check domain filter matches: --domain-filter=example.com
- Review external-dns logs for errors
- Confirm provider credentials configured
For detailed troubleshooting, see references/troubleshooting.md.
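A frequent cause of missing records is a hostname that falls outside the configured domain filter. The check below is an approximation for quick triage, assuming plain suffix matching on label boundaries; it is not external-dns's own code, which filters at the zone level and supports additional options. `matches_domain_filter` is a hypothetical helper:

```python
def matches_domain_filter(hostname, domain_filters):
    """True if hostname equals a filter or is a subdomain of one."""
    host = hostname.rstrip(".").lower()
    for domain in domain_filters:
        suffix = domain.strip(".").lower()
        # Require a label boundary so "badexample.com" does not match "example.com".
        if host == suffix or host.endswith("." + suffix):
            return True
    return False

print(matches_domain_filter("api.example.com", ["example.com"]))
print(matches_domain_filter("api.example.org", ["example.com"]))
```

If the annotated hostname fails this check against the value passed to --domain-filter, fix the filter or the annotation before digging into logs.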
Common Patterns
Pattern 1: Kubernetes DNS Automation
    # Deploy external-dns (once per cluster)
    # The index is quoted so shells such as zsh do not treat [0] as a glob.
    # policy=sync also lets external-dns delete records it owns; omit it to only create/update.
    helm install external-dns external-dns/external-dns \
      --set provider=aws \
      --set 'domainFilters[0]=example.com' \
      --set policy=sync

    # Then annotate Services
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        external-dns.alpha.kubernetes.io/hostname: api.example.com
        external-dns.alpha.kubernetes.io/ttl: "300"
    spec:
      type: LoadBalancer
Pattern 2: Multi-Provider Sync with OctoDNS
    # octodns-config.yaml
    providers:
      config:
        class: octodns.provider.yaml.YamlProvider
        directory: ./config
      route53:
        class: octodns_route53.Route53Provider
      cloudflare:
        class: octodns_cloudflare.CloudflareProvider
    zones:
      example.com.:
        sources: [config]
        targets: [route53, cloudflare]
Pattern 3: DNS-Based Failover
    # Route53 with health checks
    resource "aws_route53_health_check" "primary" {
      fqdn              = "primary.example.com"
      port              = 443
      type              = "HTTPS"
      resource_path     = "/health"
      failure_threshold = 3
      request_interval  = 30
    }

    resource "aws_route53_record" "primary" {
      zone_id        = aws_route53_zone.main.zone_id
      name           = "api.example.com"
      type           = "A"
      ttl            = 60
      set_identifier = "primary"

      failover_routing_policy {
        type = "PRIMARY"
      }

      health_check_id = aws_route53_health_check.primary.id
      records         = ["192.0.2.1"]
    }

    resource "aws_route53_record" "secondary" {
      zone_id        = aws_route53_zone.main.zone_id
      name           = "api.example.com"
      type           = "A"
      ttl            = 60
      set_identifier = "secondary"

      failover_routing_policy {
        type = "SECONDARY"
      }

      records = ["192.0.2.2"]
    }
Integration with Other Skills
infrastructure-as-code:
- Manage DNS via Terraform/Pulumi alongside other resources
- Zone configuration in IaC repositories
kubernetes-operations:
- external-dns automates DNS for Kubernetes workloads
- Ingress controller integration for automatic DNS
load-balancing-patterns:
- DNS-based load balancing (GeoDNS, weighted routing)
- Health checks and failover configurations
security-hardening:
- DNSSEC for DNS integrity
- CAA records for certificate authority control
- DNS-based DDoS mitigation
secret-management:
- Store DNS provider API credentials in vaults
- Secure DDNS update mechanisms
Additional Resources
Reference Documentation:
- Detailed record type guide with examples: references/record-types.md
- TTL scenarios and propagation calculations: references/ttl-strategies.md
- Provider comparison and detailed features: references/cloud-providers.md
- Common problems and solutions: references/troubleshooting.md
- Tool comparison matrix: references/dns-as-code-comparison.md
Examples:
- Kubernetes DNS automation: examples/external-dns/
- Multi-provider sync with YAML: examples/octodns/
- Multi-provider with JavaScript DSL: examples/dnscontrol/
- Cloud provider configurations: examples/terraform/
- GeoDNS and failover patterns: examples/load-balancing/
Scripts:
- Verify propagation across resolvers: scripts/check-dns-propagation.sh
- Validate DNS configuration: scripts/validate-dns-config.py
- Export existing DNS records: scripts/export-dns-records.sh
- Calculate propagation time: scripts/calculate-ttl-propagation.py
Quick Reference
Record Types Cheat Sheet
| Record | Purpose | Example |
|---|---|---|
| A | IPv4 address | example.com → 192.0.2.1 |
| AAAA | IPv6 address | example.com → 2001:db8::1 |
| CNAME | Alias to domain | www → example.com |
| MX | Mail server | 10 mail.example.com |
| TXT | Text/verification | "v=spf1 include:_spf.google.com ~all" |
| SRV | Service location | 10 60 5060 sip.example.com |
| NS | Nameserver delegation | ns1.provider.com |
| CAA | CA authorization | 0 issue "letsencrypt.org" |
TTL Cheat Sheet
| Scenario | TTL | Why |
|---|---|---|
| Stable production | 3600s | Balance speed/load |
| Before change | 300s | Fast propagation |
| Failover | 60-300s | Fast recovery |
| NS records | 86400s | Very stable |
Provider Cheat Sheet
| Provider | Best For | Key Feature |
|---|---|---|
| Route53 | AWS | Advanced routing, health checks |
| Cloud DNS | GCP | DNSSEC, private zones |
| Azure DNS | Azure | Traffic Manager integration |
| Cloudflare | Multi-cloud | Fastest, DDoS protection, free tier |
Tool Cheat Sheet
| Tool | Use When |
|---|---|
| external-dns | Kubernetes DNS automation |
| OctoDNS | Multi-provider, Python shop |
| DNSControl | Multi-provider, JavaScript preference |
| Terraform | Managing DNS with other infrastructure |