Claude-skill-registry data-and-sql

Use when writing SQL queries, optimizing database performance, or analyzing data

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/data-and-sql" ~/.claude/skills/majiayu000-claude-skill-registry-data-and-sql && rm -rf "$T"
manifest: skills/data/data-and-sql/SKILL.md
source content

Data & SQL

Guidelines for SQL optimization, database performance, and data analysis.

When to Use

  • Writing or optimizing SQL queries
  • Database performance issues
  • Schema design decisions
  • Data analysis and insights
  • Index strategy planning

Query Optimization

Before Writing Queries

  1. Clarify the business objective
  2. Understand data volume and patterns
  3. State any assumptions clearly
  4. Consider cost and performance

Optimization Techniques

ProblemSolution
SubqueriesReplace with JOINs
Complex logicUse CTEs for readability
Self-joinsUse window functions
Row-by-rowBatch operations
SELECT *Specify columns
COUNT for existenceUse EXISTS

N+1 Query Problem

Before (N+1):

-- 1 query for users, then N queries for posts
SELECT * FROM users WHERE active = true;
-- Then for each: SELECT * FROM posts WHERE user_id = ?;

After (Single Query):

SELECT u.id, u.name,
  JSON_AGG(JSON_BUILD_OBJECT('id', p.id, 'title', p.title))
  FILTER (WHERE p.id IS NOT NULL) AS posts
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
WHERE u.active = true
GROUP BY u.id, u.name;

Index Strategy

Index Types

TypeBest ForExample
B-TreeEquality, range, sortPrimary keys, dates
CompositeMulti-column WHERE
(status, created_at)
PartialFiltered queries
WHERE status = 'active'
CoveringIndex-only scansInclude all SELECT columns
GIN/GiSTFull-text, JSONText search

Index Checklist

  • Primary keys indexed
  • Foreign keys indexed
  • Frequent WHERE columns indexed
  • Composite indexes match query patterns
  • No unused indexes (check stats)
  • Index bloat monitored

Find Unused Indexes (PostgreSQL)

SELECT indexname, idx_scan, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

Performance Diagnosis

Slow Query Analysis

-- PostgreSQL: Find slow queries
SELECT query, calls, mean_exec_time, max_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

Common Issues

SymptomCauseFix
Seq scan on large tableMissing indexAdd appropriate index
Index not usedStale statsANALYZE table
Lock contentionLong transactionsReduce scope
Table bloatDead tuplesVACUUM or pg_repack

EXPLAIN ANALYZE

Always analyze execution plans:

EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 123
ORDER BY created_at DESC
LIMIT 10;

Schema Design

Normalization vs Denormalization

Normalize when:

  • Data integrity is critical
  • Write-heavy workload
  • Storage cost matters

Denormalize when:

  • Read:write ratio > 10:1
  • Query performance critical
  • Joins are bottleneck

Schema Checklist

  • Primary keys defined
  • Foreign keys with proper constraints
  • Appropriate data types (not oversized)
  • Indexes for query patterns
  • Timestamps for auditing
  • Migration scripts reversible

Caching Strategies

When to Cache

  • Expensive queries
  • Frequently accessed data
  • Semi-static data (tolerate staleness)

Cache Layers

LayerToolUse Case
ApplicationRedis, MemcachedQuery results
DatabaseQuery cacheIdentical queries
Materialized ViewPostgreSQLPre-computed aggregates

Materialized View Example

CREATE MATERIALIZED VIEW user_stats AS
SELECT user_id, COUNT(*) as post_count, MAX(created_at) as last_post
FROM posts
GROUP BY user_id;

CREATE INDEX ON user_stats(user_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;

Data Analysis Output

Findings Format

## Summary
Key insight in 1-2 sentences.

## Key Metrics
| Metric | Value | Trend |
|--------|-------|-------|

## Insights
- Finding 1 with supporting data
- Finding 2 with supporting data

## Recommendations
1. Action item with expected impact
2. Suggested follow-up analysis

Migration Best Practices

Safe Migration Template

-- Migration: Add index for performance
-- Date: 2025-12-23
-- Ticket: DB-456

-- Forward (non-blocking)
CREATE INDEX CONCURRENTLY idx_posts_user_created
ON posts(user_id, created_at DESC);

-- Rollback
DROP INDEX CONCURRENTLY idx_posts_user_created;

-- Validation
EXPLAIN ANALYZE SELECT * FROM posts WHERE user_id = 1 ORDER BY created_at DESC;

Migration Checklist

  • Tested on production-like data
  • Rollback script ready
  • Zero-downtime (CONCURRENTLY for indexes)
  • Performance impact measured
  • Monitoring in place

Decision Priority

When optimizing:

  1. Impact - Measured improvement
  2. Safety - Reversible, tested
  3. Maintainability - Understandable
  4. Scalability - Works at 10x
  5. Cost - Resource implications