Gsd-skill-creator sql-patterns
Provides SQL best practices for query optimization, schema design, migrations, transactions, and indexing. Use when writing database queries, designing schemas, creating migrations, or when user mentions 'SQL', 'database', 'query', 'schema', 'migration', 'index', 'transaction'.
git clone https://github.com/Tibsfox/gsd-skill-creator
T=$(mktemp -d) && git clone --depth=1 https://github.com/Tibsfox/gsd-skill-creator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/examples/skills/patterns/sql-patterns" ~/.claude/skills/tibsfox-gsd-skill-creator-sql-patterns && rm -rf "$T"
examples/skills/patterns/sql-patterns/SKILL.mdSQL Patterns
Best practices for writing safe, performant, and maintainable SQL. All examples use parameterized queries to prevent SQL injection.
Security-First Principles
Every query MUST use parameterized inputs. Never concatenate user input into SQL strings.
-- DANGEROUS: SQL injection vulnerability SELECT * FROM users WHERE email = '" + email + "'; -- SAFE: Parameterized query SELECT id, name, email FROM users WHERE email = $1;
Destructive Operation Safety
Never run destructive operations without a WHERE clause and explicit confirmation.
| Operation | Risk Level | Required Safeguards |
|---|---|---|
| CRITICAL | Backup first, use , require approval |
| CRITICAL | Backup first, confirm table name, check foreign keys |
without | CRITICAL | Always add WHERE clause, run SELECT first to preview |
without | CRITICAL | Always add WHERE clause, run SELECT first to preview |
| HIGH | Backup first, check dependencies, use migration |
| MEDIUM | Verify query plans won't degrade |
Always preview before modifying:
-- Step 1: Preview what will be affected SELECT id, email, status FROM users WHERE status = 'inactive'; -- Step 2: Verify the count SELECT COUNT(*) FROM users WHERE status = 'inactive'; -- Step 3: Only then modify (parameterized) DELETE FROM users WHERE status = $1; -- $1 = 'inactive'
Query Optimization
Using EXPLAIN
Always analyze query plans before deploying queries that touch large tables.
-- Basic explain EXPLAIN SELECT id, name FROM users WHERE email = $1; -- With execution stats (PostgreSQL) EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT u.id, u.name, o.total FROM users u JOIN orders o ON o.user_id = u.id WHERE u.created_at > $1;
Reading EXPLAIN Output
| Term | Meaning | Action |
|---|---|---|
| Seq Scan | Full table scan | Add an index on filter columns |
| Index Scan | Using an index | Good -- verify it's the right index |
| Index Only Scan | Answered from index alone | Best case for read queries |
| Nested Loop | Row-by-row join | Fine for small result sets, bad for large |
| Hash Join | Hash-based join | Good for medium-large joins |
| Sort | In-memory or disk sort | Add index if used with ORDER BY/DISTINCT |
| Bitmap Heap Scan | Index + table lookup | Normal for multi-condition queries |
Selective Column Queries
-- BAD: SELECT * fetches unnecessary data, breaks if schema changes SELECT * FROM users WHERE status = $1; -- GOOD: Explicit columns, only what you need SELECT id, name, email, status FROM users WHERE status = $1;
Pagination
-- BAD: OFFSET-based pagination degrades with large offsets SELECT id, name FROM products ORDER BY id LIMIT 20 OFFSET 10000; -- GOOD: Cursor-based pagination using the last seen ID SELECT id, name FROM products WHERE id > $1 ORDER BY id LIMIT 20;
Indexing Guide
Index Types
| Type | Use Case | Example |
|---|---|---|
| B-tree (default) | Equality, range, sorting, LIKE 'prefix%' | |
| Hash | Equality only (rare, B-tree usually better) | |
| GIN | Full-text search, JSONB, arrays | |
| GiST | Geometric, range types, full-text | |
| Partial | Subset of rows matching a condition | |
Composite Indexes
Column order matters. Place equality columns first, then range columns.
-- Query pattern: WHERE status = $1 AND created_at > $2 -- Index matches the query pattern: equality first, range second CREATE INDEX idx_status_created ON orders (status, created_at); -- This index will NOT help the query above efficiently: CREATE INDEX idx_created_status ON orders (created_at, status);
When to Add Indexes
| Signal | Action |
|---|---|
| Seq Scan on large table in EXPLAIN | Add index on filter/join columns |
| Slow ORDER BY | Add index on sort columns |
| Slow JOIN | Add index on foreign key columns |
| Frequent WHERE on same columns | Add composite index |
| Low selectivity column (e.g., boolean) | Partial index instead of full |
When NOT to Index
- Tables with fewer than ~1,000 rows (seq scan is faster)
- Columns that are rarely queried
- Columns with very low cardinality on large tables (unless partial index)
- Write-heavy tables where index maintenance cost exceeds read benefit
Schema Design
Normalization Quick Reference
| Normal Form | Rule | Example Violation |
|---|---|---|
| 1NF | Atomic values, no repeating groups | in one column |
| 2NF | No partial dependencies on composite key | Non-key column depends on part of key |
| 3NF | No transitive dependencies | determines in |
When to Denormalize
Denormalize only when you have measured performance problems and understand the tradeoff.
| Scenario | Denormalization Strategy | Tradeoff |
|---|---|---|
| Frequent join of 3+ tables | Materialized view | Stale data, refresh cost |
| Read-heavy reporting | Summary table | Write complexity, eventual consistency |
| Display name with foreign key | Cache column + trigger | Must keep in sync |
| Nested JSON responses | JSONB column | Harder to query, larger rows |
Foreign Key Design
-- Always define foreign keys with explicit actions CREATE TABLE orders ( id SERIAL PRIMARY KEY, user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE RESTRICT, product_id INTEGER NOT NULL REFERENCES products(id) ON DELETE RESTRICT, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); -- Index foreign keys (PostgreSQL does NOT auto-index them) CREATE INDEX idx_orders_user_id ON orders (user_id); CREATE INDEX idx_orders_product_id ON orders (product_id);
| ON DELETE | Behavior | Use When |
|---|---|---|
| RESTRICT | Block delete if referenced | Parent record must exist (orders -> users) |
| CASCADE | Delete children too | Children are meaningless without parent (comments -> post) |
| SET NULL | Set FK to NULL | Relationship is optional (assigned_to -> users) |
| NO ACTION | Like RESTRICT but deferred | Checking at transaction commit |
Migration Best Practices
Reversible Migrations
Every migration should have a rollback strategy.
-- Migration: up ALTER TABLE users ADD COLUMN phone VARCHAR(20); -- Migration: down (rollback) ALTER TABLE users DROP COLUMN phone;
Zero-Downtime Migration Patterns
Adding a column (safe):
-- Step 1: Add nullable column (no lock, no rewrite) ALTER TABLE users ADD COLUMN phone VARCHAR(20); -- Step 2: Backfill in batches (avoid locking entire table) UPDATE users SET phone = $1 WHERE id BETWEEN $2 AND $3; -- Step 3: Add NOT NULL constraint after backfill (if needed) ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
Renaming a column (requires coordination):
-- Step 1: Add new column ALTER TABLE users ADD COLUMN full_name VARCHAR(255); -- Step 2: Backfill (batched) UPDATE users SET full_name = name WHERE id BETWEEN $1 AND $2; -- Step 3: Application reads from both, writes to both -- Step 4: Switch application to new column only -- Step 5: Drop old column in later migration ALTER TABLE users DROP COLUMN name;
Migration Checklist
- Migration has a rollback script
- Large table changes use batched updates
- No
on large tables (pre-PG 11)ALTER TABLE ... ADD COLUMN ... DEFAULT - Index creation uses
on production tablesCONCURRENTLY - Foreign key columns are indexed
- Migration tested on a copy of production data
- No data loss on rollback
-- SAFE: Concurrent index creation (doesn't lock writes) CREATE INDEX CONCURRENTLY idx_users_email ON users (email); -- UNSAFE: Locks the table during creation CREATE INDEX idx_users_email ON users (email);
Transaction Patterns
Isolation Levels
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Use Case |
|---|---|---|---|---|
| READ UNCOMMITTED | Possible | Possible | Possible | Almost never (analytics only) |
| READ COMMITTED | No | Possible | Possible | Default for most apps |
| REPEATABLE READ | No | No | Possible (not in PG) | Financial calculations |
| SERIALIZABLE | No | No | No | Critical consistency (transfers) |
-- Set isolation level for a transaction BEGIN ISOLATION LEVEL SERIALIZABLE; UPDATE accounts SET balance = balance - $1 WHERE id = $2; UPDATE accounts SET balance = balance + $1 WHERE id = $3; COMMIT;
Avoiding Deadlocks
Deadlocks occur when two transactions wait for each other's locks.
-- BAD: Transaction A locks row 1 then 2; Transaction B locks row 2 then 1 -- Transaction A: UPDATE accounts SET balance = balance - 100 WHERE id = 1; UPDATE accounts SET balance = balance + 100 WHERE id = 2; -- Transaction B (concurrent): UPDATE accounts SET balance = balance - 50 WHERE id = 2; UPDATE accounts SET balance = balance + 50 WHERE id = 1; -- DEADLOCK! -- GOOD: Always lock rows in consistent order (by ID) -- Both transactions lock id=1 first, then id=2 BEGIN; SELECT id FROM accounts WHERE id IN (1, 2) ORDER BY id FOR UPDATE; UPDATE accounts SET balance = balance - $1 WHERE id = 1; UPDATE accounts SET balance = balance + $1 WHERE id = 2; COMMIT;
Advisory Locks
Use for application-level mutual exclusion without table locks.
-- Acquire an advisory lock (blocks until available) SELECT pg_advisory_lock($1); -- Do exclusive work... -- Release the lock SELECT pg_advisory_unlock($1); -- Non-blocking variant (returns true/false) SELECT pg_try_advisory_lock($1);
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Fetches unneeded data, breaks on schema change | List explicit columns |
| N+1 queries | Loop fires one query per row | Use JOIN or batch IN query |
| String concatenation in SQL | SQL injection vulnerability | Use parameterized queries ($1, ?) |
| Missing indexes on foreign keys | Slow JOINs and cascading deletes | Add index on every FK column |
(no WHERE) | Deletes ALL rows | Always include WHERE clause |
| Cannot use B-tree index, full scan | Use full-text search (GIN index) |
| Storing CSV in one column | Violates 1NF, impossible to query | Use a junction/join table |
| Scans and sorts entire table | Use or app-level random |
| UUID v4 as primary key | Fragmented B-tree inserts | Use UUID v7 (time-ordered) or SERIAL |
| N+1 in application code | | |
for existence check | Counts all rows | |
Parameterized Query Examples
Node.js (pg)
// SAFE: Parameterized query const result = await pool.query( 'SELECT id, name, email FROM users WHERE email = $1 AND active = $2', [email, true] ); // SAFE: INSERT with RETURNING const { rows } = await pool.query( 'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id', [name, email] );
Python (psycopg2)
# SAFE: Parameterized query cursor.execute( "SELECT id, name, email FROM users WHERE email = %s AND active = %s", (email, True) ) # SAFE: Batch insert from psycopg2.extras import execute_values execute_values( cursor, "INSERT INTO users (name, email) VALUES %s", [(name, email) for name, email in user_data] )
Go (database/sql)
// SAFE: Parameterized query row := db.QueryRowContext(ctx, "SELECT id, name, email FROM users WHERE email = $1", email, )
Quick Reference: Safe Defaults
| Concern | Default Choice |
|---|---|
| Primary key | or (UUID v7 if distributed) |
| Timestamps | (always with timezone) |
| Money/currency | (never FLOAT) |
| String storage | with reasonable limit or |
| Boolean | (not integer flags) |
| Foreign keys | Always define with behavior |
| Indexes | B-tree default, GIN for JSONB/arrays/full-text |
| Transactions | READ COMMITTED default, SERIALIZABLE for money |
| Migrations | Always reversible, always batched for large tables |