Gsd-skill-creator sql-patterns

Provides SQL best practices for query optimization, schema design, migrations, transactions, and indexing. Use when writing database queries, designing schemas, creating migrations, or when user mentions 'SQL', 'database', 'query', 'schema', 'migration', 'index', 'transaction'.

install

source · Clone the upstream repo

git clone https://github.com/Tibsfox/gsd-skill-creator

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/Tibsfox/gsd-skill-creator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/examples/skills/patterns/sql-patterns" ~/.claude/skills/tibsfox-gsd-skill-creator-sql-patterns && rm -rf "$T"

manifest: examples/skills/patterns/sql-patterns/SKILL.md

source content

SQL Patterns

Best practices for writing safe, performant, and maintainable SQL. All examples use parameterized queries to prevent SQL injection.

Security-First Principles

Every query MUST use parameterized inputs. Never concatenate user input into SQL strings.

-- DANGEROUS: SQL injection vulnerability
SELECT * FROM users WHERE email = '" + email + "';

-- SAFE: Parameterized query
SELECT id, name, email FROM users WHERE email = $1;

Destructive Operation Safety

Never run destructive operations without a WHERE clause and explicit confirmation.

Operation	Risk Level	Required Safeguards
`DROP TABLE`	CRITICAL	Backup first, use `IF EXISTS` , require approval
`TRUNCATE TABLE`	CRITICAL	Backup first, confirm table name, check foreign keys
`DELETE` without `WHERE`	CRITICAL	Always add WHERE clause, run SELECT first to preview
`UPDATE` without `WHERE`	CRITICAL	Always add WHERE clause, run SELECT first to preview
`ALTER TABLE DROP COLUMN`	HIGH	Backup first, check dependencies, use migration
`DROP INDEX`	MEDIUM	Verify query plans won't degrade

Always preview before modifying:

-- Step 1: Preview what will be affected
SELECT id, email, status FROM users WHERE status = 'inactive';

-- Step 2: Verify the count
SELECT COUNT(*) FROM users WHERE status = 'inactive';

-- Step 3: Only then modify (parameterized)
DELETE FROM users WHERE status = $1;  -- $1 = 'inactive'

Query Optimization

Using EXPLAIN

Always analyze query plans before deploying queries that touch large tables.

-- Basic explain
EXPLAIN SELECT id, name FROM users WHERE email = $1;

-- With execution stats (PostgreSQL)
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.created_at > $1;

Reading EXPLAIN Output

Term	Meaning	Action
Seq Scan	Full table scan	Add an index on filter columns
Index Scan	Using an index	Good -- verify it's the right index
Index Only Scan	Answered from index alone	Best case for read queries
Nested Loop	Row-by-row join	Fine for small result sets, bad for large
Hash Join	Hash-based join	Good for medium-large joins
Sort	In-memory or disk sort	Add index if used with ORDER BY/DISTINCT
Bitmap Heap Scan	Index + table lookup	Normal for multi-condition queries

Selective Column Queries

-- BAD: SELECT * fetches unnecessary data, breaks if schema changes
SELECT * FROM users WHERE status = $1;

-- GOOD: Explicit columns, only what you need
SELECT id, name, email, status FROM users WHERE status = $1;

Pagination

-- BAD: OFFSET-based pagination degrades with large offsets
SELECT id, name FROM products ORDER BY id LIMIT 20 OFFSET 10000;

-- GOOD: Cursor-based pagination using the last seen ID
SELECT id, name FROM products
WHERE id > $1
ORDER BY id
LIMIT 20;

Indexing Guide

Index Types

Type	Use Case	Example
B-tree (default)	Equality, range, sorting, LIKE 'prefix%'	`CREATE INDEX idx_email ON users (email)`
Hash	Equality only (rare, B-tree usually better)	`CREATE INDEX idx_code ON codes USING hash (code)`
GIN	Full-text search, JSONB, arrays	`CREATE INDEX idx_tags ON posts USING gin (tags)`
GiST	Geometric, range types, full-text	`CREATE INDEX idx_location ON places USING gist (coords)`
Partial	Subset of rows matching a condition	`CREATE INDEX idx_active ON users (email) WHERE active = true`

Composite Indexes

Column order matters. Place equality columns first, then range columns.

-- Query pattern: WHERE status = $1 AND created_at > $2
-- Index matches the query pattern: equality first, range second
CREATE INDEX idx_status_created ON orders (status, created_at);

-- This index will NOT help the query above efficiently:
CREATE INDEX idx_created_status ON orders (created_at, status);

When to Add Indexes

Signal	Action
Seq Scan on large table in EXPLAIN	Add index on filter/join columns
Slow ORDER BY	Add index on sort columns
Slow JOIN	Add index on foreign key columns
Frequent WHERE on same columns	Add composite index
Low selectivity column (e.g., boolean)	Partial index instead of full

When NOT to Index

Tables with fewer than ~1,000 rows (seq scan is faster)
Columns that are rarely queried
Columns with very low cardinality on large tables (unless partial index)
Write-heavy tables where index maintenance cost exceeds read benefit

Schema Design

Normalization Quick Reference

Normal Form Rule Example Violation

1NF

Atomic values, no repeating groups

tags: "a,b,c"

in one column

2NF No partial dependencies on composite key Non-key column depends on part of key

3NF

No transitive dependencies

zip_code

determines

city

users

When to Denormalize

Denormalize only when you have measured performance problems and understand the tradeoff.

Scenario	Denormalization Strategy	Tradeoff
Frequent join of 3+ tables	Materialized view	Stale data, refresh cost
Read-heavy reporting	Summary table	Write complexity, eventual consistency
Display name with foreign key	Cache column + trigger	Must keep in sync
Nested JSON responses	JSONB column	Harder to query, larger rows

Foreign Key Design

-- Always define foreign keys with explicit actions
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
    product_id INTEGER NOT NULL REFERENCES products(id) ON DELETE RESTRICT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Index foreign keys (PostgreSQL does NOT auto-index them)
CREATE INDEX idx_orders_user_id ON orders (user_id);
CREATE INDEX idx_orders_product_id ON orders (product_id);

ON DELETE	Behavior	Use When
RESTRICT	Block delete if referenced	Parent record must exist (orders -> users)
CASCADE	Delete children too	Children are meaningless without parent (comments -> post)
SET NULL	Set FK to NULL	Relationship is optional (assigned_to -> users)
NO ACTION	Like RESTRICT but deferred	Checking at transaction commit

Migration Best Practices

Reversible Migrations

Every migration should have a rollback strategy.

-- Migration: up
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Migration: down (rollback)
ALTER TABLE users DROP COLUMN phone;

Zero-Downtime Migration Patterns

Adding a column (safe):

-- Step 1: Add nullable column (no lock, no rewrite)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Step 2: Backfill in batches (avoid locking entire table)
UPDATE users SET phone = $1 WHERE id BETWEEN $2 AND $3;

-- Step 3: Add NOT NULL constraint after backfill (if needed)
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;

Renaming a column (requires coordination):

-- Step 1: Add new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);

-- Step 2: Backfill (batched)
UPDATE users SET full_name = name WHERE id BETWEEN $1 AND $2;

-- Step 3: Application reads from both, writes to both
-- Step 4: Switch application to new column only
-- Step 5: Drop old column in later migration
ALTER TABLE users DROP COLUMN name;

Migration Checklist

Migration has a rollback script
Large table changes use batched updates
No
```
ALTER TABLE ... ADD COLUMN ... DEFAULT
```
on large tables (pre-PG 11)
Index creation uses
```
CONCURRENTLY
```
on production tables
Foreign key columns are indexed
Migration tested on a copy of production data
No data loss on rollback

-- SAFE: Concurrent index creation (doesn't lock writes)
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);

-- UNSAFE: Locks the table during creation
CREATE INDEX idx_users_email ON users (email);

Transaction Patterns

Isolation Levels

Level	Dirty Read	Non-Repeatable Read	Phantom Read	Use Case
READ UNCOMMITTED	Possible	Possible	Possible	Almost never (analytics only)
READ COMMITTED	No	Possible	Possible	Default for most apps
REPEATABLE READ	No	No	Possible (not in PG)	Financial calculations
SERIALIZABLE	No	No	No	Critical consistency (transfers)

-- Set isolation level for a transaction
BEGIN ISOLATION LEVEL SERIALIZABLE;

UPDATE accounts SET balance = balance - $1 WHERE id = $2;
UPDATE accounts SET balance = balance + $1 WHERE id = $3;

COMMIT;

Avoiding Deadlocks

Deadlocks occur when two transactions wait for each other's locks.

-- BAD: Transaction A locks row 1 then 2; Transaction B locks row 2 then 1
-- Transaction A:
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Transaction B (concurrent):
UPDATE accounts SET balance = balance - 50 WHERE id = 2;
UPDATE accounts SET balance = balance + 50 WHERE id = 1;
-- DEADLOCK!

-- GOOD: Always lock rows in consistent order (by ID)
-- Both transactions lock id=1 first, then id=2
BEGIN;
SELECT id FROM accounts WHERE id IN (1, 2) ORDER BY id FOR UPDATE;
UPDATE accounts SET balance = balance - $1 WHERE id = 1;
UPDATE accounts SET balance = balance + $1 WHERE id = 2;
COMMIT;

Advisory Locks

Use for application-level mutual exclusion without table locks.

-- Acquire an advisory lock (blocks until available)
SELECT pg_advisory_lock($1);

-- Do exclusive work...

-- Release the lock
SELECT pg_advisory_unlock($1);

-- Non-blocking variant (returns true/false)
SELECT pg_try_advisory_lock($1);

Anti-Patterns

Anti-Pattern	Problem	Fix
`SELECT *`	Fetches unneeded data, breaks on schema change	List explicit columns
N+1 queries	Loop fires one query per row	Use JOIN or batch IN query
String concatenation in SQL	SQL injection vulnerability	Use parameterized queries ($1, ?)
Missing indexes on foreign keys	Slow JOINs and cascading deletes	Add index on every FK column
`DELETE FROM table` (no WHERE)	Deletes ALL rows	Always include WHERE clause
`LIKE '%term%'`	Cannot use B-tree index, full scan	Use full-text search (GIN index)
Storing CSV in one column	Violates 1NF, impossible to query	Use a junction/join table
`ORDER BY RANDOM()`	Scans and sorts entire table	Use `TABLESAMPLE` or app-level random
UUID v4 as primary key	Fragmented B-tree inserts	Use UUID v7 (time-ordered) or SERIAL
N+1 in application code	`for user in users: query(user.id)`	`WHERE id IN ($1, $2, ...)`
`COUNT(*)` for existence check	Counts all rows	`SELECT EXISTS (SELECT 1 FROM ... LIMIT 1)`

Parameterized Query Examples

Node.js (pg)

// SAFE: Parameterized query
const result = await pool.query(
  'SELECT id, name, email FROM users WHERE email = $1 AND active = $2',
  [email, true]
);

// SAFE: INSERT with RETURNING
const { rows } = await pool.query(
  'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id',
  [name, email]
);

Python (psycopg2)

# SAFE: Parameterized query
cursor.execute(
    "SELECT id, name, email FROM users WHERE email = %s AND active = %s",
    (email, True)
)

# SAFE: Batch insert
from psycopg2.extras import execute_values
execute_values(
    cursor,
    "INSERT INTO users (name, email) VALUES %s",
    [(name, email) for name, email in user_data]
)

Go (database/sql)

// SAFE: Parameterized query
row := db.QueryRowContext(ctx,
    "SELECT id, name, email FROM users WHERE email = $1",
    email,
)

Quick Reference: Safe Defaults

Concern	Default Choice
Primary key	`SERIAL` or `BIGSERIAL` (UUID v7 if distributed)
Timestamps	`TIMESTAMPTZ` (always with timezone)
Money/currency	`NUMERIC(19,4)` (never FLOAT)
String storage	`VARCHAR(n)` with reasonable limit or `TEXT`
Boolean	`BOOLEAN` (not integer flags)
Foreign keys	Always define with `ON DELETE` behavior
Indexes	B-tree default, GIN for JSONB/arrays/full-text
Transactions	READ COMMITTED default, SERIALIZABLE for money
Migrations	Always reversible, always batched for large tables