git clone https://github.com/ComeOnOliver/skillshub
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/TerminalSkills/skills/sql-optimizer" ~/.claude/skills/comeonoliver-skillshub-sql-optimizer && rm -rf "$T"
skills/TerminalSkills/skills/sql-optimizer/SKILL.mdSQL Optimizer
Overview
Analyze SQL queries for performance problems and produce optimized versions with appropriate indexes. Covers query rewriting, index recommendations, execution plan analysis, and common anti-patterns.
Instructions
When a user asks you to optimize a SQL query or fix slow database performance, follow these steps:
Step 1: Get the query and context
Determine:
- The full SQL query to optimize
- Database engine (PostgreSQL, MySQL, SQLite)
- Table sizes (approximate row counts)
- Existing indexes
- Current execution time
If you have access to the database, gather this yourself:
-- PostgreSQL: Check table sizes SELECT relname AS table_name, reltuples::bigint AS row_count FROM pg_class WHERE relkind = 'r' AND relnamespace = ( SELECT oid FROM pg_namespace WHERE nspname = 'public' ) ORDER BY reltuples DESC; -- PostgreSQL: Check existing indexes SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'target_table';
Step 2: Analyze the execution plan
-- PostgreSQL EXPLAIN ANALYZE SELECT ...your query here...; -- MySQL EXPLAIN FORMAT=JSON SELECT ...your query here...;
Look for these red flags in the plan:
- Seq Scan on large tables (missing index)
- Nested Loop with large row counts (N+1 pattern)
- Sort with high cost (missing index for ORDER BY)
- Hash Join when a smaller join would suffice
- Rows removed by filter much larger than rows returned (bad selectivity)
Step 3: Apply optimizations
Anti-pattern: SELECT * when you need specific columns
-- Before (fetches all columns, prevents covering index use) SELECT * FROM orders WHERE status = 'pending'; -- After (fetch only needed columns) SELECT id, customer_id, total, created_at FROM orders WHERE status = 'pending';
Anti-pattern: Missing index on WHERE/JOIN columns
-- If this query is slow: SELECT * FROM orders WHERE customer_id = 123 AND status = 'pending'; -- Add a composite index: CREATE INDEX idx_orders_customer_status ON orders (customer_id, status);
Anti-pattern: N+1 queries in application code
-- Before: 1 query + N queries SELECT id FROM orders WHERE date > '2024-01-01'; -- Then for each order: SELECT * FROM order_items WHERE order_id = ?; -- After: Single query with JOIN SELECT o.id, o.total, oi.product_name, oi.quantity, oi.price FROM orders o JOIN order_items oi ON o.id = oi.order_id WHERE o.date > '2024-01-01';
Anti-pattern: Functions on indexed columns
-- Before (cannot use index on created_at) SELECT * FROM users WHERE YEAR(created_at) = 2024; -- After (uses index on created_at) SELECT * FROM users WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
Anti-pattern: Correlated subquery that can be a JOIN
-- Before (executes subquery for every row) SELECT name, ( SELECT COUNT(*) FROM orders WHERE orders.customer_id = customers.id ) AS order_count FROM customers; -- After (single pass with JOIN) SELECT c.name, COUNT(o.id) AS order_count FROM customers c LEFT JOIN orders o ON c.id = o.customer_id GROUP BY c.id, c.name;
Anti-pattern: OFFSET for deep pagination
-- Before (scans and discards 10000 rows) SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 10000; -- After (keyset pagination, uses index) SELECT * FROM products WHERE id > 10000 ORDER BY id LIMIT 20;
Step 4: Recommend indexes
Follow these rules for index design:
- Index columns used in WHERE, JOIN ON, and ORDER BY clauses
- Put equality conditions first, then range conditions in composite indexes
- Consider covering indexes that include SELECT columns to avoid table lookups
- Do not create indexes on low-cardinality columns (e.g., boolean flags) alone
- Remove unused or duplicate indexes
-- Composite index for: WHERE status = 'active' AND created_at > '2024-01-01' ORDER BY name CREATE INDEX idx_users_status_created_name ON users (status, created_at, name);
Step 5: Report the results
Present:
- The original query with identified problems
- The optimized query
- Index recommendations with CREATE INDEX statements
- Expected improvement (estimated from plan changes)
Examples
Example 1: Optimize a slow reporting query
User request: "This query takes 45 seconds on 2M rows, can you speed it up?"
Original query:
SELECT c.name, c.email, COUNT(*) AS order_count, SUM(o.total) AS total_spent FROM customers c, orders o WHERE c.id = o.customer_id AND YEAR(o.created_at) = 2024 AND o.status IN ('completed', 'shipped') GROUP BY c.name, c.email HAVING SUM(o.total) > 500 ORDER BY total_spent DESC;
Problems identified:
- Implicit join syntax (old-style comma join)
function prevents index use onYEAR()created_at- No index on
(customer_id, status, created_at) - GROUP BY on non-indexed columns
Optimized query:
SELECT c.name, c.email, COUNT(*) AS order_count, SUM(o.total) AS total_spent FROM customers c JOIN orders o ON c.id = o.customer_id WHERE o.created_at >= '2024-01-01' AND o.created_at < '2025-01-01' AND o.status IN ('completed', 'shipped') GROUP BY c.id, c.name, c.email HAVING SUM(o.total) > 500 ORDER BY total_spent DESC;
Index recommendations:
CREATE INDEX idx_orders_customer_status_date ON orders (customer_id, status, created_at);
Expected result: ~45s down to ~0.5s with the index.
Example 2: Fix N+1 in an ORM
User request: "My Django view is slow, the debug toolbar shows 200+ queries"
Problem: ORM fetching related objects lazily in a loop.
Fix:
# Before: N+1 queries orders = Order.objects.filter(status="pending") for order in orders: print(order.customer.name) # Each access = 1 query # After: 2 queries total orders = Order.objects.filter(status="pending").select_related("customer") for order in orders: print(order.customer.name) # Already loaded
Guidelines
- Always check the execution plan before and after optimization. Do not guess at performance.
- Index recommendations must consider write overhead. Tables with heavy INSERT/UPDATE may not benefit from many indexes.
- Composite index column order matters. Put equality columns first, then range columns, then sort columns.
- For PostgreSQL, use
for the most useful plan output.EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) - Do not blindly add indexes to every column in a WHERE clause. Analyze query patterns first.
- When rewriting queries, verify that the optimized version returns the same results as the original.
- For MySQL, be aware of the optimizer's single-index-per-table limitation in older versions. Use composite indexes.
- If a query cannot be optimized further, suggest materialized views or caching as alternatives.
- Always present the original and optimized queries side by side for easy comparison.