# Awesome-claude-code-toolkit database-optimization

Query optimization, indexing strategies, and database performance tuning for PostgreSQL and MySQL.
## Install

Source · Clone the upstream repo:

```shell
git clone https://github.com/rohitg00/awesome-claude-code-toolkit
```

Claude Code · Install into `~/.claude/skills/`:

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/rohitg00/awesome-claude-code-toolkit "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/database-optimization" ~/.claude/skills/rohitg00-awesome-claude-code-toolkit-database-optimization && rm -rf "$T"
```

Manifest: `skills/database-optimization/SKILL.md`
## Database Optimization

### EXPLAIN Analysis

Always run `EXPLAIN ANALYZE` before optimizing. Read the output bottom-up.

```sql
-- PostgreSQL
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT ...;

-- MySQL
EXPLAIN ANALYZE SELECT ...;
```
Key metrics to watch:

- Seq Scan on large tables = missing index
- Nested Loop with high row count = consider hash/merge join
- Sort without index = add index on sort column
- Estimated vs. actual row counts diverging = stale statistics; run `ANALYZE`
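These checks can be automated. Below is a minimal sketch that scans PostgreSQL's default `EXPLAIN ANALYZE` text output for two of the red flags above (the function name and the 10x divergence threshold are assumptions, not part of any tool):

```python
import re

def plan_red_flags(plan_text, row_ratio=10.0):
    """Flag Seq Scans and large estimate-vs-actual row divergence
    in PostgreSQL EXPLAIN ANALYZE text output."""
    flags = []
    for line in plan_text.splitlines():
        if "Seq Scan" in line:
            flags.append("seq_scan: " + line.strip())
        # Planner estimate appears as rows=N; actuals follow "actual ... rows=N"
        m = re.search(r"rows=(\d+).*actual.*rows=(\d+)", line)
        if m:
            est, actual = int(m.group(1)), int(m.group(2))
            if max(est, actual) / max(min(est, actual), 1) > row_ratio:
                flags.append(f"bad_estimate: est={est} actual={actual}")
    return flags
```

Feed it the text of a captured plan; an empty result means none of these two patterns were found, not that the plan is optimal.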
### Index Strategies

#### B-tree (default, most cases)

```sql
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at DESC);
```

Use for: equality, range queries, sorting. Column order matters in composite indexes: put equality columns first, then range/sort columns.
#### Partial Index (PostgreSQL)

```sql
CREATE INDEX idx_orders_pending ON orders (created_at) WHERE status = 'pending';
```

Use when queries always filter on a specific condition. Dramatically smaller than full indexes.
#### GIN (PostgreSQL: arrays, JSONB, full-text)

```sql
CREATE INDEX idx_products_tags ON products USING GIN (tags);
CREATE INDEX idx_docs_search ON documents USING GIN (to_tsvector('english', content));
```
#### GiST (PostgreSQL: spatial, range types)

```sql
CREATE INDEX idx_locations_point ON locations USING GiST (coordinates);
CREATE INDEX idx_events_period ON events USING GiST (tsrange(start_at, end_at));
```
#### Covering Index (index-only scans)

```sql
-- PostgreSQL
CREATE INDEX idx_users_email_name ON users (email) INCLUDE (name);

-- MySQL
CREATE INDEX idx_users_email_name ON users (email, name);
```
### N+1 Query Detection

Symptom: 1 query to fetch the parent rows + N additional queries, one per child.

```python
from sqlalchemy.orm import joinedload

# BAD: N+1 — one query for users, then one per user for orders
users = db.query(User).all()
for user in users:
    print(user.orders)  # lazy load triggers a query per user

# GOOD: eager load orders in the same query
users = db.query(User).options(joinedload(User.orders)).all()
```

```javascript
// BAD: N+1
const users = await User.findAll();
for (const user of users) {
  const orders = await Order.findAll({ where: { userId: user.id } });
}

// GOOD: batch load via an eager include
const users = await User.findAll({ include: [Order] });
```
Detection: enable query logging, count queries per request. More than 10 queries for a single endpoint is a red flag.
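That counting rule can be sketched as a small helper wired into whatever query-logging hook your ORM or driver exposes (the `QueryCounter` name, `record` hook, and threshold default are illustrative assumptions):

```python
class QueryCounter:
    """Count queries issued during one request; flag likely N+1 loops."""

    def __init__(self, threshold=10):
        self.threshold = threshold
        self.queries = []

    def record(self, sql):
        # Call this from your ORM/driver's query-logging callback.
        self.queries.append(sql)

    @property
    def is_suspicious(self):
        return len(self.queries) > self.threshold

# Simulated request: one parent query plus a per-row child query (N+1)
counter = QueryCounter()
counter.record("SELECT * FROM users")
for user_id in range(20):
    counter.record(f"SELECT * FROM orders WHERE user_id = {user_id}")
```

In practice you would reset the counter per request (e.g. in middleware) and log `counter.queries` whenever `is_suspicious` trips, so the offending loop is easy to find.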
### Connection Pooling

Rule of thumb: `pool_size = (core_count * 2) + disk_count`. A typical web app needs 10-20 connections per app instance.
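The rule of thumb above in code form. This is a starting point to tune against real load, not a formula to trust blindly; the cap parameter applies the 10-20 per-instance guideline and is an assumption:

```python
def suggested_pool_size(core_count, disk_count=1, per_instance_cap=20):
    """Starting pool size: (core_count * 2) + disk_count, capped at a
    typical per-app-instance maximum."""
    return min(core_count * 2 + disk_count, per_instance_cap)
```

For example, a 4-core box with one disk suggests a pool of 9; a 16-core box hits the per-instance cap rather than opening 33 connections.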
PostgreSQL:

- Use PgBouncer in transaction mode for serverless/high-connection scenarios
- Set `idle_in_transaction_session_timeout = '30s'`
- Monitor with `pg_stat_activity`
MySQL:

- Set `max_connections` based on available RAM (each connection uses ~10MB)
- Use ProxySQL for connection multiplexing
- Monitor with `SHOW PROCESSLIST`
### Read Replicas

- Route `SELECT` queries to replicas
- Route all writes to the primary
- Account for replication lag (typically 10-100ms)
- Never read-after-write from a replica; use the primary for consistency-critical reads
- Use connection-level routing, not query-level
```python
# SQLAlchemy read-replica routing: flushes (writes) go to the primary,
# everything else to the replica. `engines` is a dict of Engine objects.
from sqlalchemy.orm import Session

class RoutingSession(Session):
    def get_bind(self, mapper=None, clause=None):
        if self._flushing:
            return engines["primary"]
        return engines["replica"]
```
### Partition Strategies

#### Range Partitioning (time-series data)

```sql
-- PostgreSQL
CREATE TABLE events (
    id bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    data jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2025_q1 PARTITION OF events
    FOR VALUES FROM ('2025-01-01') TO ('2025-04-01');
CREATE TABLE events_2025_q2 PARTITION OF events
    FOR VALUES FROM ('2025-04-01') TO ('2025-07-01');
```
#### Hash Partitioning (even distribution)

```sql
CREATE TABLE sessions (
    id uuid NOT NULL,
    user_id bigint NOT NULL,
    -- PostgreSQL requires the partition key in any PK on a partitioned table
    PRIMARY KEY (id, user_id)
) PARTITION BY HASH (user_id);

CREATE TABLE sessions_0 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE sessions_1 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
```
Partition when tables exceed 50-100GB or when you need to drop old data quickly.
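Range partitions like the quarterly ones above are usually created by a maintenance script rather than by hand. A minimal sketch that generates the DDL for one year (table name and quarterly granularity are assumptions; schedule it ahead of time so inserts never miss a partition):

```python
from datetime import date

def quarterly_partition_ddl(table, year):
    """Generate CREATE TABLE ... PARTITION OF statements covering one year
    of quarterly range partitions."""
    starts = [date(year, m, 1) for m in (1, 4, 7, 10)] + [date(year + 1, 1, 1)]
    ddl = []
    for q, (lo, hi) in enumerate(zip(starts, starts[1:]), start=1):
        ddl.append(
            f"CREATE TABLE {table}_{year}_q{q} PARTITION OF {table} "
            f"FOR VALUES FROM ('{lo}') TO ('{hi}');"
        )
    return ddl
```

Dropping a whole quarter of old data then becomes a fast `DROP TABLE events_2025_q1;` instead of a long-running bulk `DELETE`.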
### Query Optimization Checklist

- Run `EXPLAIN ANALYZE` and read the plan
- Check for sequential scans on tables with >10K rows
- Verify index usage (check `idx_scan` in `pg_stat_user_indexes`)
- Look for implicit type casts that prevent index use
- Replace `SELECT *` with specific columns
- Add `LIMIT` to queries that only need a subset
- Use `EXISTS` instead of `COUNT(*) > 0`
- Batch `INSERT`/`UPDATE` operations (500-1000 rows per batch)
- Avoid functions on indexed columns in `WHERE` clauses
- Monitor the slow query log (PostgreSQL: `log_min_duration_statement = 100`)
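The batching item in the checklist above amounts to splitting rows into chunks and issuing one multi-row statement per chunk. A minimal sketch (the cursor usage in the comment is a DB-API illustration, not tied to a specific driver):

```python
def chunked(rows, size=500):
    """Yield successive batches of at most `size` rows, per the
    500-1000 rows-per-batch guideline."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

# Usage sketch with a DB-API cursor (cursor and SQL are assumptions):
# for batch in chunked(all_rows, 1000):
#     cursor.executemany("INSERT INTO events (data) VALUES (%s)", batch)
```

Batching keeps transactions short and round-trips few, without building one giant statement that blows past packet or memory limits.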
### Dangerous Patterns

- `LIKE '%term%'` on unindexed columns (use full-text search instead)
- `ORDER BY RANDOM()` (use `TABLESAMPLE` or application-level randomization)
- `SELECT DISTINCT` masking a join problem
- Missing `WHERE` on `UPDATE`/`DELETE` (always verify with `SELECT` first)
- Long-running transactions holding locks
- Using `OFFSET` for deep pagination (use keyset/cursor pagination instead)
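Keyset pagination, the `OFFSET` replacement named above, filters past the last key seen instead of skipping rows. A sketch of the query shape (table/column names and the `%(last_seen)s` parameter style are illustrative; the key column must be indexed and unique, or tie-broken with a second column):

```python
def keyset_page_sql(table, key_col, last_seen=None, page_size=50):
    """Build a keyset-pagination query: the first page has no filter,
    later pages seek past the last key instead of using OFFSET."""
    where = f"WHERE {key_col} > %(last_seen)s " if last_seen is not None else ""
    return f"SELECT * FROM {table} {where}ORDER BY {key_col} LIMIT {page_size}"
```

Unlike `OFFSET`, which still scans and discards every skipped row, the seek predicate lets the index jump straight to the page, so page 10,000 costs the same as page 1.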