Some_claude_skills postgresql-optimization
Expert in PostgreSQL performance tuning, query optimization, and database administration. Specializes in EXPLAIN analysis, indexing strategies, connection pooling, partitioning, and production-grade
install
source · Clone the upstream repo
git clone https://github.com/curiositech/some_claude_skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/curiositech/some_claude_skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/postgresql-optimization" ~/.claude/skills/curiositech-some-claude-skills-postgresql-optimization && rm -rf "$T"
manifest:
.claude/skills/postgresql-optimization/SKILL.mdsource content
PostgreSQL Optimization
Overview
Expert in PostgreSQL performance tuning, query optimization, and database administration. Specializes in EXPLAIN analysis, indexing strategies, connection pooling, partitioning, and production-grade PostgreSQL operations.
When to Use
- Diagnosing slow queries with EXPLAIN ANALYZE
- Creating optimal indexes for query patterns
- Designing database schemas for performance
- Configuring PostgreSQL for production workloads
- Implementing connection pooling (PgBouncer, Supavisor)
- Setting up partitioning for large tables
- Analyzing and reducing lock contention
- Migrating or upgrading PostgreSQL versions
Capabilities
Query Optimization
- EXPLAIN / EXPLAIN ANALYZE interpretation
- Query plan analysis and optimization
- Identifying sequential scans vs index scans
- Join optimization and query rewriting
- CTE vs subquery performance trade-offs
- Window function optimization
Indexing Strategies
- B-tree, GIN, GiST, BRIN index selection
- Partial indexes for filtered queries
- Expression indexes for computed values
- Covering indexes (INCLUDE clause)
- Index-only scans optimization
- Concurrent index creation
Schema Design
- Normalization vs denormalization trade-offs
- JSONB column design and indexing
- Array columns and operations
- Enum types vs lookup tables
- Foreign key cascade strategies
- Table inheritance and partitioning
Configuration Tuning
- Memory settings (shared_buffers, work_mem, effective_cache_size)
- Connection limits and pooling
- WAL and checkpoint tuning
- Autovacuum configuration
- Statistics collection settings
Advanced Features
- Partitioning (range, list, hash)
- Materialized views with refresh strategies
- Full-text search with tsvector/tsquery
- PostGIS geospatial queries
- Logical replication setup
- pg_stat_statements analysis
Dependencies
Works well with:
- Schema design and ERD creationdatabase-modeler
- ETL and data processingdata-pipeline-engineer
- Database monitoring and alertingsite-reliability-engineer
- Full-stack data fetchingnextjs-app-router-expert
Examples
Reading EXPLAIN ANALYZE Output
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT u.*, COUNT(o.id) as order_count FROM users u LEFT JOIN orders o ON o.user_id = u.id WHERE u.created_at > '2024-01-01' GROUP BY u.id; -- Key metrics to look for: -- - "Seq Scan" on large tables → needs index -- - "Rows Removed by Filter" high → filter before join -- - "Sort Method: external merge" → increase work_mem -- - "Buffers: shared hit" vs "shared read" → cache efficiency
Creating Effective Indexes
-- Basic B-tree for equality and range queries CREATE INDEX CONCURRENTLY idx_orders_user_created ON orders (user_id, created_at DESC); -- Partial index for common filter CREATE INDEX CONCURRENTLY idx_orders_pending ON orders (created_at) WHERE status = 'pending'; -- GIN index for JSONB containment queries CREATE INDEX CONCURRENTLY idx_products_metadata ON products USING GIN (metadata jsonb_path_ops); -- Covering index to enable index-only scans CREATE INDEX CONCURRENTLY idx_users_email_covering ON users (email) INCLUDE (name, created_at); -- Expression index for case-insensitive search CREATE INDEX CONCURRENTLY idx_users_email_lower ON users (LOWER(email));
Optimizing N+1 Queries
-- BAD: N+1 pattern (1 + N queries) SELECT * FROM posts WHERE user_id = $1; -- Then for each post: SELECT * FROM comments WHERE post_id = $1; -- GOOD: Single query with lateral join SELECT p.*, c.comments FROM posts p LEFT JOIN LATERAL ( SELECT json_agg(c.*) as comments FROM comments c WHERE c.post_id = p.id ) c ON true WHERE p.user_id = $1; -- GOOD: Window function for aggregates SELECT p.*, COUNT(*) OVER (PARTITION BY p.user_id) as user_post_count FROM posts p WHERE p.user_id = $1;
Table Partitioning
-- Create partitioned table by date range CREATE TABLE events ( id BIGSERIAL, event_type TEXT NOT NULL, payload JSONB, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), PRIMARY KEY (id, created_at) ) PARTITION BY RANGE (created_at); -- Create monthly partitions CREATE TABLE events_2024_01 PARTITION OF events FOR VALUES FROM ('2024-01-01') TO ('2024-02-01'); CREATE TABLE events_2024_02 PARTITION OF events FOR VALUES FROM ('2024-02-01') TO ('2024-03-01'); -- Automate partition creation with pg_partman CREATE EXTENSION pg_partman; SELECT partman.create_parent('public.events', 'created_at', 'native', 'monthly');
Connection Pooling Config (PgBouncer)
; pgbouncer.ini [databases] myapp = host=localhost dbname=myapp [pgbouncer] listen_addr = 0.0.0.0 listen_port = 6432 auth_type = md5 auth_file = /etc/pgbouncer/userlist.txt ; Pool settings pool_mode = transaction ; Recommended for most apps max_client_conn = 1000 default_pool_size = 20 reserve_pool_size = 5 ; Timeouts server_idle_timeout = 600 client_idle_timeout = 0
Performance Configuration
-- Check current settings SHOW shared_buffers; -- ~25% of RAM SHOW effective_cache_size; -- ~75% of RAM SHOW work_mem; -- Per-operation, start small (64MB) SHOW maintenance_work_mem; -- For VACUUM, CREATE INDEX (512MB-1GB) -- Recommended production settings (for 32GB RAM server) ALTER SYSTEM SET shared_buffers = '8GB'; ALTER SYSTEM SET effective_cache_size = '24GB'; ALTER SYSTEM SET work_mem = '64MB'; ALTER SYSTEM SET maintenance_work_mem = '1GB'; ALTER SYSTEM SET random_page_cost = 1.1; -- For SSD storage ALTER SYSTEM SET effective_io_concurrency = 200; -- For SSD -- Reload configuration SELECT pg_reload_conf();
Finding Slow Queries
-- Enable pg_stat_statements CREATE EXTENSION pg_stat_statements; -- Top 10 slowest queries by total time SELECT round(total_exec_time::numeric, 2) as total_ms, calls, round(mean_exec_time::numeric, 2) as avg_ms, round((100 * total_exec_time / sum(total_exec_time) OVER())::numeric, 2) as pct, query FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10; -- Queries with most I/O SELECT round(shared_blks_read::numeric, 2) as disk_reads, round(shared_blks_hit::numeric, 2) as cache_hits, round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 2) as cache_hit_ratio, query FROM pg_stat_statements ORDER BY shared_blks_read DESC LIMIT 10;
Analyzing Table Bloat
-- Check table bloat SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) as total_size, pg_size_pretty(pg_relation_size(schemaname || '.' || tablename)) as table_size, n_dead_tup, n_live_tup, round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) as dead_pct FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 10; -- Manual VACUUM for critical tables VACUUM (VERBOSE, ANALYZE) orders; -- Reclaim space (requires exclusive lock) VACUUM FULL orders; -- Use during maintenance window
Best Practices
- Always use EXPLAIN ANALYZE - Don't guess, measure actual query performance
- Create indexes CONCURRENTLY - Avoid blocking writes during index creation
- Partial indexes for hot paths - Index only the rows you query frequently
- Use connection pooling - PgBouncer or Supavisor for production
- Monitor pg_stat_statements - Track query performance over time
- Regular ANALYZE - Keep statistics current for query planner
- **Avoid SELECT *** - Only fetch columns you need
- Batch large updates - Process in chunks to avoid lock contention
- Use prepared statements - Reduce parsing overhead for repeated queries
Common Pitfalls
- Missing indexes - Check for sequential scans on large tables
- Over-indexing - Too many indexes slow down writes
- work_mem too low - Causes disk-based sorts and hash joins
- Connection exhaustion - Not using connection pooling
- Stale statistics - Autovacuum not running frequently enough
- Bloated tables - Not vacuuming after large deletes/updates
- N+1 queries - Fetching related data in loops instead of joins
- SELECT * everywhere - Fetching unnecessary columns