Skilllibrary · postgresql
Install
Source · Clone the upstream repo
git clone https://github.com/merceralex397-collab/skilllibrary
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/merceralex397-collab/skilllibrary "$T" && mkdir -p ~/.claude/skills && cp -r "$T/09-backend-api-and-data/postgresql" ~/.claude/skills/merceralex397-collab-skilllibrary-postgresql && rm -rf "$T"
Manifest: 09-backend-api-and-data/postgresql/SKILL.md
Purpose
Provide concrete, actionable guidance for PostgreSQL schema design, query optimization, connection management, migration workflows, and operational tuning. This skill encodes production-tested patterns so agents produce correct DDL, efficient queries, and safe deployment procedures instead of guessing at PostgreSQL internals.
When to use this skill
Use this skill when:
- designing or modifying PostgreSQL schemas (tables, indexes, constraints, partitions)
- writing or optimizing SQL queries against PostgreSQL (CTEs, window functions, EXPLAIN plans)
- configuring connection pooling (pgbouncer, PgPool-II) or diagnosing connection exhaustion
- creating or reviewing database migrations (alembic, flyway, dbmate, or raw SQL migrations)
- tuning autovacuum, analyzing bloat, or investigating lock contention
- working with PostgreSQL-specific features: advisory locks, pg_stat_statements, LISTEN/NOTIFY, logical replication
Do not use this skill when
- the database is SQLite, MySQL, or another RDBMS — use the appropriate skill instead
- the task is purely ORM-level (model definitions, relationship loading) — use orm-patterns
- the task is BigQuery or analytical warehouse work — use bigquery
- the work is application-layer API design with no direct SQL involvement
Operating procedure
- Identify the PostgreSQL version and extensions in use. Check `SELECT version()` and `\dx` output. Version determines available features (e.g., MERGE in v15+, partitioning improvements in v12+). See the pre-flight sketch after this list.
- Design schema with normalization first, denormalize with evidence. Start at 3NF. Only denormalize when EXPLAIN ANALYZE shows join costs justify it. Document the trade-off.
- Select index types deliberately:
- B-tree (default): equality and range queries on scalar columns
- GIN: array containment (`@>`), full-text search (`tsvector`), JSONB path queries
- GiST: geometric data, range types, nearest-neighbor searches
- BRIN: large append-only tables with naturally ordered data (timestamps, sequences)
- Hash: equality-only lookups (rare; B-tree usually wins)
- Write migrations as versioned, idempotent SQL files. Each migration file gets a sequential ID. Use `CREATE INDEX CONCURRENTLY` to avoid table locks. Always include a down/rollback migration (sketch after this list).
- Run `EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)` on every non-trivial query before shipping. Look for sequential scans on large tables, nested loop joins on big result sets, and high buffer hit ratios that mask I/O problems.
- Configure connection pooling externally. Use pgbouncer in transaction mode for web workloads. Size the pool at `(2 * CPU cores) + effective_spindle_count` for the PostgreSQL server; keep pgbouncer's `default_pool_size` at 2-3× that.
- Set transaction isolation deliberately. Use READ COMMITTED (default) for most OLTP. Use REPEATABLE READ for report queries that must see a consistent snapshot. Use SERIALIZABLE only when correctness requires it, and handle serialization failures with retry logic.
- Monitor and tune autovacuum. Check `pg_stat_user_tables.n_dead_tup` and `last_autovacuum`. For high-write tables, lower `autovacuum_vacuum_scale_factor` to 0.01-0.05. Watch for transaction ID wraparound via `age(datfrozenxid)`. (Tuning sketch after this list.)
- Use advisory locks for application-level coordination: `pg_advisory_lock(key)` for exclusive, `pg_try_advisory_lock(key)` for non-blocking. Always use `pg_advisory_unlock` or session-level locks that release on disconnect (example below).
- Validate with pg_stat_statements. Enable the extension, then query `pg_stat_statements` for top queries by `total_exec_time`, `calls`, and `mean_exec_time`. Target the top 10 queries for optimization (query below).
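To ground steps 1 and 5, a minimal sketch; the `orders` table and its filter are hypothetical placeholders:

```sql
-- Step 1: confirm server version and installed extensions.
SELECT version();
SELECT extname, extversion FROM pg_extension;  -- same information as psql's \dx

-- Step 5: profile a non-trivial query before shipping.
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT id, total
FROM orders
WHERE created_at >= now() - interval '7 days';
```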
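A migration sketch, assuming a dbmate-style runner and hypothetical table/index names; `CREATE INDEX CONCURRENTLY` cannot run inside a transaction block, so the runner must be told not to open one:

```sql
-- 0042_add_orders_customer_idx.sql (hypothetical file name)

-- migrate:up transaction:false
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_customer_id
    ON orders (customer_id);

-- migrate:down transaction:false
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_customer_id;
```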
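For the autovacuum step, the monitoring query plus a per-table override; the table name and the 0.02 setting are illustrative:

```sql
-- Find tables accumulating dead tuples.
SELECT relname, n_dead_tup, n_live_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Vacuum a high-write table at ~2% dead tuples instead of the 20% default.
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.02);
```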
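For advisory locks, a minimal coordination sketch; the key 42 is an arbitrary application-chosen value:

```sql
-- Try to become the sole runner of a job; returns true or false immediately.
SELECT pg_try_advisory_lock(42) AS acquired;

-- ...do the work only if acquired is true...

-- Release explicitly; session-level locks also release on disconnect,
-- which protects against crashed workers holding the lock forever.
SELECT pg_advisory_unlock(42);
```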
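For the validation step, the top-queries report; column names assume PostgreSQL 13+ (`total_exec_time` was `total_time` before v13):

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;  -- after adding it to shared_preload_libraries

SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```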
Decision rules
- Index creation: Never add an index without first checking if an existing index covers the query. Use `pg_stat_user_indexes` to find unused indexes for removal (query after this list).
- CTEs vs subqueries: In PostgreSQL 12+, CTEs are inlined by default. Use the `MATERIALIZED` hint only when you need an optimization fence. In pre-12, prefer subqueries for performance-critical paths.
- Partitioning threshold: Consider range or list partitioning when a table exceeds ~50M rows or when queries consistently filter on the partition key. Prefer declarative partitioning (v10+) over inheritance (DDL sketch below).
- JSONB vs normalized columns: Use JSONB for truly schemaless, rarely-queried metadata. Use typed columns for anything that appears in WHERE clauses, JOINs, or aggregations.
- Sequence vs UUID for PKs: Use `bigserial` for internal IDs (compact, ordered, index-friendly). Use UUIDv7 (time-sortable) for distributed systems or public-facing IDs.
- Lock strategy: Prefer `SELECT ... FOR UPDATE SKIP LOCKED` for queue-like patterns (example below). Avoid `LOCK TABLE` in application code.
- Bulk operations: Use `COPY` for bulk inserts (10-100× faster than row-by-row INSERT). Use `INSERT ... ON CONFLICT` for upserts. Batch updates with CTEs returning modified rows (sketch below).
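For the index-creation rule, a query that surfaces never-scanned indexes; treat the output as removal candidates only, since statistics resets and replica-only reads can hide real usage:

```sql
SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```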
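A declarative partitioning sketch under hypothetical names; note that a primary key on a partitioned table (PostgreSQL 11+) must include the partition key:

```sql
CREATE TABLE events (
    id          bigserial,
    occurred_at timestamptz NOT NULL,
    payload     jsonb,
    PRIMARY KEY (id, occurred_at)   -- must include the partition key
) PARTITION BY RANGE (occurred_at);

-- One partition per month; create the next one ahead of time.
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
```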
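For the lock-strategy rule, the queue-claim pattern; the `jobs` table and its status values are hypothetical:

```sql
-- Claim one pending job; concurrent workers skip rows already locked
-- by someone else instead of blocking on them.
UPDATE jobs
SET status = 'running', started_at = now()
WHERE id = (
    SELECT id
    FROM jobs
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id;
```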
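For bulk operations, a sketch of all three techniques; every name is hypothetical, and the upsert assumes a unique constraint on `sku`:

```sql
-- Bulk load: COPY streams rows from the client (end psql input with \.).
COPY staging_products (sku, name, price) FROM STDIN WITH (FORMAT csv);

-- Upsert the staged rows into the live table.
INSERT INTO products (sku, name, price)
SELECT sku, name, price FROM staging_products
ON CONFLICT (sku) DO UPDATE
SET name = EXCLUDED.name, price = EXCLUDED.price;

-- Batched update through a data-modifying CTE that reports what changed.
WITH repriced AS (
    UPDATE products
    SET price = price * 1.05
    WHERE category = 'books'
    RETURNING id, price
)
SELECT count(*) AS rows_changed FROM repriced;
```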
Anti-patterns
- Missing `WHERE` on `UPDATE`/`DELETE`: Always require a WHERE clause in DML. Use `BEGIN` + verify row count before `COMMIT` (pattern after this list).
- `SELECT *` in production queries: Enumerate columns explicitly. `SELECT *` breaks when columns are added and wastes I/O on unused columns.
- Indexes on low-cardinality columns: An index on a boolean column is almost never useful; PostgreSQL will prefer a sequential scan.
- Using `OFFSET` for pagination: O(n) cost. Use keyset pagination (`WHERE id > $last_seen_id ORDER BY id LIMIT $n`) instead, as shown below.
- Long-running transactions holding locks: Transactions open for minutes block autovacuum and cause table bloat. Set `idle_in_transaction_session_timeout`.
- Not using `CONCURRENTLY` for index creation: plain `CREATE INDEX` takes a `SHARE` lock that blocks writes for the duration of the build. On production tables, always use `CREATE INDEX CONCURRENTLY`.
- Ignoring `pg_stat_statements`: Flying blind on query performance. Enable it in `shared_preload_libraries` and review weekly.
- Storing large blobs in PostgreSQL: Use external object storage (S3) with a URL reference. Large `bytea` columns bloat tables and backups.
Output requirements
Schema Design — DDL with explicit types, constraints, and index definitions
Migration Plan — Ordered migration files with up/down, noting lock implications
Query Plan Analysis — EXPLAIN output with interpretation and optimization notes
Connection Config — pgbouncer or application pool settings with rationale
Validation — Specific queries or commands to verify the change works correctly
References
Read these when relevant to the specific task:
references/implementation-patterns.md — Index selection, pooling config, migration workflows, query optimization, partitioning
references/validation-checklist.md — Pre-deploy checks, index coverage, pool sizing, lock audits, EXPLAIN review
references/failure-modes.md — Connection exhaustion, deadlocks, bloat, wraparound, replication lag
Related skills
orm-patterns — When the task involves ORM model definitions or query generation layered on PostgreSQL
sqlite — When comparing embedded vs server database trade-offs
data-model — When the task is primarily about entity relationships and data modeling
observability-logging — When adding query logging, slow query tracking, or pgbadger analysis
Failure handling
- Connection exhaustion: Check `pg_stat_activity` for idle connections. Verify pgbouncer pool limits. Kill idle-in-transaction sessions with `pg_terminate_backend()` (diagnostic queries after this list).
- Deadlocks detected: Read the PostgreSQL log for deadlock details. Ensure consistent lock ordering across transactions. Consider advisory locks for complex workflows.
- Autovacuum not keeping up: Lower the per-table `autovacuum_vacuum_scale_factor`. Check if long-running transactions are blocking vacuum. Monitor the `n_dead_tup` growth rate.
- Migration failed mid-apply: Check if the migration was wrapped in a transaction. For destructive DDL whose data a down migration cannot restore (e.g., `DROP COLUMN`), use a two-phase approach: deprecate first, remove later.
- Query performance regression: Compare EXPLAIN plans before and after. Check for missing statistics (`ANALYZE table_name`). Verify the index is being used (`SET enable_seqscan = off` as a diagnostic, never in production).
- Transaction ID wraparound warning: This is critical. Run `VACUUM FREEZE` on affected tables immediately. Monitor `age(datfrozenxid)` — emergency at ~2 billion (monitoring query below).
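For connection exhaustion, the diagnostic queries referenced above; the pid is a placeholder:

```sql
-- Longest-idle transactions, the usual suspects for held locks.
SELECT pid, usename, state, now() - state_change AS in_state_for, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY in_state_for DESC;

-- Terminate an offending backend by pid taken from the query above.
SELECT pg_terminate_backend(12345);
```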
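And the wraparound monitor; xid age climbing toward ~2 billion is an emergency:

```sql
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```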