Skillshub analyzing-database-indexes

install
source · Clone the upstream repo
git clone https://github.com/ComeOnOliver/skillshub
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/jeremylongshore/claude-code-plugins-plus-skills/analyzing-database-indexes" ~/.claude/skills/comeonoliver-skillshub-analyzing-database-indexes && rm -rf "$T"
manifest: skills/jeremylongshore/claude-code-plugins-plus-skills/analyzing-database-indexes/SKILL.md
source content

Database Index Advisor

Overview

Analyze database index usage, identify missing indexes causing sequential scans, detect redundant or unused indexes wasting write performance, and recommend optimal index configurations for PostgreSQL and MySQL.

Prerequisites

  • Database credentials with access to
    pg_stat_user_indexes
    ,
    pg_stat_user_tables
    , and
    pg_stat_statements
    (PostgreSQL) or
    performance_schema
    and
    sys
    schema (MySQL)
  • pg_stat_statements
    extension enabled for PostgreSQL query statistics
  • psql
    or
    mysql
    CLI for executing analysis queries
  • Representative workload running (analysis during off-peak hours may miss important query patterns)
  • At least 24 hours of statistics accumulation since the last
    pg_stat_reset()

Instructions

  1. Identify tables with high sequential scan activity (candidates for missing indexes):

    • PostgreSQL:
      SELECT relname, seq_scan, seq_tup_read, idx_scan, n_live_tup FROM pg_stat_user_tables WHERE seq_scan > 100 AND n_live_tup > 10000 ORDER BY seq_tup_read DESC LIMIT 20
    • A table with high
      seq_scan
      count and high
      seq_tup_read
      relative to
      n_live_tup
      is scanning most of the table repeatedly
  2. Find the queries causing sequential scans by correlating with

    pg_stat_statements
    :

    • SELECT query, calls, mean_exec_time, rows FROM pg_stat_statements WHERE query ILIKE '%table_name%' ORDER BY mean_exec_time DESC LIMIT 10
    • Run
      EXPLAIN (ANALYZE, BUFFERS)
      on the top queries to confirm sequential scan usage
  3. Analyze query WHERE clauses and JOIN conditions to determine which columns need indexes. Extract the filtering columns and their selectivity:

    • SELECT column_name, n_distinct, correlation FROM pg_stats WHERE tablename = 'target_table'
    • High
      n_distinct
      (close to row count) indicates good index selectivity
    • correlation
      close to 1.0 or -1.0 suggests the column benefits from a B-tree index
  4. Recommend composite indexes for multi-column queries. Follow the equality-first, range-second ordering:

    • Place columns used with
      =
      operators first in the index
    • Place columns used with
      >
      ,
      <
      ,
      BETWEEN
      , or
      LIKE 'prefix%'
      last
    • Example:
      WHERE status = 'active' AND created_at > '2024-01-01'
      ->
      CREATE INDEX ON orders (status, created_at)
  5. Identify unused indexes wasting write performance:

    • PostgreSQL:
      SELECT indexrelname, idx_scan, pg_size_pretty(pg_relation_size(indexrelid)) AS index_size FROM pg_stat_user_indexes WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey' ORDER BY pg_relation_size(indexrelid) DESC
    • Indexes with zero scans over a representative period are candidates for removal (verify they are not used by foreign key constraints or unique enforcement)
  6. Detect redundant indexes where one index is a prefix of another:

    • A single-column index on
      (customer_id)
      is redundant if a composite index on
      (customer_id, created_at)
      exists, because the composite index serves both single-column and multi-column queries
    • Generate DROP INDEX recommendations for the redundant subset indexes
  7. Evaluate partial indexes for filtered queries. If a query always filters

    WHERE status = 'active'
    :

    • CREATE INDEX idx_orders_active ON orders (created_at) WHERE status = 'active'
    • Partial indexes are smaller and faster than full indexes when the filter eliminates most rows
  8. Consider covering indexes (INCLUDE clause in PostgreSQL 11+) for index-only scans:

    • CREATE INDEX idx_orders_covering ON orders (customer_id, created_at) INCLUDE (total_amount, status)
    • The INCLUDE columns are stored in the index leaf pages, enabling index-only scans without heap access
  9. Estimate the impact of each recommendation:

    • Index size:
      SELECT pg_size_pretty(pg_relation_size('index_name'))
      for existing similar indexes
    • Write overhead: each additional index adds approximately 5-15% write latency per INSERT/UPDATE
    • Read improvement: compare EXPLAIN plans with and without the proposed index
  10. Generate a prioritized recommendations report with CREATE INDEX and DROP INDEX statements, estimated storage impact, expected query improvement, and write overhead trade-off analysis.

Output

  • Missing index recommendations as ready-to-execute CREATE INDEX statements with CONCURRENTLY option
  • Unused index report with DROP INDEX candidates and their storage savings
  • Redundant index report identifying prefix-overlapping indexes
  • Index usage statistics showing scan counts, tuple reads, and sizes for all indexes
  • Impact analysis estimating read improvement vs. write overhead for each recommendation

Error Handling

ErrorCauseSolution
pg_stat_statements
not available
Extension not installed
CREATE EXTENSION pg_stat_statements
and add to
shared_preload_libraries
Index creation blocks writes
CREATE INDEX
acquires exclusive lock on the table
Use
CREATE INDEX CONCURRENTLY
which does not block writes (takes longer but safe for production)
Index not used after creationStatistics not updated or query planner choosing sequential scanRun
ANALYZE table_name
; check
random_page_cost
setting (reduce to 1.1 for SSD); verify query uses indexed columns without functions
Statistics reset unexpectedly
pg_stat_reset()
called or database restart cleared stats
Wait 24-48 hours for statistics to accumulate; set up periodic stats collection to a metrics table
Too many indexes on write-heavy tableEach INSERT/UPDATE must update all indexesTarget 5-7 indexes per table maximum; use composite indexes to replace multiple single-column indexes; remove unused indexes

Examples

Identifying a missing composite index for an API endpoint: The

/orders?customer_id=123&status=active
endpoint takes 2 seconds. Analysis shows the orders table (5M rows) has indexes on
(id)
and
(customer_id)
but not
(customer_id, status)
. The query filters on both columns. Adding
CREATE INDEX CONCURRENTLY idx_orders_customer_status ON orders (customer_id, status)
reduces the query to 5ms.

Cleaning up 8 unused indexes saving 12GB: Index usage analysis reveals 8 indexes with zero scans over 30 days, totaling 12GB of storage. After confirming none are used for FK enforcement or unique constraints, dropping them reduces write latency by 18% and frees disk space. Command:

DROP INDEX CONCURRENTLY idx_name
.

Replacing 3 single-column indexes with 1 composite covering index: Table has separate indexes on

(user_id)
,
(created_at)
, and
(status)
. Most queries filter on all three. A single composite index
(user_id, status, created_at) INCLUDE (amount)
replaces all three, reduces total index storage by 40%, and enables index-only scans for the dashboard query.

Resources