Awesome-omni-skill Database Design
Schema design, normalization, query optimization, and data modeling patterns
install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/development/database-design-fabioc-aloha" ~/.claude/skills/diegosouzapw-awesome-omni-skill-database-design-4b2bed && rm -rf "$T"
manifest:
skills/development/database-design-fabioc-aloha/SKILL.mdsource content
Database Design Skill
Model data to match access patterns. Normalize for integrity, denormalize for performance.
Choosing a Database
Decision Matrix
| Factor | SQL (Relational) | NoSQL (Document) | Key-Value | Graph |
|---|---|---|---|---|
| Schema | Fixed, strict | Flexible | None | Nodes/edges |
| Transactions | ACID built-in | Eventually consistent (usually) | Limited | Varies |
| Joins | Native support | Expensive/manual | N/A | Native (relationships) |
| Scale | Vertical → Horizontal | Horizontal native | Horizontal native | Specialized |
| Best for | Structured data, complex queries | Rapid iteration, varied shape | Cache, sessions | Relationships |
Common Choices (2026)
| Database | Type | Best For |
|---|---|---|
| PostgreSQL | Relational | General purpose, full-featured |
| SQLite | Relational | Embedded, serverless |
| MongoDB | Document | Rapid prototyping, flexible schema |
| Redis | Key-Value | Caching, sessions, pub/sub |
| Cosmos DB | Multi-model | Azure-native, global distribution |
| DynamoDB | Key-Value/Document | AWS-native, massive scale |
| Neo4j | Graph | Social networks, recommendations |
Schema Design Principles
Normalization Levels
| Form | Rule | Trade-off |
|---|---|---|
| 1NF | No repeating groups | Basic structure |
| 2NF | No partial dependencies | Remove redundancy |
| 3NF | No transitive dependencies | Data integrity |
| BCNF | Every determinant is a key | Strict integrity |
When to Denormalize
- Read-heavy workloads
- Complex queries hitting many tables
- When consistency can be eventual
- Reporting/analytics tables
Example: E-Commerce Schema
-- Normalized (3NF) CREATE TABLE customers ( id SERIAL PRIMARY KEY, email VARCHAR(255) UNIQUE NOT NULL, name VARCHAR(255) NOT NULL ); CREATE TABLE orders ( id SERIAL PRIMARY KEY, customer_id INTEGER REFERENCES customers(id), created_at TIMESTAMP DEFAULT NOW(), status VARCHAR(50) NOT NULL ); CREATE TABLE order_items ( id SERIAL PRIMARY KEY, order_id INTEGER REFERENCES orders(id), product_id INTEGER REFERENCES products(id), quantity INTEGER NOT NULL, unit_price DECIMAL(10,2) NOT NULL ); -- Denormalized for reporting CREATE TABLE order_summary ( order_id INTEGER PRIMARY KEY, customer_email VARCHAR(255), customer_name VARCHAR(255), total_amount DECIMAL(10,2), item_count INTEGER, created_at TIMESTAMP );
Modern ORM Patterns
Prisma (TypeScript)
// schema.prisma model User { id Int @id @default(autoincrement()) email String @unique name String? posts Post[] profile Profile? } model Post { id Int @id @default(autoincrement()) title String content String? published Boolean @default(false) author User @relation(fields: [authorId], references: [id]) authorId Int }
// Usage - type-safe queries const user = await prisma.user.findUnique({ where: { email: 'alex@example.com' }, include: { posts: true } });
Drizzle (TypeScript)
// schema.ts import { pgTable, serial, varchar, integer, timestamp } from 'drizzle-orm/pg-core'; export const users = pgTable('users', { id: serial('id').primaryKey(), email: varchar('email', { length: 255 }).unique().notNull(), name: varchar('name', { length: 255 }), }); export const posts = pgTable('posts', { id: serial('id').primaryKey(), title: varchar('title', { length: 255 }).notNull(), authorId: integer('author_id').references(() => users.id), }); // Usage const result = await db.select().from(users).where(eq(users.email, 'alex@example.com'));
Entity Framework (.NET)
public class User { public int Id { get; set; } public string Email { get; set; } = null!; public string? Name { get; set; } public ICollection<Post> Posts { get; } = new List<Post>(); } // Usage var user = await context.Users .Include(u => u.Posts) .FirstOrDefaultAsync(u => u.Email == "alex@example.com");
Query Optimization
The EXPLAIN Plan
-- PostgreSQL EXPLAIN ANALYZE SELECT u.name, COUNT(p.id) as post_count FROM users u LEFT JOIN posts p ON u.id = p.author_id GROUP BY u.id ORDER BY post_count DESC LIMIT 10;
Reading Execution Plans
| Operation | Meaning | Concern |
|---|---|---|
| Seq Scan | Full table scan | Consider index |
| Index Scan | Using index | Good |
| Nested Loop | For each row... | OK for small sets |
| Hash Join | Build hash table | Good for large sets |
| Sort | In-memory sort | Check work_mem |
Index Strategy
-- Single column (most common) CREATE INDEX idx_users_email ON users(email); -- Composite (column order matters!) CREATE INDEX idx_posts_author_date ON posts(author_id, created_at DESC); -- Partial (when you query a subset) CREATE INDEX idx_active_users ON users(email) WHERE status = 'active'; -- Covering (includes all needed columns) CREATE INDEX idx_posts_covering ON posts(author_id) INCLUDE (title, created_at);
Query Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Fetches unnecessary data | List specific columns |
| N+1 queries | Loop makes N queries | Use JOIN or eager loading |
| Missing index on WHERE | Full table scan | Add appropriate index |
| OR in WHERE | Can't use index efficiently | UNION ALL or restructure |
| Functions on columns | | |
Data Modeling Patterns
Soft Deletes
-- Instead of DELETE, update a flag ALTER TABLE users ADD COLUMN deleted_at TIMESTAMP; -- Query active users SELECT * FROM users WHERE deleted_at IS NULL; -- Soft delete UPDATE users SET deleted_at = NOW() WHERE id = 123;
Audit Trails
CREATE TABLE audit_log ( id SERIAL PRIMARY KEY, table_name VARCHAR(100) NOT NULL, record_id INTEGER NOT NULL, action VARCHAR(20) NOT NULL, -- INSERT, UPDATE, DELETE old_values JSONB, new_values JSONB, changed_by INTEGER REFERENCES users(id), changed_at TIMESTAMP DEFAULT NOW() ); -- Trigger for automatic auditing (PostgreSQL) CREATE OR REPLACE FUNCTION audit_trigger() RETURNS TRIGGER AS $$ BEGIN INSERT INTO audit_log (table_name, record_id, action, old_values, new_values) VALUES (TG_TABLE_NAME, COALESCE(NEW.id, OLD.id), TG_OP, to_jsonb(OLD), to_jsonb(NEW)); RETURN NEW; END; $$ LANGUAGE plpgsql;
Temporal Tables
-- PostgreSQL example CREATE TABLE products ( id SERIAL PRIMARY KEY, name VARCHAR(255), price DECIMAL(10,2), valid_from TIMESTAMP DEFAULT NOW(), valid_to TIMESTAMP DEFAULT 'infinity' ); -- Query current data SELECT * FROM products WHERE valid_to = 'infinity'; -- Query historical data (what was the price on Jan 1?) SELECT * FROM products WHERE valid_from <= '2026-01-01' AND valid_to > '2026-01-01';
NoSQL Patterns
Document Modeling (MongoDB)
// ❌ Relational thinking (too normalized) { _id: 1, userId: 'u1', productId: 'p1' } // separate collection // ✅ Document thinking (embed when 1:few or frequent access together) { _id: 'order123', customer: { id: 'c1', name: 'Alex', email: 'alex@example.com' }, items: [ { productId: 'p1', name: 'Widget', price: 9.99, quantity: 2 }, { productId: 'p2', name: 'Gadget', price: 19.99, quantity: 1 } ], total: 39.97, createdAt: ISODate('2026-02-11') }
Embedding vs Referencing
| Embed When | Reference When |
|---|---|
| 1:1 or 1:few relationship | 1:many or many:many |
| Data accessed together | Data accessed independently |
| Data doesn't change often | Data changes frequently |
| Size is bounded | Unbounded growth |
Migration Strategies
Schema Migration Best Practices
- Forward-only migrations — Don't modify existing migrations
- Small, incremental changes — One concern per migration
- Backward compatible — Add before remove
- Test migrations — Run against production copy
Zero-Downtime Schema Changes
-- Phase 1: Add new column (nullable) ALTER TABLE users ADD COLUMN new_email VARCHAR(255); -- Phase 2: Backfill data (in batches) UPDATE users SET new_email = email WHERE new_email IS NULL LIMIT 1000; -- Phase 3: Update application to write to both columns -- Phase 4: Make new column required ALTER TABLE users ALTER COLUMN new_email SET NOT NULL; -- Phase 5: Update application to read from new column -- Phase 6: Drop old column ALTER TABLE users DROP COLUMN email; ALTER TABLE users RENAME COLUMN new_email TO email;
Connection Management
Connection Pooling
// Prisma with connection pool datasource db { provider = "postgresql" url = env("DATABASE_URL") // Pool settings in URL: ?connection_limit=10&pool_timeout=30 } // Node postgres pool const pool = new Pool({ max: 20, // Max connections idleTimeoutMillis: 30000, connectionTimeoutMillis: 2000, });
Connection Pool Sizing
Rule of thumb for PostgreSQL:
connections = (core_count * 2) + spindle_count
For cloud: Start with 10-20, monitor and adjust.
Implementation Checklist
New Database
- Choose appropriate database type
- Design initial schema (3NF or justified denormalization)
- Define primary keys and foreign keys
- Plan indexes for known query patterns
- Set up connection pooling
- Configure backups and point-in-time recovery
- Set up monitoring (query performance, connections)
Production Readiness
- Indexes for all WHERE/JOIN columns
- No N+1 query patterns
- Connection pool sized appropriately
- Query timeouts configured
- Slow query logging enabled
- Backup restoration tested
Related Skills
- performance-profiling — Query-level performance analysis
- api-design — Data shapes for API responses
- infrastructure-as-code — Database provisioning
- security-review — Access control, encryption
Design for today's queries, but plan for tomorrow's scale.