Awesome-omni-skill production-principles

Production-ready development principles that balance simplicity with reliability at a scale of 10-100 MSP customers

install

source · Clone the upstream repo

git clone https://github.com/diegosouzapw/awesome-omni-skill

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/product/production-principles" ~/.claude/skills/diegosouzapw-awesome-omni-skill-production-principles && rm -rf "$T"

manifest: skills/product/production-principles/SKILL.md

source content

Production Development Principles

These are universal production development principles for any project.

Philosophy: Simple, scalable, maintainable. Not MVP shortcuts, not enterprise bloat.

Golden Rules (ALWAYS Follow)

If it works reliably, ship it - Perfect is still the enemy of done
YAGNI until you need it - Don't build for hypothetical futures
Simple files > Complex architecture - Start simple, extract when needed
Direct > Abstract - Prefer direct solutions, abstract when patterns emerge
Start hardcoded, extract when needed - Make it configurable after the 3rd use
Quality matters now - We have paying customers, but over-engineering still hurts
200 lines before extracting - Functions/features can be larger now, but extract at 200 lines

Reality Check (Where We Are)

Your current customer base: Optimize for this scale, not millions
Production beta: Real customers, but still learning
Multi-tenant: Each customer has unique needs
Speed + Reliability: Ship fast, but don't break things
Technical debt payback: Fix issues that impact customers NOW

Patterns to Avoid (Nuanced for Our Scale)

❌ Avoid Unless Justified

These patterns add complexity. Only use if you meet the criteria:

Factory patterns: Avoid unless you have 5+ different implementations
Dependency injection frameworks: Avoid unless team size >5 developers (Go interfaces are fine)
Abstract base classes: Avoid unless you have 3+ concrete implementations
Event sourcing: Avoid unless you have audit requirements or >10,000 events/day
CQRS: Avoid unless read/write performance is a measured bottleneck
Microservices: Avoid unless monolith is >100k LOC or team >10 developers
Complex repository patterns: Avoid unless you have 5+ data sources (direct queries + transactions are fine)
Service meshes: Avoid unless you have >20 services
API gateways: Avoid unless you have >10 backend services (nginx is enough)
Custom frameworks: Avoid unless you're doing the same thing 10+ times

🚫 Still Completely Banned

Premature optimization: Never optimize before measuring
Speculative generality: Never build for "what if" scenarios
Gold plating: Never add features "because it's cool"
Resume-driven development: Never use tech "to learn it"

Patterns to Use (Production-Ready)

✅ Strongly Encouraged:

Simple functions with clear names
Direct database queries with transactions for multi-step operations
Configuration files/env vars (not hardcoded secrets)
Defensive coding (validation, error handling, retries)
Logging and monitoring (errors, performance, business metrics)
Inline code when <3 uses, extract when 3+ uses (Rule of Three)
Database migrations (not raw SQL changes)
Basic caching when queries are measured as slow (>500ms)
Polling with smart intervals (not webhooks unless push is required)
Functions up to 200 lines (extract at 200, not 50)
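
As an illustration of "basic caching when queries are measured as slow", here is a minimal sketch of an in-process cache with a time-to-live. The class name `TTLCache`, the method `get_or_compute`, and the injectable `clock` are assumptions made for this example and are not part of the original skill; the right TTL depends on how stale a result your product can tolerate.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache: entries expire after ttl_seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]  # entry is still fresh, skip the slow call
        value = compute()  # the query that was measured as slow goes here
        self._entries[key] = (now, value)
        return value
```

A plain dictionary like this is enough while one process serves the traffic. Anything shared across processes is a separate decision, to be made once that need is measured.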

Production Concerns (NEW)

🚨 Must Haves for Production

Error Handling
- All external calls wrapped in try/catch
- Errors logged with context (user, customer, operation)
- User-friendly error messages
- Retry logic for transient failures (network, rate limits)
Data Integrity
- Use database transactions for multi-step operations
- Validate inputs before writing to database
- Backups run daily (already set up)
- Soft deletes for critical data (tickets, users)
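
The data integrity points above can be sketched with the standard-library `sqlite3` module. The schema (`tickets`, `audit_log`) and the function names are assumptions for illustration only; the same shape applies to other databases, though the transaction API will differ.

```python
import sqlite3
from datetime import datetime, timezone


def open_db() -> sqlite3.Connection:
    """Create an in-memory demo database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets (id INTEGER PRIMARY KEY, title TEXT NOT NULL, deleted_at TEXT)"
    )
    conn.execute("CREATE TABLE audit_log (ticket_id INTEGER NOT NULL, action TEXT NOT NULL)")
    return conn


def create_ticket(conn: sqlite3.Connection, title: str) -> int:
    # Validate before touching the database.
    if not isinstance(title, str) or not title.strip():
        raise ValueError("title must be a non-empty string")
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the ticket row and its audit row are written together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO tickets (title) VALUES (?)", (title.strip(),))
        ticket_id = cur.lastrowid
        conn.execute(
            "INSERT INTO audit_log (ticket_id, action) VALUES (?, 'created')", (ticket_id,)
        )
    return ticket_id


def soft_delete_ticket(conn: sqlite3.Connection, ticket_id: int) -> None:
    # Soft delete: mark the row instead of removing it, so it can be restored.
    deleted_at = datetime.now(timezone.utc).isoformat()
    with conn:
        conn.execute(
            "UPDATE tickets SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
            (deleted_at, ticket_id),
        )
        conn.execute(
            "INSERT INTO audit_log (ticket_id, action) VALUES (?, 'deleted')", (ticket_id,)
        )
```

Queries that list live tickets would then add `WHERE deleted_at IS NULL`.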
Observability
- Log all errors with stack traces
- Log slow operations (>2s)
- Monitor API response times
- Track business metrics relevant to your product
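
One possible way to cover "log all errors with stack traces" and "log slow operations (>2s)" in a single place is a small context manager around the operation. The name `timed`, the logger name, and the keyword-argument context are assumptions for this sketch, not something the original skill prescribes.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("app.timing")  # hypothetical logger name
SLOW_THRESHOLD_SECONDS = 2.0  # matches the ">2s" guideline above


@contextmanager
def timed(operation, threshold=SLOW_THRESHOLD_SECONDS, clock=time.perf_counter, **context):
    """Log failures with a stack trace and warn when the block runs too long."""
    start = clock()
    try:
        yield
    except Exception:
        # logger.exception records the message at ERROR level plus the traceback.
        logger.exception("operation failed: %s context=%s", operation, context)
        raise
    finally:
        elapsed = clock() - start
        if elapsed > threshold:
            logger.warning(
                "slow operation: %s took %.2fs context=%s", operation, elapsed, context
            )
```

Usage would look like `with timed("sync_tickets", customer_id=42): ...`, where the keyword arguments carry whatever context (user, customer, operation) helps when reading the log later.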
Security
- Never log secrets/API keys in full (log at most the last 4 characters when a key must be identified)
- Validate + sanitize user inputs
- Rate limiting on public endpoints
- Keep dependencies updated (monthly review)
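
For the logging rule above, a helper along these lines keeps only the last four characters of a key. `mask_secret` is a hypothetical name, and fully hiding short values is an assumption added here so that a short secret is not mostly revealed.

```python
def mask_secret(secret: str, visible: int = 4) -> str:
    """Return a log-safe form of a secret that shows only its last characters."""
    if len(secret) <= visible * 2:
        # Too short to reveal a suffix without exposing most of the value.
        return "*" * len(secret)
    return "*" * 4 + secret[-visible:]


# Example with an obviously fake key:
# mask_secret("sk_test_abcd1234") returns "****1234"
```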
Multi-tenancy (if applicable)
- Every query includes tenant ID filter
- Test with multiple tenants
- No cross-tenant data leaks
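
A minimal sketch of the multi-tenancy rules, again using `sqlite3`: every query function takes the tenant ID as a required argument and filters on it, and the check runs against more than one tenant. The table layout, the sample rows, and the function names are invented for the example.

```python
import sqlite3
from typing import List, Tuple


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database holding tickets for two tenants (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets (id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, title TEXT NOT NULL)"
    )
    with conn:
        conn.executemany(
            "INSERT INTO tickets (id, tenant_id, title) VALUES (?, ?, ?)",
            [(1, 1, "Printer offline"), (2, 2, "Password reset"), (3, 1, "VPN drops")],
        )
    return conn


def list_tickets(conn: sqlite3.Connection, tenant_id: int) -> List[Tuple[int, str]]:
    # tenant_id is a required parameter, so a caller cannot forget the filter.
    return conn.execute(
        "SELECT id, title FROM tickets WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    ).fetchall()
```

Making the tenant ID a positional parameter of every data-access function is one cheap way to keep an unfiltered query from being written by accident.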

⚖️ Production vs Speed Balance

Ship Fast (do these)

Inline validation (no validation framework)
Direct SQL queries (no ORM)
Environment variables for config
Simple retry logic (3 attempts, exponential backoff)
File-based logs (rotate daily)
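
The "simple retry logic (3 attempts, exponential backoff)" item could look roughly like this. The function name `retry`, the 0.5s base delay, and the choice of `ConnectionError` and `TimeoutError` as the transient failures are assumptions for the sketch; a real client library will raise its own exception types.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

Only the listed exception types are retried. Anything else, such as a validation error, propagates immediately, because retrying a non-transient failure only delays the error message.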

Take Time (do these right)

Database migrations (use migrate tool)
Authentication/authorization (test thoroughly)
Data export/import (customers depend on this)
Email delivery (use queue + retries)
Payment processing (never cut corners)

Decision Framework (Updated for Production)

Before ANY architectural decision, ask:

Question 1: Is this reliable for production?

If yes: Proceed
If no: What's missing? (error handling, validation, logging)

Question 2: Will 100 customers break this?

If no: Ship it
If yes: What's the bottleneck? Add specific fix (caching, indexing, pagination)

Question 3: Can another dev maintain this in 6 months?

If yes: Good complexity level
If no: Add comments, extract functions, simplify

Question 4: What's the blast radius if this fails?

One user: Ship it, fix if it breaks
One customer: Add error handling + logging
All customers: Add retry logic, monitoring, fallbacks

When to Add Abstraction (NEW)

Triggers for Abstraction

Extract to function/class when:

Rule of Three: Same logic used 3+ times
Domain complexity: Business logic gets complicated (AI logic, ticket routing)
Testing: Hard to test without extraction
Multiple implementations: 3+ ways to do something (Zendesk, Jira, email)
File size: Function/feature exceeds 200 lines

Extraction Examples

✅ Good Abstractions (Justified)

Extract after 3rd duplicate: A validation function used by 3+ handlers
Extract complex business logic: When a single function exceeds 200 lines with conditional logic
Extract when 3+ implementations exist: e.g., EmailProvider, SlackProvider, TeamsProvider (three implementations justify an interface)
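
To make the third case concrete, here is a sketch of what the extracted interface might look like once three providers exist. The provider names come from the example above; the `NotificationProvider` protocol, the `send` signature, and `notify_all` are hypothetical, and the bodies are stubs that return a string instead of calling a real service.

```python
from typing import Iterable, List, Protocol


class NotificationProvider(Protocol):
    """The shared shape that emerged after the third implementation (hypothetical)."""

    def send(self, recipient: str, message: str) -> str: ...


class EmailProvider:
    def send(self, recipient: str, message: str) -> str:
        return f"email to {recipient}: {message}"  # stub: no real delivery


class SlackProvider:
    def send(self, recipient: str, message: str) -> str:
        return f"slack to {recipient}: {message}"  # stub: no real delivery


class TeamsProvider:
    def send(self, recipient: str, message: str) -> str:
        return f"teams to {recipient}: {message}"  # stub: no real delivery


def notify_all(
    providers: Iterable[NotificationProvider], recipient: str, message: str
) -> List[str]:
    # Callers depend only on the protocol, not on any concrete provider.
    return [provider.send(recipient, message) for provider in providers]
```

With a single provider, the direct call would have been simpler. The protocol earns its place only because three concrete classes now share the same shape.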

❌ Still Over-Engineering

Abstract factories when you only have 1 implementation
Generic repository patterns when direct queries work fine
Configuration managers when environment variables are enough

Simplicity Checkpoints (Updated)

Before Starting

Is this the simplest RELIABLE approach?
Do we need this for our current customer base (not 10,000 customers)?
Can this be 1-5 files?
Is error handling included?
Is this easily testable?

During Implementation

Am I adding abstraction before 3rd use?
Am I creating >10 files? (Consolidate related logic)
Did I add error handling + logging?
Would another dev understand this in 6 months?
Is this function >200 lines? (Extract if yes)

Before Committing

Does this handle failures gracefully?
Are errors logged with context?
Is complex logic tested (unit tests)?
Can I deploy this without breaking existing customers?

Scaling Triggers (When to Refactor)

Refactor When You Hit These Limits

Performance (actual, not hypothetical)
- API responses >2s consistently
- Database queries >500ms
- Memory usage growing unbounded
- CPU consistently >70%
Maintainability (team pain)
- Same bug appears 3+ times (extract + fix once)
- Code duplicated 5+ times (extract + reuse)
- New feature takes 2x longer than expected
- Onboarding new dev takes >1 week
Scale (customer impact)
- Customer count exceeding what your current architecture handles
- Request volume exceeding what your database/server can handle
- Database size requiring optimization or sharding
Customer complaints (real problems)
- Specific feature requested by 5+ customers
- Same issue reported 3+ times
- Security concern raised by customer
- A competitor has a feature we don't

Don't Refactor For

"Clean code" principles (if it works reliably)
Hypothetical scale (until you're at 80% of limit)
Latest framework/library (unless security fix)
Personal preferences (consistency > perfection)

Mantras (Updated for Production)

"Simple + Reliable beats complex + perfect"
"Scale when you hit limits, not before"
"Make it work, make it right, make it fast - IN THAT ORDER"
"Abstract after 3rd duplicate, not before"
"Add what you need, remove what you don't"
"Customers don't care about architecture"
"200 lines before extracting, not 50"

When to Add "Enterprise" Patterns

Use an enterprise pattern ONLY when you meet its minimum requirements:

Pattern	Minimum Requirements
Factory Pattern	5+ different implementations
DI Framework	Team of 5+ developers
Microservices	Monolith >100k LOC OR team >10 developers
Event Sourcing	Audit requirement OR >10k events/day
CQRS	Read/write performance measured as bottleneck
Service Mesh	20+ microservices
API Gateway	10+ backend services
Repository Pattern	5+ different data sources

Until you hit these thresholds: Keep it simple

The Prime Directive (Updated)

Build the simplest reliable thing that works for your current customer base. Then ship it.

If you find yourself:

Creating >10 files for a feature
Writing >200 lines without extracting
Thinking about "1000+ customer scalability"
Adding abstraction before 3rd use
Building generic frameworks

STOP and ask:

"What's the simplest RELIABLE way to make this work for 100 customers?"

Remember

You're not building for:

❌ Millions of users (unless you actually have them)
❌ Fortune 500 enterprise (unless you are one)
❌ Infinite scale (you need finite, measured scale)

You're building for:

✅ Your actual current user/customer count
✅ Fast iteration based on real feedback
✅ Reliable service for paying customers
✅ Maintainable codebase that your team can work on

Ship working, reliable code. Ship it fast. Iterate based on customer feedback.