Software_development_department design-system
Decomposes a product concept into architectural components, domain systems, data models, and integration boundaries. Use when starting system architecture or when the user mentions system design or component breakdown.
install
source · Clone the upstream repo
git clone https://github.com/tranhieutt/software_development_department
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/tranhieutt/software_development_department "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/design-system" ~/.claude/skills/tranhieutt-software-development-department-design-system && rm -rf "$T"
manifest: .claude/skills/design-system/SKILL.md
System Design
Phase 1: Clarify requirements (always do this first)
Ask before designing:
- Scale: How many users/requests/day? Read-heavy or write-heavy?
- Consistency: Strong (banking) or eventual (social feed)?
- Availability target: 99.9% (~8.8h/yr downtime) or 99.99% (~53min/yr)?
- Latency budget: p99 < 100ms? < 1s?
- Geography: Single region or multi-region?
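The availability targets above translate directly into a downtime budget. A minimal back-of-envelope sketch (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_per_year(availability_pct: float) -> float:
    """Allowed downtime in hours/year for a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

print(f"99.9%  -> {downtime_per_year(99.9):.1f} h/yr")        # ~8.8 h
print(f"99.99% -> {downtime_per_year(99.99) * 60:.0f} min/yr")  # ~53 min
```

Each extra "nine" cuts the budget by 10x, which is why 99.99% usually forces multi-region redundancy.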
Capacity estimation shortcuts
- 1M active users/day → ~12 req/s average, ~120 req/s peak (10x factor)
- 1KB per request → 1M req/day = ~1GB/day = ~365GB/year
- Read:write ratio 10:1 (typical social) → optimize the read path first
- 1 server handles ~1000 req/s (rule of thumb for I/O-bound services)
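The shortcuts above are just arithmetic; a quick sketch of the same numbers (assuming one request per active user and 1 KB = 1000 bytes):

```python
import math

users_per_day = 1_000_000
seconds_per_day = 86_400

avg_rps = users_per_day / seconds_per_day   # ~11.6 req/s average
peak_rps = avg_rps * 10                     # ~116 req/s at a 10x peak factor

bytes_per_req = 1_000                       # 1 KB payload
gb_per_day = users_per_day * bytes_per_req / 1e9  # ~1 GB/day
gb_per_year = gb_per_day * 365                    # ~365 GB/year

servers = math.ceil(peak_rps / 1_000)       # rule of thumb: ~1000 req/s per box
print(avg_rps, peak_rps, gb_per_year, servers)
```

At this scale a single well-provisioned server covers peak traffic; the estimate mainly tells you when you *don't* need to shard yet.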
Component breakdown template
- Client layer → Web / Mobile / API consumers
- CDN → Static assets, edge caching
- API Gateway → Rate limiting, auth, routing, SSL termination
- Services → Domain-specific services (User, Order, Payment, Notification)
- Cache → Redis for hot data (sessions, rate limits, computed results)
- Database → Primary DB + read replicas
- Message queue → Async operations, event-driven decoupling
- Storage → Object storage for files (S3/GCS)
- Monitoring → Metrics, logs, traces, alerts
Database selection guide
| Need | Choose |
|---|---|
| ACID transactions, relations | PostgreSQL |
| High-scale document store | MongoDB |
| Key-value, cache, pub/sub | Redis |
| Time-series data | TimescaleDB / InfluxDB |
| Graph relationships | Neo4j |
| Full-text search | Elasticsearch |
| Analytical/OLAP | ClickHouse / BigQuery |
Caching strategies
- Cache-aside (read): App checks cache → miss → DB → write to cache
- Write-through: Write to cache AND DB simultaneously (consistent, slower writes)
- Write-behind: Write to cache → async flush to DB (fast writes, risk of loss)
- Read-through: Cache handles DB reads automatically

TTL guidelines:
- Sessions: 15-30 min
- User profile: 5 min
- Product catalog: 1 hour
- Config/settings: 24 hours
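A minimal cache-aside sketch with TTL, using an in-memory dict as a stand-in for Redis (the loader callback and key format are illustrative):

```python
import time

_cache = {}  # key -> (expires_at, value)

def cache_get(key):
    entry = _cache.get(key)
    if entry and entry[0] > time.time():
        return entry[1]       # hit, not expired
    _cache.pop(key, None)     # expired or missing
    return None

def cache_set(key, value, ttl_seconds):
    _cache[key] = (time.time() + ttl_seconds, value)

def get_user_profile(user_id, db_load):
    """Cache-aside read: check cache, fall back to DB, then populate cache."""
    key = f"user:{user_id}"
    value = cache_get(key)
    if value is None:
        value = db_load(user_id)                 # cache miss -> hit the database
        cache_set(key, value, ttl_seconds=300)   # 5 min TTL per the guideline above
    return value
```

Note the application owns the miss path here; with read-through, the cache layer itself would call `db_load`.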
Message queue patterns
When to use queues:
- ✓ Async processing (email, PDF generation, notifications)
- ✓ Rate-limiting downstream services
- ✓ Decoupling services (order → payment → shipping)
- ✓ Fan-out (1 event → multiple consumers)

Queue selection:
- RabbitMQ: complex routing, request-reply, low latency
- Kafka: high throughput, event log/replay, stream processing
- SQS: managed, simple, AWS-native, at-least-once delivery
- Redis Streams: lightweight, same infra as cache
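The async-processing pattern can be sketched with a stdlib queue and a worker thread, standing in for RabbitMQ/Kafka/SQS (the job kinds are illustrative): the request path enqueues and returns immediately, and the worker drains jobs in the background.

```python
import queue
import threading

jobs = queue.Queue()
results = []

def worker():
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut down
            break
        kind, payload = job
        results.append(f"{kind}:{payload}")  # e.g. send email, render PDF
        jobs.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# Request path: enqueue and return without waiting on downstream services.
jobs.put(("email", "order-123"))
jobs.put(("pdf", "invoice-123"))

jobs.join()       # block until drained -- for the demo only; real callers return
jobs.put(None)
t.join()
```

A real broker adds what this sketch lacks: persistence across restarts, retries, and delivery guarantees.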
API design decisions
- REST: Standard CRUD, simple clients, team familiarity (default choice)
- GraphQL: Multiple clients with different data needs, reduce over-fetching
- gRPC: Internal service-to-service, binary protocol, streaming needed
- WebSocket: Real-time bidirectional (chat, live updates, collaborative tools)
Scaling patterns
- Vertical (scale up): more CPU/RAM; quick, but a limited ceiling
- Horizontal (scale out): more instances; requires stateless services
- Database read replicas: offload read traffic (good for 80%+ read workloads)
- Database sharding: shard by user_id or geography; last resort, complex
- CQRS: separate read/write models; for when read/write patterns diverge heavily
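Shard-by-user_id routing reduces to a stable hash mod N. A minimal sketch (using MD5 rather than Python's built-in `hash()`, which varies between process runs):

```python
import hashlib

NUM_SHARDS = 4

def shard_for(user_id: str) -> int:
    """Route a user to a shard deterministically: same user, same shard."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

The mod-N scheme is the "complex, last resort" part: changing NUM_SHARDS remaps most keys, which is why resharding usually means consistent hashing or a lookup-table layer instead.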
Common design mistakes
| Mistake | Better approach |
|---|---|
| Over-engineering for scale you don't have | Start monolith, extract services at clear pain points |
| Synchronous calls to all dependencies | Use async queues for non-critical paths |
| No caching strategy | Cache at API layer + DB query results |
| Storing sessions in DB | Use Redis; DB sessions don't scale horizontally |
| Single point of failure | Redundancy at every critical layer |