Awesome-omni-skill scale
Recommend sharding, caching strategies, and read-replication patterns for Cloudflare architectures. Use this skill when preparing for growth, hitting limits, or optimizing for high traffic.
install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/devops/scale-majiayu000" ~/.claude/skills/diegosouzapw-awesome-omni-skill-scale-a6c559 && rm -rf "$T"
manifest:
skills/devops/scale-majiayu000/SKILL.mdsource content
Cloudflare Scaling Skill
Strategies for scaling Cloudflare architectures beyond default limits while maintaining cost efficiency.
Scaling Decision Matrix
| Bottleneck | Symptom | Solution |
|---|---|---|
| D1 read latency | >50ms queries | Add KV cache layer |
| D1 write throughput | Queue backlog | Batch writes, add queue buffer |
| D1 storage | Approaching 10GB | Archive to R2, partition tables |
| KV read latency | Cache misses | Key prefixing, predictable keys |
| KV write rate | 1 write/sec/key limit | Shard keys, batch writes |
| R2 throughput | Slow uploads | Presigned URLs, multipart |
| Worker memory | 128MB limit | Streaming, chunked processing |
| Worker CPU | 30s timeout | Queues, Workflows, DO |
| Subrequests | 1000/request limit | Service Bindings RPC |
| Queue throughput | Consumer lag | Increase concurrency, batch size |
Caching Strategies
Cache Hierarchy
Request → Edge Cache → KV Cache → D1 → Origin (Tiered) (Global) (Primary)
KV Cache Patterns
Write-Through Cache
async function getWithCache<T>( kv: KVNamespace, db: D1Database, key: string, query: () => Promise<T>, ttl: number = 3600 ): Promise<T> { // Try cache first const cached = await kv.get(key, 'json'); if (cached !== null) { return cached as T; } // Cache miss - fetch from D1 const fresh = await query(); // Write to cache (non-blocking) kv.put(key, JSON.stringify(fresh), { expirationTtl: ttl }); return fresh; }
Cache Invalidation
// Pattern 1: TTL-based (simple, eventual consistency) await kv.put(key, value, { expirationTtl: 300 }); // 5 min // Pattern 2: Version-based (immediate, more complex) const version = await kv.get('cache:version'); const key = `data:${id}:v${version}`; // Invalidate by incrementing version await kv.put('cache:version', String(Number(version) + 1)); // Pattern 3: Tag-based (flexible, requires cleanup) await kv.put(`user:${userId}:profile`, data); await kv.put(`user:${userId}:settings`, settings); // Invalidate all user data const keys = await kv.list({ prefix: `user:${userId}:` }); for (const key of keys.keys) { await kv.delete(key.name); }
Tiered Cache (Cloudflare CDN)
Enable in Worker for static-like responses:
// Cache API for fine-grained control const cache = caches.default; app.get('/api/products/:id', async (c) => { const cacheKey = new Request(c.req.url); // Check cache const cached = await cache.match(cacheKey); if (cached) { return cached; } // Fetch fresh data const product = await getProduct(c.env.DB, c.req.param('id')); const response = c.json(product); response.headers.set('Cache-Control', 's-maxage=300'); // Store in cache c.executionCtx.waitUntil(cache.put(cacheKey, response.clone())); return response; });
Sharding Strategies
Key-Based Sharding (KV)
When hitting 1 write/sec/key limit:
// Problem: High-frequency counter await kv.put('page:views', views); // Limited to 1/sec // Solution: Shard across multiple keys const SHARD_COUNT = 10; async function incrementCounter(kv: KVNamespace, key: string) { const shard = Math.floor(Math.random() * SHARD_COUNT); const shardKey = `${key}:shard:${shard}`; const current = Number(await kv.get(shardKey)) || 0; await kv.put(shardKey, String(current + 1)); } async function getCounter(kv: KVNamespace, key: string): Promise<number> { let total = 0; for (let i = 0; i < SHARD_COUNT; i++) { const value = await kv.get(`${key}:shard:${i}`); total += Number(value) || 0; } return total; }
Time-Based Sharding (D1)
For high-volume time-series data:
-- Partition by month CREATE TABLE events_2025_01 ( id TEXT PRIMARY KEY, timestamp TEXT NOT NULL, data TEXT ); CREATE TABLE events_2025_02 ( id TEXT PRIMARY KEY, timestamp TEXT NOT NULL, data TEXT ); -- Query router in code function getEventsTable(date: Date): string { const year = date.getFullYear(); const month = String(date.getMonth() + 1).padStart(2, '0'); return `events_${year}_${month}`; }
Entity-Based Sharding (D1)
For multi-tenant applications:
// Tenant-specific D1 databases interface Bindings { DB_TENANT_A: D1Database; DB_TENANT_B: D1Database; // Or use Hyperdrive for external Postgres } function getDbForTenant(env: Bindings, tenantId: string): D1Database { const dbMapping: Record<string, D1Database> = { 'tenant-a': env.DB_TENANT_A, 'tenant-b': env.DB_TENANT_B, }; return dbMapping[tenantId] ?? env.DB_DEFAULT; }
Read Replication Patterns
D1 Read Replicas
D1 automatically creates read replicas. Optimize access:
// Enable Smart Placement in wrangler.jsonc { "placement": { "mode": "smart" } } // Worker runs near data, reducing latency
Multi-Region with Durable Objects
For global coordination with regional caching:
// Durable Object for region-local state export class RegionalCache { private state: DurableObjectState; private cache: Map<string, { value: unknown; expires: number }>; constructor(state: DurableObjectState) { this.state = state; this.cache = new Map(); } async fetch(request: Request): Promise<Response> { const url = new URL(request.url); const key = url.searchParams.get('key'); if (request.method === 'GET' && key) { const cached = this.cache.get(key); if (cached && cached.expires > Date.now()) { return Response.json({ value: cached.value, source: 'cache' }); } return Response.json({ value: null, source: 'miss' }); } if (request.method === 'PUT' && key) { const { value, ttl } = await request.json(); this.cache.set(key, { value, expires: Date.now() + (ttl * 1000), }); return Response.json({ success: true }); } return Response.json({ error: 'Invalid request' }, { status: 400 }); } }
Eventual Consistency Pattern
For data that can tolerate staleness:
interface CacheEntry<T> { data: T; cachedAt: number; staleAfter: number; expireAfter: number; } async function getWithStaleWhileRevalidate<T>( kv: KVNamespace, key: string, fetcher: () => Promise<T>, options: { staleAfter: number; // Serve stale, revalidate in background expireAfter: number; // Force fresh fetch } ): Promise<T> { const cached = await kv.get<CacheEntry<T>>(key, 'json'); const now = Date.now(); if (cached) { // Fresh - return immediately if (now < cached.staleAfter) { return cached.data; } // Stale but not expired - return stale, revalidate async if (now < cached.expireAfter) { // Background revalidation kv.put(key, JSON.stringify(await buildCacheEntry(fetcher, options))); return cached.data; } } // Expired or missing - must fetch fresh const entry = await buildCacheEntry(fetcher, options); await kv.put(key, JSON.stringify(entry)); return entry.data; } async function buildCacheEntry<T>( fetcher: () => Promise<T>, options: { staleAfter: number; expireAfter: number } ): Promise<CacheEntry<T>> { const now = Date.now(); return { data: await fetcher(), cachedAt: now, staleAfter: now + options.staleAfter, expireAfter: now + options.expireAfter, }; }
Queue Scaling
Horizontal Scaling (Concurrency)
{ "queues": { "consumers": [ { "queue": "events", "max_batch_size": 100, // Max messages per invocation "max_concurrency": 20, // Parallel consumer instances "max_retries": 1, "dead_letter_queue": "events-dlq" } ] } }
Batch Processing Optimization
export default { async queue(batch: MessageBatch, env: Bindings) { // Group messages for efficient D1 batching const byType = new Map<string, unknown[]>(); for (const msg of batch.messages) { const type = msg.body.type; if (!byType.has(type)) byType.set(type, []); byType.get(type)!.push(msg.body.payload); } // Process each type as a batch for (const [type, payloads] of byType) { await processBatch(type, payloads, env); } // Ack all at once batch.ackAll(); }, }; async function processBatch( type: string, payloads: unknown[], env: Bindings ) { // Batch insert to D1 (≤1000 at a time) const BATCH_SIZE = 1000; for (let i = 0; i < payloads.length; i += BATCH_SIZE) { const chunk = payloads.slice(i, i + BATCH_SIZE); await insertBatch(env.DB, type, chunk); } }
Memory Management
Streaming for Large Payloads
// Problem: Loading entire file into memory const data = await r2.get(key); const json = await data.json(); // May exceed 128MB // Solution: Stream processing app.get('/export/:key', async (c) => { const object = await c.env.R2.get(c.req.param('key')); if (!object) return c.json({ error: 'Not found' }, 404); return new Response(object.body, { headers: { 'Content-Type': object.httpMetadata?.contentType ?? 'application/octet-stream', 'Content-Length': String(object.size), }, }); });
Chunked Processing with Durable Objects
// For files >50MB, process in chunks with checkpointing export class ChunkedProcessor { private state: DurableObjectState; async processFile(r2Key: string, chunkSize: number = 1024 * 1024) { // Get checkpoint let offset = (await this.state.storage.get<number>('offset')) ?? 0; const object = await this.env.R2.get(r2Key, { range: { offset, length: chunkSize }, }); if (!object) { // Processing complete await this.state.storage.delete('offset'); return { complete: true }; } // Process chunk await this.processChunk(await object.arrayBuffer()); // Save checkpoint await this.state.storage.put('offset', offset + chunkSize); // Schedule next chunk via alarm await this.state.storage.setAlarm(Date.now() + 100); return { complete: false, offset: offset + chunkSize }; } }
Scaling Checklist
Before Launch
- KV cache layer for hot data
- D1 indexes on all query columns
- Queue DLQs configured
- Smart Placement enabled
- Batch size ≤1000 for D1 writes
At 10K req/day
- Monitor D1 query performance
- Review cache hit rates
- Check queue consumer lag
At 100K req/day
- Add Analytics Engine for metrics (free)
- Consider key sharding for hot KV keys
- Review batch processing efficiency
At 1M req/day
- Implement Tiered Cache
- Add read-through caching
- Consider time-based D1 partitioning
- Review and optimize indexes
At 10M req/day
- Multi-region strategy
- Entity-based sharding
- Durable Objects for coordination
- Custom rate limiting
Cost Implications
| Scaling Strategy | Additional Cost | When to Use |
|---|---|---|
| KV caching | $0.50/M reads | D1 read heavy |
| Key sharding | More KV reads | >1 write/sec/key |
| Time partitioning | None (same D1) | >10GB data |
| Tiered Cache | None (CDN) | Cacheable responses |
| DO coordination | CPU time | Global state |
| Queue scaling | Per message | High throughput |
Anti-Patterns
| Pattern | Problem | Solution |
|---|---|---|
| Cache everything | KV costs add up | Cache hot data only |
| Shard too early | Complexity without benefit | Monitor first |
| Ignore TTLs | Stale data | Set appropriate TTLs |
| Skip DLQ | Lost messages | Always add DLQ |
| Over-replicate | Cost multiplication | Right-size replication |