# Server-Side Caching

`agents/skills/claude-code/perf-server-side-caching/SKILL.md`

Install by cloning the repository, or copy just this skill into `~/.claude/skills`:

```bash
git clone https://github.com/Intense-Visions/harness-engineering
```

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/Intense-Visions/harness-engineering "$T" && mkdir -p ~/.claude/skills && cp -r "$T/agents/skills/claude-code/perf-server-side-caching" ~/.claude/skills/intense-visions-harness-engineering-perf-server-side-caching-26d5d3 && rm -rf "$T"
```

Design and implement server-side caching strategies — cache-aside, read-through, write-through, and write-behind patterns with Redis and Memcached, multi-tier caching architectures, serialization optimization, and distributed cache consistency.
## When to Use
- Database queries are repeated with identical parameters across multiple requests
- API endpoint response times exceed 100ms for data that changes infrequently
- Database CPU utilization is high and read-heavy workload dominates
- Application needs sub-millisecond read latency for frequently accessed data
- You need to choose between Redis and Memcached for a caching layer
- User sessions, authentication tokens, or rate-limiting counters need fast storage
- A multi-tier caching strategy (in-process + distributed) is being designed
- Write-heavy workloads need buffering to reduce database write pressure
- Cache warming is needed after deployments or cache failures
- Microservices share data that would benefit from a centralized cache
## Instructions

- **Choose a caching pattern.** Select based on read/write ratio and consistency requirements:

  ```
  Cache-Aside (Lazy Loading):
    Read:  App → Cache → [miss] → DB → Write to Cache → Return
    Write: App → DB → Invalidate Cache

  Read-Through:
    Read:  App → Cache → [miss] → Cache fetches from DB → Return
    Write: App → DB → Invalidate Cache

  Write-Through:
    Read:  App → Cache → [miss] → DB → Return
    Write: App → Cache → Cache writes to DB → Return

  Write-Behind (Write-Back):
    Read:  App → Cache → [miss] → DB → Return
    Write: App → Cache → Return (async write to DB later)
  ```
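
  Cache-aside is implemented in the next step. As a contrast, here is a minimal sketch of write-behind, assuming the same `redis` and `db` handles used in the later examples; the in-memory queue and one-second flush interval are illustrative, and a production system would use a durable queue so buffered writes survive a crash:

  ```javascript
  // Write-behind sketch: acknowledge the write from cache, persist to the DB later.
  const pendingWrites = [];

  async function updateUserWriteBehind(userId, data) {
    await redis.setex(`user:${userId}`, 3600, JSON.stringify(data));
    pendingWrites.push({ userId, data }); // DB write is deferred
  }

  // Flush buffered writes to the database once per second
  setInterval(async () => {
    while (pendingWrites.length > 0) {
      const { userId, data } = pendingWrites.shift();
      await db.users.update(userId, data);
    }
  }, 1000);
  ```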
- **Implement cache-aside pattern.** The most common server-side caching pattern:

  ```javascript
  async function getUser(userId) {
    const cacheKey = `user:${userId}`;

    // Try cache first
    const cached = await redis.get(cacheKey);
    if (cached) {
      return JSON.parse(cached);
    }

    // Cache miss: fetch from database
    const user = await db.users.findById(userId);

    // Write to cache with TTL
    await redis.setex(cacheKey, 3600, JSON.stringify(user));
    return user;
  }

  async function updateUser(userId, data) {
    // Write to database
    await db.users.update(userId, data);

    // Invalidate cache (not update — avoids race conditions)
    await redis.del(`user:${userId}`);
  }
  ```
- **Choose between Redis and Memcached.**

  | Feature | Redis | Memcached |
  |---|---|---|
  | Data structures | Strings, hashes, lists, sets, sorted sets, streams | Strings only |
  | Persistence | RDB snapshots, AOF log | None (pure cache) |
  | Replication | Built-in primary/replica | None native |
  | Memory efficiency | Slightly higher overhead per key | Lower overhead, slab allocator |
  | Max value size | 512MB | 1MB (default) |
  | Threading | Single-threaded event loop + I/O threads (Redis 6+) | Multi-threaded |
  | Use case | Complex data, pub/sub, persistence needed | Simple key-value, maximum throughput |

  Default recommendation: Redis for most applications (richer feature set); Memcached for simple key-value with very high throughput requirements.
- **Implement multi-tier caching.** Combine in-process (L1) and distributed (L2) caches:

  ```javascript
  const lruCache = new Map(); // L1: in-process, ~1000 entries
  const LRU_MAX = 1000;
  const LRU_TTL = 30000; // 30 seconds

  async function getCached(key) {
    // L1: check in-process cache (sub-microsecond)
    const l1 = lruCache.get(key);
    if (l1 && Date.now() - l1.timestamp < LRU_TTL) {
      return l1.value;
    }

    // L2: check Redis (~0.5-1ms)
    const l2 = await redis.get(key);
    if (l2) {
      const value = JSON.parse(l2);
      lruCache.set(key, { value, timestamp: Date.now() });
      if (lruCache.size > LRU_MAX) {
        // Evict the oldest entry (Map preserves insertion order)
        const firstKey = lruCache.keys().next().value;
        lruCache.delete(firstKey);
      }
      return value;
    }

    return null; // Cache miss — caller fetches from DB
  }
  ```
- **Optimize serialization.** JSON is human-readable but not the most efficient. For high-throughput caching:

  - JSON — universally supported, ~100-500MB/s serialization
  - MessagePack — binary JSON, 30-50% smaller, 2-3x faster
  - Protocol Buffers — schema-based, smallest size, fastest decode, requires schema management
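
  As a hedged sketch of the MessagePack option, the cache-aside helper above can swap JSON for binary encoding. The `@msgpack/msgpack` package and an ioredis-style client that exposes `getBuffer()` for binary reads are assumptions, not part of the original example:

  ```javascript
  import { encode, decode } from "@msgpack/msgpack"; // assumed npm package

  async function getUserMsgpack(userId) {
    const cacheKey = `user:${userId}`;

    // Binary read: getBuffer() returns a Buffer instead of a string
    const cached = await redis.getBuffer(cacheKey);
    if (cached) {
      return decode(cached);
    }

    const user = await db.users.findById(userId);

    // encode() returns a Uint8Array, typically 30-50% smaller than JSON text
    await redis.setex(cacheKey, 3600, Buffer.from(encode(user)));
    return user;
  }
  ```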
- **Set appropriate TTLs.** TTL prevents stale data accumulation and bounds memory growth:

  - Session data: 30 minutes - 24 hours
  - User profiles: 5-15 minutes
  - Product catalog: 1-5 minutes
  - Rate-limiting counters: match the rate window (1 minute, 1 hour)
  - Config/feature flags: 30-60 seconds
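
  A minimal sketch of centralizing these TTLs, assuming the same `redis` handle as in the earlier examples; the helper name is illustrative and the values are taken from the list above:

  ```javascript
  // TTLs per data type, in seconds (values follow the guidance above)
  const TTL_SECONDS = {
    session: 24 * 60 * 60,   // upper bound of the 30 minute - 24 hour range
    userProfile: 10 * 60,    // within 5-15 minutes
    productCatalog: 3 * 60,  // within 1-5 minutes
    rateLimit: 60,           // match the rate window (here: 1 minute)
    featureFlags: 60,        // within 30-60 seconds
  };

  async function cacheSet(kind, key, value) {
    await redis.setex(key, TTL_SECONDS[kind], JSON.stringify(value));
  }
  ```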
- **Handle cache failures gracefully.** The cache is not the source of truth — the database is:

  ```javascript
  async function getUserWithFallback(userId) {
    try {
      const cached = await redis.get(`user:${userId}`);
      if (cached) return JSON.parse(cached);
    } catch (error) {
      // Cache is down — fall through to database
      logger.warn('Cache unavailable, falling back to DB', { error });
    }
    return db.users.findById(userId);
  }
  ```

## Details

### Cache Consistency in Distributed Systems
In a distributed system, cache consistency depends on the invalidation pattern:
- Delete on write (preferred) — delete cache entry when source data changes. Next read triggers cache-aside refill. Avoids race conditions where two concurrent updates write conflicting values.
- Update on write (risky) — update cache with new value on write. Race condition: if two writes happen concurrently, cache may contain the loser's value while DB has the winner's value.
- TTL-based expiry — accept eventual consistency. Cache may be stale for up to TTL seconds. Simplest to implement.
### Worked Example: Instagram Redis Caching
Instagram uses Redis for caching user timelines, storing 300 million user sessions with sub-millisecond read latency. Their architecture uses consistent hashing across a Redis cluster to distribute keys evenly. Each Redis node handles ~100,000 operations per second. They use Redis hashes for user profile data (storing fields individually rather than serializing the entire profile), enabling partial reads and updates without deserializing the full object. Session data uses Redis strings with 24-hour TTL for automatic cleanup.
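A minimal sketch of the hash-per-profile idea described above, assuming an ioredis-style client; the key format and field names are illustrative, not Instagram's actual schema:

```javascript
async function saveProfile(redis, userId, profile) {
  const key = `profile:${userId}`;
  // Store each field individually instead of one serialized blob
  await redis.hset(key, profile);
  await redis.expire(key, 3600); // TTL on the whole hash (value illustrative)
}

async function getProfileSummary(redis, userId) {
  // Partial read: fetch only the fields the caller needs
  const [username, followers] = await redis.hmget(`profile:${userId}`, 'username', 'followers');
  return { username, followers: Number(followers) };
}

async function incrementFollowers(redis, userId) {
  // Partial update: bump one field without deserializing the full profile
  await redis.hincrby(`profile:${userId}`, 'followers', 1);
}
```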
### Worked Example: GitHub Multi-Tier Caching
GitHub reduced database load by 50% by implementing a three-tier caching strategy for repository metadata: (1) per-process LRU cache (L1, ~5MB per process, 10-second TTL) handles repeated reads within a single request lifecycle, (2) Redis cluster (L2, 100GB total) handles cross-process caching with 5-minute TTL, (3) CDN edge caching (L3) handles public repository pages with 60-second TTL. A single page view for a popular repository like
torvalds/linux hits L1 for repeated reads of the same repo metadata during template rendering, L2 for repo stats and contributor data, and L3 for the rendered HTML.
## Anti-Patterns
**Caching without TTL.** Without a TTL, cache entries live forever: stale data accumulates and eventually consumes all available memory, at which point Redis falls back to its eviction policy (the default, `noeviction`, rejects new writes rather than evicting keys). Always set explicit TTLs.
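A minimal `redis.conf` sketch for bounding memory alongside per-key TTLs; the 2gb limit and the `allkeys-lru` policy are illustrative choices, not recommendations from this document:

```
# Cap memory and evict least-recently-used keys instead of rejecting writes
maxmemory 2gb
maxmemory-policy allkeys-lru
```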
**Serializing entire ORM objects.** ORM objects include metadata, lazy-loaded relation proxies, and internal state. A User ORM object might serialize to 5KB when the actual data is 500 bytes. Extract plain data objects before caching.
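A minimal sketch of extracting a plain object before caching; `toCacheableUser()` and its field names are hypothetical, not from the original text:

```javascript
function toCacheableUser(ormUser) {
  // Pick only the fields callers actually need from the cache
  const { id, email, displayName, avatarUrl } = ormUser;
  return { id, email, displayName, avatarUrl };
}

async function cacheUser(redis, ormUser) {
  const plain = toCacheableUser(ormUser);
  await redis.setex(`user:${plain.id}`, 3600, JSON.stringify(plain));
}
```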
**Using cache as primary data store without persistence.** Redis can persist data (RDB/AOF), but cache instances are often configured without persistence for performance. If the cache restarts, all data is lost. Never use a non-persistent cache as the sole store for important data.
**Not handling cache failures gracefully.** If every request requires the cache and the cache goes down, the entire application fails. Implement circuit breakers and database fallback for when the cache is unavailable.
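A minimal circuit-breaker sketch around cache reads, assuming the `redis` and `db` handles from the earlier examples; the thresholds are arbitrary and there is no half-open state, so treat it as a starting point rather than a full implementation:

```javascript
let failures = 0;
let openUntil = 0; // while this timestamp is in the future, skip the cache entirely
const FAILURE_THRESHOLD = 5;
const OPEN_MS = 30_000; // stay open for 30 seconds before trying the cache again

async function getUserResilient(userId) {
  if (Date.now() >= openUntil) {
    try {
      const cached = await redis.get(`user:${userId}`);
      failures = 0; // a successful cache call closes the breaker
      if (cached) return JSON.parse(cached);
    } catch (error) {
      failures += 1;
      if (failures >= FAILURE_THRESHOLD) {
        openUntil = Date.now() + OPEN_MS; // trip: stop hammering a dead cache
        failures = 0;
      }
    }
  }
  return db.users.findById(userId); // database fallback
}
```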
## Source
- Redis documentation — https://redis.io/docs/
- Memcached wiki — https://github.com/memcached/memcached/wiki
- "Scaling Memcache at Facebook" (NSDI 2013) — https://www.usenix.org/conference/nsdi13/technical-sessions/presentation/nishtala
- Martin Kleppmann, "Designing Data-Intensive Applications" (O'Reilly), Chapter 5
- AWS ElastiCache best practices — https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/BestPractices.html
## Process
- Read the instructions and examples in this document.
- Apply the patterns to your implementation, adapting to your specific context.
- Verify your implementation against the details and edge cases listed above.
## Harness Integration
- Type: knowledge — this skill is a reference document, not a procedural workflow.
- No tools or state — consumed as context by other skills and agents.
## Success Criteria
- The patterns described in this document are applied correctly in the implementation.
- Edge cases and anti-patterns listed in this document are avoided.
- Cache hit rate exceeds 80% for read-heavy workloads with appropriate TTLs set per data type (one way to measure this is sketched after this list).
- Cache failures do not cascade to application failures (graceful degradation to database).
- Memory consumption is bounded with explicit TTLs and eviction policies configured.
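
One hedged way to check the hit-rate criterion above is to read Redis's own counters; this sketch assumes an ioredis-style client whose `info('stats')` call returns the raw INFO text:

```javascript
// Compute hit rate from Redis INFO stats (keyspace_hits / keyspace_misses)
async function cacheHitRate(redis) {
  const stats = await redis.info('stats');
  const hits = Number(/keyspace_hits:(\d+)/.exec(stats)?.[1] ?? 0);
  const misses = Number(/keyspace_misses:(\d+)/.exec(stats)?.[1] ?? 0);
  const total = hits + misses;
  return total === 0 ? 0 : hits / total; // target: > 0.8 for read-heavy workloads
}
```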