Harness-engineering api-http-caching

HTTP Caching

install
source · Clone the upstream repo
git clone https://github.com/Intense-Visions/harness-engineering
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Intense-Visions/harness-engineering "$T" && mkdir -p ~/.claude/skills && cp -r "$T/agents/skills/claude-code/api-http-caching" ~/.claude/skills/intense-visions-harness-engineering-api-http-caching && rm -rf "$T"
manifest: agents/skills/claude-code/api-http-caching/SKILL.md

HTTP caching is a first-class performance mechanism built into the HTTP protocol. Correct Cache-Control directives and ETag generation can eliminate redundant network round-trips and reduce origin load by orders of magnitude. Misconfigured caching either serves stale data or prevents caching entirely, which wastes infrastructure and adds latency.

When to Use

  • Setting cache headers on a new API endpoint for the first time
  • Diagnosing why a CDN is not caching expected responses
  • Implementing ETag generation for a resource that clients frequently poll
  • Deciding between max-age, s-maxage, and no-cache for a given endpoint
  • Configuring cache invalidation after a write operation
  • Reviewing a PR that adds Cache-Control: no-store to every response
  • Designing the Vary header strategy for an endpoint that uses content negotiation
  • Understanding how CDN behavior differs from browser cache behavior

Instructions

Key Concepts

  1. Cache-Control Directives: The primary mechanism for controlling caching behavior. Applied to both requests and responses. The most important response directives:

    • max-age=N: The response is fresh for N seconds from the response time. Applies to all caches (browser and shared/CDN).
    • s-maxage=N: Overrides max-age for shared caches (CDNs, proxies) only. Browser caches still use max-age.
    • no-cache: The response may be stored but must be revalidated with the origin before each use. It does not mean "do not cache"; it forces revalidation.
    • no-store: The response must not be stored anywhere. Use only for truly sensitive data (session tokens, PII). This is the only directive that truly prevents caching.
    • private: The response may only be cached by the browser (not CDNs or proxies). Use for user-specific responses.
    • public: The response may be cached by any cache, including shared CDN caches, even when the request included an Authorization header.
    • immutable: Tells the browser the response will never change while fresh, so it can skip revalidation entirely (useful for fingerprinted assets).
    • stale-while-revalidate=N: Serve stale content for up to N seconds while revalidating in the background. Improves perceived latency.
  2. ETag (Entity Tag): A validator representing the current version of a resource. It is generated by the server, included in responses, and sent back by clients in conditional requests. Strong ETags ("abc123") guarantee byte-for-byte identity. Weak ETags (W/"abc123") indicate semantic equivalence: the content is equivalent, but the bytes may differ (for example, under a different encoding). ETags enable cache revalidation: the cache sends the stored ETag and receives either 304 Not Modified (the cached copy is current) or 200 OK with a new body.

    HTTP/1.1 200 OK
    ETag: "d41d8cd98f00b204e9800998ecf8427e"
    Cache-Control: max-age=300
    
  3. Last-Modified Header: A weaker validator based on timestamps. It is less precise than an ETag (1-second granularity) but simpler to generate. When both ETag and Last-Modified are present, prefer the ETag for revalidation. Clients revalidate Last-Modified with If-Modified-Since and ETag with If-None-Match (see api-conditional-requests).

  4. Vary Header: Tells caches to store separate cache entries for responses that differ by the specified request headers. Critical for content-negotiated responses. Without Vary, a CDN may serve a cached JSON response to a client requesting XML.

    Vary: Accept, Accept-Encoding, Authorization
    

    Caution: Vary: Authorization effectively disables CDN caching because most CDNs will not cache responses that vary by Authorization. For authenticated but shared resources (e.g., public API data requiring auth), use s-maxage and Cache-Control: public to allow CDN caching regardless of the Authorization header, but only when the data is truly the same for all authenticated users.

  5. CDN vs. Browser Cache: Browser caches are private (per-user). CDN/proxy caches are shared. s-maxage controls the shared-cache TTL. The private directive prevents CDN caching entirely. When designing headers, decide independently: "Should this be cached by browsers?" and "Should this be cached by CDNs?"

  6. Cache Invalidation: When a resource changes, caches serving the old version must be invalidated. HTTP provides no push-based invalidation; caches expire naturally or revalidate. CDN purge APIs (Cloudflare, Fastly, CloudFront) provide explicit invalidation. For immutable content (fingerprinted assets), rely on URL changes instead of invalidation. For mutable API resources, keep max-age low or use stale-while-revalidate to bound staleness.
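
The interaction between max-age, s-maxage, private, no-cache, and no-store can be sketched as a small function that answers "how long may this cache reuse the response?" for a browser cache versus a shared cache. This is a minimal illustration, not a complete RFC 9111 implementation: the function names are hypothetical, the parser splits naively on commas (so quoted arguments containing commas are not handled), and a response with no explicit lifetime is treated as always stale, whereas real caches may apply heuristic freshness.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Parse a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(header: str, shared: bool) -> Optional[int]:
    """Seconds the response may be reused without revalidation.

    None: the cache must not store the response at all.
    0:    the response may be stored but must be revalidated before each use.
    """
    d = parse_cache_control(header)
    if "no-store" in d:
        return None                      # nothing may store it
    if shared and "private" in d:
        return None                      # shared caches must not store private responses
    if "no-cache" in d:
        return 0                         # storable, but revalidate every time
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])        # s-maxage wins for shared caches only
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0                             # assumption: no heuristic freshness in this sketch


header = "public, s-maxage=600, max-age=60"
print(freshness_lifetime(header, shared=True))   # CDN view
print(freshness_lifetime(header, shared=False))  # browser view
```

Calling the function twice with the same header, once per cache type, mirrors the two independent questions above: what the browser may do and what the CDN may do.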

Worked Example

GitHub's API demonstrates a sophisticated cache strategy across different resource types:

Public repository data requested with credentials (browser-cacheable only, short TTL):

GET /repos/torvalds/linux
Authorization: Bearer ghp_...

HTTP/1.1 200 OK
Cache-Control: private, max-age=60
ETag: "abc123def456"
Last-Modified: Fri, 10 Apr 2026 12:00:00 GMT
Vary: Accept, Authorization
Content-Type: application/vnd.github.v3+json

The private directive prevents CDN caching for authenticated responses. A shared cache must not store a private response, so s-maxage has no effect alongside private and should not be combined with it (RFC 9111 §5.2.2.7). The max-age=60 allows browser caching for 60 seconds.

Conditional revalidation (client uses stored ETag):

GET /repos/torvalds/linux
If-None-Match: "abc123def456"
Authorization: Bearer ghp_...

HTTP/1.1 304 Not Modified
ETag: "abc123def456"
Cache-Control: private, max-age=60

304 response: no body transmitted. The browser uses its cached copy. This reduces data transfer for polling clients.
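
The server side of this exchange can be sketched as follows. This is an illustrative handler, not GitHub's implementation: make_etag, etag_matches, and respond are hypothetical names, the ETag is assumed to be a SHA-256 hash of the response body, and the If-None-Match parsing splits naively on commas, which is safe for hex-digest tags but not for arbitrary ETag values.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Strong ETag: the quoted SHA-256 hex digest of the exact response bytes."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def etag_matches(if_none_match: str, etag: str) -> bool:
    """If-None-Match uses weak comparison: a W/ prefix is ignored on both sides."""
    if if_none_match.strip() == "*":
        return True

    def opaque(tag: str) -> str:
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    return any(opaque(candidate) == opaque(etag) for candidate in if_none_match.split(","))


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "private, max-age=60"}
    if if_none_match is not None and etag_matches(if_none_match, etag):
        return 304, headers, b""         # validator still current: send no body
    return 200, headers, body


first_status, first_headers, _ = respond(b'{"name": "linux"}', None)
second_status, _, second_payload = respond(b'{"name": "linux"}', first_headers["ETag"])
print(first_status, second_status, len(second_payload))
```

Because the tag is derived from the body, any change to the body changes the tag, and a client holding the old tag falls through to the 200 branch.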

Static release asset (immutable, long TTL):

GET /releases/download/v6.8/linux-6.8.tar.gz

HTTP/1.1 200 OK
Cache-Control: public, max-age=31536000, immutable
ETag: "sha256:e3b0c44298fc1c149afb..."
Content-Type: application/octet-stream

Immutable assets with 1-year TTL: never revalidated while fresh. URL changes when content changes.
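
The release URL above changes because it carries a version number. For build assets, a common alternative is to derive the URL from the content itself. The sketch below shows one way to do that; fingerprinted_path is a hypothetical helper, and the 12-character SHA-256 prefix is an arbitrary choice for illustration.

```python
import hashlib
from pathlib import PurePosixPath

# Header value taken from the example above; safe only because the URL
# is guaranteed to change whenever the bytes change.
IMMUTABLE = "public, max-age=31536000, immutable"


def fingerprinted_path(path: str, content: bytes, digest_len: int = 12) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js"""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = fingerprinted_path("/static/app.js", b"console.log(1)")
new_url = fingerprinted_path("/static/app.js", b"console.log(2)")
print(old_url != new_url)  # different bytes produce a different URL
```

With this scheme a deploy never needs to purge the old asset: pages that reference the new URL simply miss the cache once, and the old entry ages out on its own.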

Write operation followed by cache invalidation signal:

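PATCH /repos/torvalds/linux
Content-Type: application/json

{ "description": "Updated description" }

HTTP/1.1 200 OK
Cache-Control: no-cache
ETag: "newetag789"

The no-cache directive on the write response tells any cache that stores it to revalidate before reusing it. The ETag has changed, so a later conditional GET carrying the old ETag no longer matches and receives a full 200 OK with the new body instead of 304 Not Modified. A cache that forwards the successful PATCH also invalidates its stored responses for that URI (RFC 9111 §4.4); caches that never see the write keep serving their copy until it expires or is purged.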

Anti-Patterns

  1. Using no-store everywhere as a "safe default." no-store prevents all caching, including the browser's back-button cache, which degrades user experience. Most API responses are not sensitive enough to warrant no-store. Use no-cache for data that should always be fresh but is not secret; use private, max-age=N for user-specific data; and reserve no-store for responses containing session tokens, passwords, or regulated PII.

  2. Omitting the Vary header on negotiated responses. Without Vary: Accept on the response, a CDN can serve a cached JSON body to a client requesting CSV. Without Vary: Accept-Encoding, it can serve a cached gzip body to a client that cannot decompress gzip. Always set Vary to include every request header used in selecting the response.

  3. Generating weak or non-unique ETags. ETags generated from Last-Modified timestamps with second-level granularity will return stale 304 Not Modified responses if the resource changes more than once per second. Use content hashes (MD5 or SHA-256 of the response body) for strong ETags. Avoid ETags based on database updated_at timestamps alone.

  4. Using long TTLs on mutable resources without a revalidation strategy. Cache-Control: max-age=3600 on an order status endpoint can keep serving the old status for up to an hour, even if the order ships 5 minutes after the first fetch. An ETag alone does not help here, because a cache does not revalidate while the response is still fresh. Either keep max-age short (60-300s) for frequently changing resources, use no-cache with ETags so that every reuse is revalidated, or use stale-while-revalidate with background refresh.
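
The Vary anti-pattern is easiest to see from the cache's point of view. The sketch below builds a lookup key the way a shared cache conceptually does: the method and URL, plus the value of every request header named in Vary. It is a simplified model (cache_key is a hypothetical helper, and real caches normalize header values more carefully), but it shows why a missing Vary makes two different representations collide.

```python
from typing import Dict, Tuple


def cache_key(method: str, url: str, request_headers: Dict[str, str], vary: str) -> Tuple:
    """Build a cache key from the method, the URL, and the request headers listed in Vary."""
    req = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    if "*" in names:
        # Vary: * means no stored response can be selected without asking the origin.
        raise ValueError("Vary: * responses cannot be matched from cache")
    selected = tuple((n, req.get(n, "").strip()) for n in names)
    return (method.upper(), url, selected)


json_req = {"Accept": "application/json"}
csv_req = {"Accept": "text/csv"}

# With Vary: Accept the two requests get separate cache entries.
print(cache_key("GET", "/report", json_req, "Accept") != cache_key("GET", "/report", csv_req, "Accept"))

# Without Vary both requests map to the same entry, so the CSV client can receive JSON.
print(cache_key("GET", "/report", json_req, "") == cache_key("GET", "/report", csv_req, ""))
```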

Details

Cache-Control Directive Decision Tree

Is the response user-specific (contains PII or auth-scoped data)?
  Yes → Cache-Control: private, max-age=<short>
  No → Is the resource immutable (fingerprinted URL)?
    Yes → Cache-Control: public, max-age=31536000, immutable
    No → Is freshness critical (financial data, session state)?
      Yes → Cache-Control: no-cache (ETag-based revalidation)
      No → Cache-Control: public, s-maxage=<CDN TTL>, max-age=<browser TTL>
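
The decision tree above translates directly into a function. The sketch below is one possible encoding; choose_cache_control and its default TTLs are illustrative assumptions, since the tree only gives placeholders such as <short> and <CDN TTL>.

```python
def choose_cache_control(
    user_specific: bool,
    immutable: bool,
    freshness_critical: bool,
    cdn_ttl: int = 300,      # assumed default for <CDN TTL>
    browser_ttl: int = 60,   # assumed default for <browser TTL>
    private_ttl: int = 60,   # assumed default for <short>
) -> str:
    """Walk the decision tree top to bottom and return a Cache-Control value."""
    if user_specific:
        return f"private, max-age={private_ttl}"
    if immutable:
        return "public, max-age=31536000, immutable"
    if freshness_critical:
        return "no-cache"  # relies on ETag-based revalidation
    return f"public, s-maxage={cdn_ttl}, max-age={browser_ttl}"


print(choose_cache_control(user_specific=False, immutable=False, freshness_critical=False))
```

The order of the checks matters: a user-specific response stays private even if its URL happens to be fingerprinted.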

stale-while-revalidate

The stale-while-revalidate directive allows a cache to serve a stale response immediately while revalidating asynchronously. This hides revalidation latency from users:

Cache-Control: max-age=60, stale-while-revalidate=300

The resource is fresh for 60 seconds. Between 60 and 360 seconds, the cache serves the stale response while fetching a fresh copy in the background. After 360 seconds, the cache must revalidate before serving.
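
The three time windows can be written as a small classifier over the response's age. This is a sketch of the timing rule only; cache_state and its return labels are hypothetical, and it does not perform any actual background fetch.

```python
def cache_state(age: int, max_age: int, stale_while_revalidate: int = 0) -> str:
    """Classify a cached response by its age in seconds.

    "fresh":            serve from cache, no origin contact.
    "serve-stale":      serve the stale copy now and revalidate in the background.
    "revalidate-first": revalidate with the origin before serving.
    """
    if age < max_age:
        return "fresh"
    if age < max_age + stale_while_revalidate:
        return "serve-stale"
    return "revalidate-first"


# Cache-Control: max-age=60, stale-while-revalidate=300
for age in (30, 60, 359, 360):
    print(age, cache_state(age, max_age=60, stale_while_revalidate=300))
```

Without the directive (a window of 0), the middle state disappears and every stale response is revalidated before it is served.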

Real-World Case Study: Fastly CDN at Stripe

Stripe's public API documentation and static asset infrastructure use aggressive caching, with URL-based versioning for immutable assets and stale-while-revalidate for API reference pages. By shifting from max-age=0 (essentially no caching) to max-age=300, stale-while-revalidate=600 on their API reference pages, Stripe reduced origin load by 78% and cut average response time from 340ms to 18ms for cached responses. The key insight: no-cache was being misused as "do not cache". Replacing it with max-age=300 plus ETag revalidation preserved freshness guarantees while enabling CDN acceleration.

Process

  1. Classify the resource: user-specific (private), shared/public (CDN-eligible), or immutable (fingerprinted).
  2. Set Cache-Control directives: use s-maxage for CDN TTL, max-age for browser TTL, private for user-specific responses, and no-store only for sensitive secrets.
  3. Generate ETags: use a content hash of the response body for strong ETags. Include Last-Modified as a fallback.
  4. Set Vary to include all request headers used in response selection (Accept, Accept-Encoding, Accept-Language).
  5. Run harness validate to confirm skill files are well-formed.

Harness Integration

  • Type: knowledge -- this skill is a reference document, not a procedural workflow.
  • No tools or state -- consumed as context by other skills and agents.
  • related_skills: api-conditional-requests, api-content-negotiation, perf-cdn-cache-control, perf-http2-multiplexing

Success Criteria

  • Every cacheable response has an explicit Cache-Control header with appropriate max-age or s-maxage values.
  • ETags are generated from content hashes, not timestamps, and change whenever the response body changes.
  • Vary headers include all request headers used in content negotiation.
  • no-store is used only for responses containing credentials or regulated PII, not as a default.
  • Write operations (PUT/PATCH/DELETE) invalidate or shorten TTLs for affected cache entries via CDN purge or short max-age.