HTTP Caching
HTTP caching is a first-class performance mechanism built into the HTTP protocol. Correct Cache-Control directives and ETag generation can cut redundant network round-trips and origin load by orders of magnitude. Misconfigured caching either serves stale data or prevents caching entirely, wasting infrastructure and adding latency.
When to Use
- Setting cache headers on a new API endpoint for the first time
- Diagnosing why a CDN is not caching expected responses
- Implementing ETag generation for a resource that clients frequently poll
- Deciding between `max-age`, `s-maxage`, and `no-cache` for a given endpoint
- Configuring cache invalidation after a write operation
- Reviewing a PR that adds `Cache-Control: no-store` to every response
- Designing the `Vary` header strategy for an endpoint that uses content negotiation
- Understanding how CDN behavior differs from browser cache behavior
Instructions
Key Concepts
- Cache-Control Directives: The primary mechanism for controlling caching behavior. Applied to both requests and responses. The most important response directives:
  - `max-age=N`: The response is fresh for N seconds from the response time. Applies to all caches (browser and shared/CDN).
  - `s-maxage=N`: Overrides `max-age` for shared caches (CDNs, proxies) only. Browser cache still uses `max-age`.
  - `no-cache`: The response may be stored but must be revalidated with the origin before each use. Not "do not cache"; it forces revalidation.
  - `no-store`: The response must not be stored anywhere. Use only for truly sensitive data (session tokens, PII). This is the only directive that truly prevents caching.
  - `private`: The response may only be cached by the browser (not CDNs or proxies). Use for user-specific responses.
  - `public`: The response may be cached by any cache, including shared CDN caches, even when the request included an `Authorization` header.
  - `immutable`: Tells the browser the response will never change while fresh; skip revalidation entirely (useful for fingerprinted assets).
  - `stale-while-revalidate=N`: Serve stale content for up to N seconds while revalidating in the background. Improves perceived latency.
- ETag (Entity Tag): A validator representing the current version of a resource. Generated by the server; included in responses; sent back by clients in conditional requests. Strong ETags (`"abc123"`) guarantee byte-for-byte identity. Weak ETags (`W/"abc123"`) indicate semantic equivalence (content equivalent but possibly different encoding). ETags enable cache revalidation: the cache sends the stored ETag and receives either `304 Not Modified` (cache is current) or `200 OK` with a new body.

      HTTP/1.1 200 OK
      ETag: "d41d8cd98f00b204e9800998ecf8427e"
      Cache-Control: max-age=300

- Last-Modified Header: A weaker validator using timestamps. Less precise than ETags (1-second granularity) but simpler to generate. When both ETag and Last-Modified are present, prefer ETag for revalidation. `Last-Modified` is revalidated with `If-Modified-Since`; ETags are revalidated with `If-None-Match` (see `api-conditional-requests`).
- Vary Header: Tells caches to store separate cache entries for responses that differ by specified request headers. Critical for content-negotiated responses. Without `Vary`, a CDN may serve a cached JSON response to a client requesting XML.

      Vary: Accept, Accept-Encoding, Authorization

  Caution: `Vary: Authorization` effectively disables CDN caching because most CDNs will not cache responses that vary by `Authorization`. For authenticated but shared resources (e.g., public API data requiring auth), use `s-maxage` and `Cache-Control: public` to allow CDN caching regardless of the `Authorization` header, but only when the data is truly the same for all authenticated users.
- CDN vs. Browser Cache: Browser caches are private (per-user). CDN/proxy caches are shared. `s-maxage` controls shared cache TTL. The `private` directive prevents CDN caching entirely. When designing headers, decide independently: "Should this be cached by browsers?" and "Should this be cached by CDNs?"
- Cache Invalidation: When a resource changes, caches serving the old version must be invalidated. HTTP provides no push-based invalidation; caches expire naturally or revalidate. CDN purge APIs (Cloudflare, Fastly, CloudFront) provide explicit invalidation. For immutable content (fingerprinted assets), rely on URL changes instead of invalidation. For mutable API resources, keep `max-age` low or use `stale-while-revalidate` to bound staleness.
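The difference between what a browser cache and a shared cache do with the same header can be sketched in a few lines. This is a minimal illustration, not a full RFC 9111 implementation: the function names `parse_cache_control` and `freshness_lifetime` are hypothetical, quoted directive arguments containing commas are not handled, and heuristic freshness is ignored.

```python
from typing import Optional


def parse_cache_control(value: str) -> dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if arg else None
    return directives


def freshness_lifetime(value: str, shared: bool) -> Optional[int]:
    """Return the freshness lifetime in seconds, or None if the cache must not store.

    shared=True models a CDN or proxy; shared=False models a browser cache.
    """
    d = parse_cache_control(value)
    if "no-store" in d:
        return None                  # no cache may store the response
    if shared and "private" in d:
        return None                  # shared caches must not store it
    if "no-cache" in d:
        return 0                     # storable, but revalidate before every use
    if shared and "s-maxage" in d:
        return int(d["s-maxage"])    # s-maxage overrides max-age for shared caches
    if "max-age" in d:
        return int(d["max-age"])
    return 0                         # simplification: no explicit lifetime means revalidate
```

For example, `freshness_lifetime("public, s-maxage=600, max-age=60", shared=True)` yields 600 while the same header with `shared=False` yields 60, which is the "decide independently for browsers and CDNs" point in code form.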
Worked Example
GitHub's API demonstrates a sophisticated cache strategy across different resource types:
Public repository data, authenticated request (browser-cacheable only, short TTL):

    GET /repos/torvalds/linux
    Authorization: Bearer ghp_...

    HTTP/1.1 200 OK
    Cache-Control: private, max-age=60
    ETag: "abc123def456"
    Last-Modified: Fri, 10 Apr 2026 12:00:00 GMT
    Vary: Accept, Authorization
    Content-Type: application/vnd.github.v3+json

The `private` directive prevents CDN caching for authenticated responses (`private` overrides `s-maxage` per RFC 9111 §5.2.2.7, so `s-maxage` should not be combined with `private`). The `max-age=60` allows browser caching for 60 seconds.
Conditional revalidation (client uses stored ETag):
    GET /repos/torvalds/linux
    If-None-Match: "abc123def456"
    Authorization: Bearer ghp_...

    HTTP/1.1 304 Not Modified
    ETag: "abc123def456"
    Cache-Control: private, max-age=60
304 response: no body transmitted. The browser uses its cached copy. This reduces data transfer for polling clients.
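The server side of this exchange can be sketched as follows. This is an illustrative sketch under stated assumptions: `make_etag`, `etag_matches`, and `respond` are hypothetical names, headers are passed as a plain dict with exact-case keys, and the `Cache-Control` value is simply copied from the example above. The comparison splits `If-None-Match` on commas, which is adequate for hex-digest ETags but not for opaque tags that themselves contain commas.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Strong ETag: the quoted SHA-256 hex digest of the exact response bytes."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def etag_matches(if_none_match: str, etag: str) -> bool:
    """Weak comparison, as If-None-Match requires: any W/ prefix is ignored."""
    if if_none_match.strip() == "*":
        return True

    def opaque(tag: str) -> str:
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    return any(opaque(c) == opaque(etag) for c in if_none_match.split(","))


def respond(body: bytes, request_headers: dict[str, str]) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, payload), answering 304 when the client's copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "private, max-age=60"}
    candidate = request_headers.get("If-None-Match")
    if candidate is not None and etag_matches(candidate, etag):
        return 304, headers, b""     # validator matches: send no body
    return 200, headers, body        # first fetch, or the resource changed
```

A polling client that echoes the previous `ETag` back in `If-None-Match` receives the empty 304 payload until the body bytes change, at which point the hash differs and a full 200 is returned.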
Static release asset (immutable, long TTL):
    GET /releases/download/v6.8/linux-6.8.tar.gz

    HTTP/1.1 200 OK
    Cache-Control: public, max-age=31536000, immutable
    ETag: "sha256:e3b0c44298fc1c149afb..."
    Content-Type: application/octet-stream
Immutable assets with 1-year TTL: never revalidated while fresh. URL changes when content changes.
Write operation followed by cache invalidation signal:
    PATCH /repos/torvalds/linux
    Content-Type: application/json

    { "description": "Updated description" }

    HTTP/1.1 200 OK
    Cache-Control: no-cache
    ETag: "newetag789"

`no-cache` on the write response means that any cache storing it must revalidate before reusing it. Because the ETag has changed, a cache that later revalidates a stored copy carrying the old ETag receives a full `200 OK` with the new body instead of a `304 Not Modified`.
Anti-Patterns
- Using `no-store` everywhere as a "safe default." `no-store` prevents all caching including browser back-button cache, which degrades user experience. Most API responses are not sensitive enough to warrant `no-store`. Use `no-cache` for data that should always be fresh but not secret, `private, max-age=N` for user-specific data, and `no-store` only for responses containing session tokens, passwords, or regulated PII.
- Omitting the Vary header on negotiated responses. A CDN without `Vary: Accept` will serve a cached JSON body to a client requesting CSV. A CDN without `Vary: Accept-Encoding` will serve a cached gzip body to a client that cannot decompress gzip. Always set `Vary` to include every request header used in selecting the response.
- Generating weak or non-unique ETags. ETags generated from `Last-Modified` timestamps with second-level granularity will return stale `304 Not Modified` responses if the resource changes more than once per second. Use content hashes (MD5, SHA-256 of the response body) for strong ETags. Avoid ETags based on database `updated_at` timestamps alone.
- Using long TTLs on mutable resources without ETags. `Cache-Control: max-age=3600` on an order status endpoint will serve a 1-hour-stale response for an order that ships 5 minutes after the first fetch. Either keep `max-age` short (60-300s) for frequently changing resources, add ETags for revalidation, or use `stale-while-revalidate` with background refresh.
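The missing-`Vary` anti-pattern is easiest to see by looking at how a cache might build its lookup key. The sketch below is a toy model, not how any particular CDN works: `cache_key` is a hypothetical helper, header values are compared as raw strings without normalization, and `Vary: *` is not handled.

```python
def cache_key(method: str, url: str, request_headers: dict[str, str], vary: str) -> tuple:
    """Build a cache key from the URL plus every request header named in Vary."""
    normalized = {k.lower(): v.strip() for k, v in request_headers.items()}
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    # Each header listed in Vary becomes part of the key, so differing values
    # produce separate cache entries.
    varied = tuple((name, normalized.get(name, "")) for name in names)
    return (method.upper(), url, varied)


json_request = {"Accept": "application/json"}
csv_request = {"Accept": "text/csv"}

# Without Vary, both requests map to one entry: the CSV client gets the JSON body.
collide = cache_key("GET", "/report", json_request, "") == cache_key("GET", "/report", csv_request, "")

# With Vary: Accept, the two representations are stored separately.
separate = cache_key("GET", "/report", json_request, "Accept") != cache_key("GET", "/report", csv_request, "Accept")
```

The same model shows why `Vary: Authorization` fragments a shared cache: every distinct credential value yields its own key.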
Details
Cache-Control Directive Decision Tree
    Is the response user-specific (contains PII or auth-scoped data)?
      Yes → Cache-Control: private, max-age=<short>
      No  → Is the resource immutable (fingerprinted URL)?
        Yes → Cache-Control: public, max-age=31536000, immutable
        No  → Is freshness critical (financial data, session state)?
          Yes → Cache-Control: no-cache (ETag-based revalidation)
          No  → Cache-Control: public, s-maxage=<CDN TTL>, max-age=<browser TTL>
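The decision tree above translates directly into a small function. This is a sketch for illustration: the name `choose_cache_control`, its parameters, and the default TTLs of 60 and 300 seconds are assumptions rather than values prescribed by this document.

```python
def choose_cache_control(
    user_specific: bool,
    immutable: bool = False,
    freshness_critical: bool = False,
    browser_ttl: int = 60,   # assumed default, in seconds
    cdn_ttl: int = 300,      # assumed default, in seconds
) -> str:
    """Walk the decision tree top to bottom and return a Cache-Control value."""
    if user_specific:
        return f"private, max-age={browser_ttl}"
    if immutable:
        return "public, max-age=31536000, immutable"
    if freshness_critical:
        return "no-cache"    # pair with an ETag so revalidation is cheap
    return f"public, s-maxage={cdn_ttl}, max-age={browser_ttl}"
```

The branch order matters: user-specific data is checked first so that a per-user response can never fall through to a `public` policy.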
stale-while-revalidate
The `stale-while-revalidate` directive allows a cache to serve a stale response immediately while revalidating asynchronously. This hides revalidation latency from users:
Cache-Control: max-age=60, stale-while-revalidate=300
The resource is fresh for 60 seconds. Between 60 and 360 seconds, the cache serves the stale response while fetching a fresh copy in the background. After 360 seconds, the cache must revalidate before serving.
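The three time windows can be expressed as a function of the response's age. This is a minimal sketch; `swr_state` and its return labels are hypothetical names chosen for illustration, and a real cache would also account for directives such as `must-revalidate`.

```python
def swr_state(age: int, max_age: int, stale_while_revalidate: int) -> str:
    """Classify a cached response by its age in seconds."""
    if age < max_age:
        return "fresh"                        # serve from cache, no origin contact
    if age < max_age + stale_while_revalidate:
        return "stale-serve-and-revalidate"   # serve stale now, refresh in background
    return "must-revalidate-first"            # too old: contact origin before serving
```

With `max_age=60` and `stale_while_revalidate=300`, an age of 30 is fresh, an age of 200 is served stale while a background refresh runs, and an age of 360 or more blocks on revalidation.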
Real-World Case Study: Fastly CDN at Stripe
Stripe's public API documentation and static asset infrastructure use aggressive caching with URL-based versioning for immutable assets and `stale-while-revalidate` for API reference pages. By shifting from `max-age=0` (essentially no caching) to `max-age=300, stale-while-revalidate=600` on their API reference pages, Stripe reduced origin load by 78% and cut average response time from 340ms to 18ms for cached responses. The key insight: `no-cache` was being misused as "do not cache". Replacing it with `max-age=300` with ETag revalidation preserved freshness guarantees while enabling CDN acceleration.
Source
- MDN — HTTP Caching
- RFC 9111 — HTTP Caching
- RFC 9110 — HTTP Semantics (ETag, Vary)
- MDN — Cache-Control
- web.dev — HTTP cache best practices
Process
- Classify the resource: user-specific (private), shared/public (CDN-eligible), or immutable (fingerprinted).
- Set `Cache-Control` directives: use `s-maxage` for CDN TTL, `max-age` for browser TTL, `private` for user-specific responses, `no-store` only for sensitive secrets.
- Generate ETags: use a content hash of the response body for strong ETags. Include `Last-Modified` as a fallback.
- Set `Vary` to include all request headers used in response selection (Accept, Accept-Encoding, Accept-Language).
- Run `harness validate` to confirm skill files are well-formed.
Harness Integration
- Type: knowledge -- this skill is a reference document, not a procedural workflow.
- No tools or state -- consumed as context by other skills and agents.
- related_skills: api-conditional-requests, api-content-negotiation, perf-cdn-cache-control, perf-http2-multiplexing
Success Criteria
- Every cacheable response has an explicit `Cache-Control` header with appropriate `max-age` or `s-maxage` values.
- ETags are generated from content hashes, not timestamps, and change whenever the response body changes.
- `Vary` headers include all request headers used in content negotiation.
- `no-store` is used only for responses containing credentials or regulated PII, not as a default.
- Write operations (PUT/PATCH/DELETE) invalidate or shorten TTLs for affected cache entries via CDN purge or short `max-age`.
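Some of these criteria can be checked mechanically in a test suite. The sketch below is one possible approach, not part of the harness tooling: `lint_cache_headers` and its messages are hypothetical, and it covers only the header-level criteria (it cannot tell whether an ETag was derived from a content hash, or whether `no-store` is justified).

```python
def lint_cache_headers(headers: dict[str, str], negotiated_on: tuple[str, ...] = ()) -> list[str]:
    """Return a list of problems found in a response's caching headers."""
    h = {k.lower(): v for k, v in headers.items()}
    problems: list[str] = []

    # Collect directive names, dropping any "=value" argument.
    directives = {
        d.strip().split("=")[0].lower()
        for d in h.get("cache-control", "").split(",")
        if d.strip()
    }
    if not directives:
        problems.append("missing Cache-Control")
    elif not directives & {"max-age", "s-maxage", "no-cache", "no-store"}:
        problems.append("Cache-Control has no explicit lifetime")

    # A storable response should carry a validator for revalidation.
    if "no-store" not in directives and "etag" not in h:
        problems.append("missing ETag")

    # Every request header used to select the response must appear in Vary.
    varied = {v.strip().lower() for v in h.get("vary", "").split(",") if v.strip()}
    for name in negotiated_on:
        if name.lower() not in varied:
            problems.append(f"Vary is missing {name}")
    return problems
```

Called with a response's headers and the request headers the endpoint negotiates on, it returns an empty list when the header-level criteria are met.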