API Caching Strategies
Optimize performance with HTTP cache headers, ETags, and CDN caching
Why Cache APIs?
Faster Responses
Cached responses skip the round trip to the origin server, which can cut latency from hundreds of milliseconds (or seconds for slow endpoints) to a few milliseconds.
Reduce Server Load
Cache hits bypass your application server entirely, allowing you to handle more users with the same infrastructure.
Lower Costs
Fewer server requests mean lower compute costs, fewer database queries, and less bandwidth usage.
Better User Experience
Fast API responses lead to snappier applications, improving user satisfaction and engagement.
Cache-Control Header
The Cache-Control header is the primary mechanism for controlling caching behavior in HTTP/1.1 and beyond.
Public Caching (CDN & Browser)
Cache-Control: public, max-age=3600
Allows caching by CDNs, proxies, and browsers. Response stays fresh for 1 hour (3600 seconds).
Private Caching (Browser Only)
Cache-Control: private, max-age=600
Only the user's browser can cache this. Use for user-specific data that shouldn't be shared.
Revalidation Required
Cache-Control: no-cache
Cache can store the response but must revalidate with the server before using it. Not the same as "don't cache"!
No Caching At All
Cache-Control: no-store
Never store the response anywhere. Use for sensitive data like banking or health information.
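The four header values above can also be assembled in code. This is a minimal sketch, assuming a hypothetical helper named build_cache_control that is not part of any library:

```python
from typing import Optional


def build_cache_control(
    visibility: Optional[str] = None,  # "public" or "private"
    max_age: Optional[int] = None,     # freshness lifetime in seconds
    no_cache: bool = False,            # store, but revalidate before reuse
    no_store: bool = False,            # never store anywhere
) -> str:
    """Assemble a Cache-Control header value (hypothetical helper)."""
    if no_store:
        # no-store wins over everything else: nothing may be stored.
        return "no-store"
    if visibility not in (None, "public", "private"):
        raise ValueError(f"unknown visibility: {visibility!r}")
    if max_age is not None and max_age < 0:
        raise ValueError("max_age must be non-negative")

    parts = []
    if visibility:
        parts.append(visibility)
    if no_cache:
        parts.append("no-cache")
    if max_age is not None:
        parts.append(f"max-age={max_age}")
    return ", ".join(parts)
```

For example, build_cache_control("public", 3600) gives "public, max-age=3600" and build_cache_control(no_cache=True) gives "no-cache".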
Cache-Control Directives
Cacheability: public, private, no-cache, no-store
Expiration: max-age, s-maxage, stale-while-revalidate
ETag & If-None-Match
ETags provide a fingerprint of the response content, enabling efficient cache validation without transferring the entire response.
How ETags Work
- Server generates a unique ETag for the response content
- Client stores the ETag alongside the cached response
- On subsequent requests, client sends If-None-Match: "etag"
- If content unchanged, server returns 304 Not Modified
- Client uses cached response, saving bandwidth
HTTP/1.1 200 OK
ETag: "a1b2c3d4e5f6"
Content-Type: application/json
{
"id": 123,
"name": "Widget",
"price": 29.99
}
Conditional Request (Content Unchanged)
If-None-Match: "a1b2c3d4e5f6"
HTTP/1.1 304 Not Modified
ETag: "a1b2c3d4e5f6"
(no body - use cached response)
Conditional Request (Content Changed)
If-None-Match: "a1b2c3d4e5f6"
HTTP/1.1 200 OK
ETag: "newetag789"
Content-Type: application/json
{
"id": 123,
"name": "Widget Pro",
"price": 39.99
}
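On the server side, the exchange above might be handled roughly as follows. This is a sketch only: make_etag and respond are hypothetical names, and hashing the body with SHA-256 is one assumed way to derive an ETag, not the only one.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Strong ETag: a quoted, truncated hash of the exact response bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _opaque(tag: str) -> str:
    """Drop surrounding whitespace and any W/ prefix for comparison."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of a JSON resource."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # Simplification: split on commas, which is safe for the
        # hex-only ETags generated by make_etag.
        candidates = {_opaque(t) for t in if_none_match.split(",")}
        if if_none_match.strip() == "*" or etag in candidates:
            # Content unchanged: no body, the client reuses its cached copy.
            return 304, headers, b""
    headers["Content-Type"] = "application/json"
    return 200, headers, body
```

Calling respond(body, None) yields a 200 with an ETag; passing that ETag back as if_none_match yields a 304 with an empty payload until the body changes.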
💪 Strong ETags
"a1b2c3d4"
Byte-for-byte identical. Use for exact content matching.
〰️ Weak ETags
W/"a1b2c3d4"
Semantically equivalent. Use when minor changes (whitespace, formatting) don't matter.
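The difference shows up when two ETags are compared. A small sketch of strong and weak comparison as HTTP defines them (the function names are illustrative):

```python
def is_weak(etag: str) -> bool:
    """Weak ETags carry a W/ prefix before the quoted value."""
    return etag.startswith("W/")


def opaque_tag(etag: str) -> str:
    """The quoted part of the ETag, without any W/ prefix."""
    return etag[2:] if is_weak(etag) else etag


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both must be strong and identical."""
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: the quoted values match, W/ prefixes are ignored."""
    return opaque_tag(a) == opaque_tag(b)
```

If-None-Match uses weak comparison, so W/"a1b2c3d4" and "a1b2c3d4" count as a match there. Requests that need byte-exact content, such as range requests, rely on strong comparison.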
Last-Modified & If-Modified-Since
Date-based cache validation using timestamps. Simpler than ETags but less precise: HTTP dates have one-second resolution, so two changes within the same second look identical.
How It Works
- Server includes Last-Modified header with response
- Client stores the timestamp with cached response
- On revalidation, client sends If-Modified-Since
- Server compares timestamps and returns 304 or new content
HTTP/1.1 200 OK
Last-Modified: Wed, 12 Feb 2025 08:00:00 GMT
Content-Type: application/json
{
"id": 456,
"title": "Caching Guide",
"updated": "2025-02-12"
}
Conditional Request with Timestamp
If-Modified-Since: Wed, 12 Feb 2025 08:00:00 GMT
HTTP/1.1 304 Not Modified
Last-Modified: Wed, 12 Feb 2025 08:00:00 GMT
(no body)
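The timestamp comparison can be sketched with Python's standard library. The names http_date and is_modified_since are hypothetical, and last_modified is assumed to be a timezone-aware datetime:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def http_date(dt: datetime) -> str:
    """Format an aware datetime as an HTTP date such as a Last-Modified value."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def is_modified_since(last_modified: datetime, if_modified_since: Optional[str]) -> bool:
    """True if the full response should be sent, False if 304 is enough."""
    if not if_modified_since:
        return True
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable header: ignore it and send the full response.
        return True
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    return last_modified.replace(microsecond=0) > since
```

With last_modified set to 12 Feb 2025 08:00:00 UTC, a request carrying If-Modified-Since: Wed, 12 Feb 2025 08:00:00 GMT returns False, so the server can answer 304.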
CDN Caching
Content Delivery Networks cache responses at edge locations worldwide, serving users from the nearest point of presence.
Edge Caching
CDN nodes cache responses at locations near your users. A user in Tokyo gets served from a Tokyo edge, not your origin in the US.
Cache Invalidation
Purge specific URLs or patterns when content changes. Most CDNs offer API endpoints or dashboards for purging; how quickly a purge propagates varies by provider.
Vary Header
Tell CDNs to cache different versions based on request headers like Accept-Language or Accept-Encoding.
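One way to picture Vary is as extra components in the cache key. The sketch below is an illustration of that idea, not how any particular CDN implements it; cache_key is a hypothetical function:

```python
from typing import Mapping, Tuple


def cache_key(method: str, url: str, vary: str, request_headers: Mapping[str, str]) -> Tuple:
    """Build a cache key from the URL plus the request headers named in Vary."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value.strip() for name, value in request_headers.items()}
    varied = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    # A missing header is treated as an empty value.
    return (method.upper(), url) + tuple((name, lowered.get(name, "")) for name in varied)
```

Two requests for the same URL with Accept-Language: en and Accept-Language: ja produce different keys, so each language gets its own cached copy. A response with Vary: * would need separate handling, since it can never be matched from cache without revalidation.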
CDN-Optimized Headers
Cache-Control: public, max-age=60, s-maxage=3600
Vary: Accept-Encoding, Accept-Language
Browser caches for 1 minute, CDN caches for 1 hour. Separate cached versions for different encodings and languages.
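How a shared cache and a browser arrive at different lifetimes from the same header can be sketched as follows. The parser is deliberately simplified and both function names are hypothetical:

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(cache_control: str, shared: bool) -> int:
    """Seconds a cache may reuse the response without contacting the origin."""
    d = parse_cache_control(cache_control)
    if "no-store" in d or "no-cache" in d:
        return 0
    if shared and "private" in d:
        return 0  # shared caches (CDNs, proxies) must not reuse private responses
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])  # s-maxage overrides max-age for shared caches
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0
```

For the header above, freshness_lifetime(..., shared=True) gives 3600 and freshness_lifetime(..., shared=False) gives 60.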
Surrogate-Control (CDN-Specific)
Cache-Control: private, no-cache
Surrogate-Control: max-age=3600
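Browsers must revalidate before every reuse (user-specific data), while a CDN that honors Surrogate-Control caches the response for 1 hour for edge-side personalization. Support for this header varies by CDN.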
What to Cache
Not all API responses are suitable for caching. Here's a guide to help you decide.
GET Requests
GET requests are safe and idempotent - perfect candidates for caching.
GET /api/products
GET /api/categories
Static Data
Configuration, translations, feature flags - data that rarely changes.
GET /api/config
GET /api/i18n/en
Reference Data
Country lists, categories, taxonomies - lookup data that changes infrequently.
GET /api/countries
GET /api/currencies
POST/PUT/DELETE
Write operations should never be cached - they modify server state.
POST /api/orders
DELETE /api/users/123
User-Specific Data
Personal data, preferences, account info - use Cache-Control: private at most, so shared caches never store it.
GET /api/me/profile
GET /api/cart
Real-Time Data
Stock prices, live scores, chat messages - data that changes constantly.
GET /api/stocks/AAPL
GET /api/notifications
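The guide above could be encoded as a small policy function. Everything here is an assumption for illustration: the function name, the path prefixes (taken from the example endpoints), and the TTL values.

```python
# Hypothetical path groups, based on the example endpoints above.
USER_PREFIXES = ("/api/me/", "/api/cart")
REALTIME_PREFIXES = ("/api/stocks/", "/api/notifications")
STATIC_PREFIXES = ("/api/config", "/api/i18n/", "/api/countries", "/api/currencies")


def choose_cache_control(method: str, path: str) -> str:
    """Pick a Cache-Control value for a request (illustrative TTLs)."""
    if method.upper() not in ("GET", "HEAD"):
        return "no-store"                # writes modify server state
    if path.startswith(REALTIME_PREFIXES):
        return "no-store"                # changes constantly
    if path.startswith(USER_PREFIXES):
        return "private, max-age=60"     # browser only, short lifetime
    if path.startswith(STATIC_PREFIXES):
        return "public, max-age=86400"   # rarely changes
    return "public, max-age=300"         # default for other GET requests
```

In a real service these rules would more likely live in route configuration than in prefix matching; the point is only that the decision depends on method and data type.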
Cache Invalidation
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
⏱️ TTL-Based (Time To Live)
Cache-Control: max-age=3600
Cache expires automatically after a set time. Simple but may serve stale data until TTL expires.
Best for: Content that can tolerate slight staleness (product listings, blog posts).
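The same expiry rule can be sketched as a tiny in-memory cache. TTLCache is a hypothetical class written for this illustration; the clock is injectable so the behavior is easy to test.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Minimal in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller refetches.
            del self._entries[key]
            return None
        return value
```

Until the TTL runs out, get keeps returning the stored value even if the underlying data has changed, which is the staleness trade-off described above.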
📣 Event-Based
When content updates, actively purge related cache entries via CDN API or cache-busting.
Best for: Critical content that must be fresh (pricing, inventory, breaking news).
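A sketch of event-based purging, assuming entries are labeled with tags so that one update event can purge every related URL. PurgingCache and its methods are hypothetical; real CDNs expose purging through their own APIs.

```python
from typing import Dict, Iterable, List, Optional, Set


class PurgingCache:
    """In-memory cache whose entries can be purged by tag."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = {}

    def store(self, url: str, body: bytes, tags: Iterable[str]) -> None:
        self._entries[url] = body
        for tag in tags:
            self._urls_by_tag.setdefault(tag, set()).add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._entries.get(url)

    def purge_tag(self, tag: str) -> List[str]:
        """Remove every entry stored under tag; return the URLs removed."""
        urls = self._urls_by_tag.pop(tag, set())
        removed = [u for u in urls if self._entries.pop(u, None) is not None]
        return sorted(removed)
```

When product 123 changes, the update handler would call purge_tag("product-123"), leaving unrelated entries in place.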
🔢 Versioned URLs
/api/v2/products?v=1707730800
Include version or timestamp in URL. New content = new URL = cache miss = fresh data.
Best for: Static assets, API responses that clients can re-fetch with new version.
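Building such a URL is straightforward with the standard library. A minimal sketch, assuming the version lives in a query parameter named v as in the example above (versioned_url is a hypothetical helper):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_url(url: str, version: object) -> str:
    """Return url with its v query parameter set to version."""
    parts = urlsplit(url)
    # Keep all other parameters, drop any existing v, then append the new one.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "v"]
    query.append(("v", str(version)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, versioned_url("/api/v2/products", 1707730800) returns "/api/v2/products?v=1707730800".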
Combining Strategies
# Short TTL + ETag for validation
Cache-Control: public, max-age=60
ETag: "v123-abc"
# Client revalidates after 60s
# If content unchanged: 304 (no data transfer)
# If content changed: 200 with new data + new ETag
Short TTL ensures freshness; ETags prevent unnecessary data transfer when content hasn't changed.
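From the client's side, the combination might look like the sketch below. RevalidatingCache is hypothetical, and origin stands in for a real HTTP call: it is assumed to take a URL and an optional If-None-Match value and return (status, etag, body).

```python
from typing import Callable, Dict, Optional, Tuple

# Assumed shape of the origin call: (url, if_none_match) -> (status, etag, body)
Origin = Callable[[str, Optional[str]], Tuple[int, str, bytes]]


class RevalidatingCache:
    """Serve fresh entries locally; revalidate stale ones with their ETag."""

    def __init__(self, origin: Origin, max_age: float, clock: Callable[[], float]):
        self._origin = origin
        self._max_age = max_age
        self._clock = clock
        # url -> (fresh_until, etag, body)
        self._entries: Dict[str, Tuple[float, str, bytes]] = {}

    def fetch(self, url: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now < entry[0]:
            return entry[2]  # still fresh: no request at all
        etag = entry[1] if entry is not None else None
        status, new_etag, body = self._origin(url, etag)
        if status == 304 and entry is not None:
            body = entry[2]  # unchanged: reuse the stored body
        # Either way, the entry is fresh again for max_age seconds.
        self._entries[url] = (now + self._max_age, new_etag, body)
        return body
```

Within the TTL no request is made; after it, the client pays for one small conditional request and only downloads the body when the ETag has changed.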
Stale-While-Revalidate Pattern
Cache-Control: public, max-age=60, stale-while-revalidate=3600
Once the 60-second freshness window ends, the cache may keep serving the stale response for up to 3600 more seconds while it fetches a fresh copy in the background. Best of both worlds: speed + freshness.
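The decision a cache makes under this header can be sketched as a function of the response's age. The function name and the three state labels are made up for this illustration:

```python
def swr_state(age: float, max_age: float, stale_while_revalidate: float) -> str:
    """Classify a cached response by its age in seconds."""
    if age < max_age:
        return "fresh"             # serve from cache, no request
    if age < max_age + stale_while_revalidate:
        return "stale-revalidate"  # serve from cache, refresh in background
    return "expired"               # too old: fetch before responding
```

With max-age=60 and stale-while-revalidate=3600, a response aged 30 seconds is "fresh", one aged 600 seconds is "stale-revalidate", and one aged 3660 seconds or more is "expired".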