Caching Strategies That Cut Response Times by 90%: A Practical Web Developer Guide
Your database is melting. Every page load triggers 20 queries. Response times hover around 800ms on a good day and spike to 3 seconds during traffic bursts. Your infrastructure costs are climbing as you scale up database instances. Sound familiar?
Then you implement caching. Suddenly:
- Database queries drop by 95%
- Response times plummet to 50ms
- Your servers handle 10x the traffic
- Infrastructure costs decrease
Caching is often called "the closest thing to magic in computer science"—it's one of the few optimization techniques that can deliver 10-100x performance improvements with relatively straightforward implementation. But caching isn't just "add Redis and hope for the best." The wrong caching strategy can make things worse, serving stale data, introducing race conditions, or consuming memory without providing benefits.
This guide covers battle-tested caching strategies for modern web applications, from browser caching to distributed Redis patterns, with practical code examples and decision frameworks to choose the right approach for your use case.
The Caching Hierarchy
Modern web applications have multiple caching layers:
graph TD
A[User Request] --> B{Browser Cache}
B -->|Miss| C{CDN Cache}
C -->|Miss| D{Application Cache<br/>Redis/Memory}
D -->|Miss| E{Database Query Cache}
E -->|Miss| F[Database]
B -->|Hit| G[Return Cached]
C -->|Hit| G
D -->|Hit| G
E -->|Hit| G
F --> G
style B fill:#c5e1a5
style C fill:#bbdefb
style D fill:#fff9c4
style E fill:#ffccbc
style F fill:#f8bbd0
Each layer has different characteristics:
| Layer | Speed | Scope | Size Limit | Control | Best For |
|---|---|---|---|---|---|
| Browser Cache | Fastest (0ms) | Per-user | ~100MB | Low | Static assets, public content |
| CDN Cache | Very Fast (< 50ms) | Global | Large | Medium | Static assets, public APIs |
| Application Cache (in-memory) | Fast (< 1ms) | Per-server | Limited by RAM | High | Server-side computations |
| Application Cache (Redis) | Fast (< 5ms) | Shared | Large | High | Session data, computed results |
| Database Query Cache | Medium (10-50ms) | Per-DB | Moderate | Low | Repeated queries |
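The lookup order in this hierarchy can be sketched as a waterfall of checks, fastest layer first. A minimal illustration using in-memory `Map`s as stand-ins for each layer (the layer names and the `loadFromDatabase` helper are hypothetical, not a real API):

```typescript
// Hypothetical sketch: each cache layer modeled as a Map, checked in order.
type Layer = { name: string; store: Map<string, string> };

const layers: Layer[] = [
  { name: 'browser', store: new Map() },
  { name: 'cdn', store: new Map() },
  { name: 'app', store: new Map() },
];

// Stand-in for the real database query.
function loadFromDatabase(key: string): string {
  return `db-value-for-${key}`;
}

function lookup(key: string): { value: string; source: string } {
  // Walk the hierarchy from fastest to slowest.
  for (const layer of layers) {
    const hit = layer.store.get(key);
    if (hit !== undefined) return { value: hit, source: layer.name };
  }
  // Full miss: hit the database, then populate every layer on the way back up.
  const value = loadFromDatabase(key);
  for (const layer of layers) layer.store.set(key, value);
  return { value, source: 'database' };
}
```

The first `lookup` for a key falls all the way through to the database; subsequent lookups are served by the fastest layer that holds it.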
Core Caching Patterns
1. Cache-Aside (Lazy Loading)
The application manages the cache explicitly. On read: check cache, if miss, fetch from database, populate cache.
graph LR
A[Request Data] --> B{Check Cache}
B -->|Hit| C[Return Cached Data]
B -->|Miss| D[Query Database]
D --> E[Store in Cache]
E --> F[Return Data]
style B fill:#c5e1a5
style D fill:#ffccbc
Implementation:
// cache-aside.ts
import { Redis } from 'ioredis';
const redis = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
db: 0,
});
interface User {
id: string;
name: string;
email: string;
}
async function getUserById(userId: string): Promise<User | null> {
const cacheKey = `user:${userId}`;
// 1. Try cache first
const cached = await redis.get(cacheKey);
if (cached) {
console.log('Cache hit');
return JSON.parse(cached);
}
// 2. Cache miss - fetch from database
console.log('Cache miss - fetching from DB');
// (assumes a thin `db` helper that returns a single row or null)
const user = await db.query('SELECT * FROM users WHERE id = $1', [userId]);
if (user) {
// 3. Store in cache with expiration
await redis.setex(cacheKey, 3600, JSON.stringify(user)); // 1 hour TTL
}
return user;
}
// Usage
const user = await getUserById('user_123');
Pros:
- Simple to implement and understand
- Works well for read-heavy workloads
- Cache failures don't break the application
Cons:
- Cache miss penalty (extra latency)
- Potential cache stampede on popular items
- Stale data possible if not invalidated
2. Write-Through Cache
Data is written to cache and database simultaneously. Cache is always consistent with the database.
// write-through.ts
async function updateUser(userId: string, updates: Partial<User>): Promise<User> {
const cacheKey = `user:${userId}`;
// 1. Update database
const updatedUser = await db.query('UPDATE users SET name = $1, email = $2 WHERE id = $3 RETURNING *', [
updates.name,
updates.email,
userId,
]);
// 2. Immediately update cache (or invalidate)
if (updatedUser) {
await redis.setex(cacheKey, 3600, JSON.stringify(updatedUser));
}
return updatedUser;
}
Pros:
- Cache always consistent
- Reduces cache miss rate
Cons:
- Write latency (must write to both)
- Cache pollution (writing data that's never read)
3. Write-Behind (Write-Back) Cache
Writes go to the cache immediately and are persisted to the database asynchronously, maximizing write performance.
// write-behind.ts
import { Queue, Worker } from 'bullmq';
const writeQueue = new Queue('database-writes', {
connection: { host: 'redis', port: 6379 },
});
async function updateUserWriteBehind(userId: string, updates: Partial<User>): Promise<void> {
const cacheKey = `user:${userId}`;
// 1. Update cache immediately
// Note: this read-merge-write against the cache can race under
// concurrent updates to the same key; use a lock if that matters
const currentUser = JSON.parse((await redis.get(cacheKey)) || '{}');
const updatedUser = { ...currentUser, ...updates };
await redis.setex(cacheKey, 3600, JSON.stringify(updatedUser));
// 2. Queue database write (async)
await writeQueue.add('update-user', {
userId,
updates,
timestamp: Date.now(),
});
}
// Background worker persists to database
const worker = new Worker(
'database-writes',
async (job) => {
const { userId, updates } = job.data;
await db.query('UPDATE users SET name = $1, email = $2 WHERE id = $3', [updates.name, updates.email, userId]);
},
{
connection: { host: 'redis', port: 6379 },
},
);
Pros:
- Extremely fast writes
- Can batch database writes
Cons:
- Risk of data loss if cache fails
- Complex to implement correctly
- Eventual consistency
4. Read-Through Cache
Cache sits between application and database. Application only talks to cache; cache handles database fetches.
// read-through.ts
class ReadThroughCache<T> {
constructor(
private redis: Redis,
private loader: (key: string) => Promise<T | null>,
private ttl: number = 3600,
) {}
async get(key: string): Promise<T | null> {
// Check cache
const cached = await this.redis.get(key);
if (cached) {
return JSON.parse(cached);
}
// Cache miss - load from source
const value = await this.loader(key);
if (value) {
// Populate cache
await this.redis.setex(key, this.ttl, JSON.stringify(value));
}
return value;
}
}
// Usage
const userCache = new ReadThroughCache<User>(
redis,
async (userId) => {
return await db.query('SELECT * FROM users WHERE id = $1', [userId]);
},
3600,
);
const user = await userCache.get('user:123');
Advanced Caching Strategies
Cache Warming
Pre-populate cache with frequently accessed data before traffic arrives:
// cache-warming.ts
import cron from 'node-cron';
async function warmPopularUserCache() {
console.log('Starting cache warming...');
// Get top 1000 most active users
const popularUsers = await db.query(`
SELECT user_id, COUNT(*) as activity_count
FROM user_activity
WHERE created_at > NOW() - INTERVAL '24 hours'
GROUP BY user_id
ORDER BY activity_count DESC
LIMIT 1000
`);
// Pre-load into cache
const promises = popularUsers.map(async ({ user_id }) => {
const user = await db.query('SELECT * FROM users WHERE id = $1', [user_id]);
if (user) {
await redis.setex(`user:${user_id}`, 3600, JSON.stringify(user));
}
});
await Promise.all(promises);
console.log(`Warmed cache with ${popularUsers.length} users`);
}
// Run cache warming daily at 5am (before traffic peak)
cron.schedule('0 5 * * *', warmPopularUserCache);
// Also warm on application startup
warmPopularUserCache();
Cache Stampede Prevention
When a popular cache key expires, multiple requests might simultaneously try to refresh it, overwhelming the database.
Solution: Locking and Early Recomputation
// cache-stampede-prevention.ts
async function getWithStampedePrevention<T>(key: string, loader: () => Promise<T>, ttl: number = 3600): Promise<T> {
const lockKey = `lock:${key}`;
const lockTTL = 10; // 10 second lock
// Try to get from cache
const cached = await redis.get(key);
if (cached) {
return JSON.parse(cached);
}
// Acquire lock
const lockAcquired = await redis.set(lockKey, '1', 'EX', lockTTL, 'NX');
if (lockAcquired) {
// We got the lock - we're responsible for loading
try {
const value = await loader();
await redis.setex(key, ttl, JSON.stringify(value));
return value;
} finally {
await redis.del(lockKey);
}
} else {
// Someone else is loading - wait a bit and retry
// (in production, cap the retries to avoid unbounded recursion)
await new Promise((resolve) => setTimeout(resolve, 100));
return getWithStampedePrevention(key, loader, ttl);
}
}
}
// Usage
const user = await getWithStampedePrevention(
'user:123',
() => db.query('SELECT * FROM users WHERE id = $1', ['123']),
3600,
);
Probabilistic Early Expiration
Refresh cache before it expires for popular items:
// probabilistic-early-refresh.ts
async function getWithProbabilisticRefresh<T>(key: string, loader: () => Promise<T>, ttl: number = 3600): Promise<T> {
const cached = await redis.get(key);
const ttlRemaining = await redis.ttl(key);
if (cached) {
// Probabilistically refresh before expiration
const delta = ttl - ttlRemaining;
const probability = delta / ttl;
// As key gets older, higher chance of refresh
if (Math.random() < probability) {
// Refresh asynchronously (don't wait)
loader()
.then((value) => redis.setex(key, ttl, JSON.stringify(value)))
.catch(() => {}); // ignore background refresh failures; the stale value was already served
}
return JSON.parse(cached);
}
// Cache miss - load and cache
const value = await loader();
await redis.setex(key, ttl, JSON.stringify(value));
return value;
}
Multi-Tier Caching
Combine in-memory (L1) and Redis (L2) for best performance:
// multi-tier-cache.ts
import NodeCache from 'node-cache';
const l1Cache = new NodeCache({
stdTTL: 60, // 1 minute in-memory
checkperiod: 120,
useClones: false, // For performance
});
async function getFromL1L2Cache<T>(key: string, loader: () => Promise<T>): Promise<T> {
// L1 check (in-memory)
const l1Value = l1Cache.get<T>(key);
if (l1Value !== undefined) {
console.log('L1 cache hit');
return l1Value;
}
// L2 check (Redis)
const l2Value = await redis.get(key);
if (l2Value) {
console.log('L2 cache hit');
const parsed = JSON.parse(l2Value);
// Populate L1
l1Cache.set(key, parsed);
return parsed;
}
// Full cache miss
console.log('Cache miss - loading from source');
const value = await loader();
// Populate both layers
l1Cache.set(key, value);
await redis.setex(key, 3600, JSON.stringify(value));
return value;
}
// Usage
const product = await getFromL1L2Cache('product:123', () => db.query('SELECT * FROM products WHERE id = $1', ['123']));
Cache Invalidation Strategies
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
Time-Based Expiration (TTL)
Simplest approach: let cache entries expire after a fixed time:
// TTL-based expiration
await redis.setex('user:123', 300, JSON.stringify(user)); // 5 minutes
Pros: simple; bounds how stale data can get.
Cons: the TTL is arbitrary, and data can be inconsistent until it expires.
Event-Based Invalidation
Invalidate cache when source data changes:
// event-based-invalidation.ts
import { EventEmitter } from 'events';
const cacheInvalidator = new EventEmitter();
// Invalidate on user update
async function updateUser(userId: string, updates: Partial<User>) {
const updatedUser = await db.query('UPDATE users SET name = $1, email = $2 WHERE id = $3 RETURNING *', [
updates.name,
updates.email,
userId,
]);
// Invalidate all related caches
const cacheKeys = [`user:${userId}`, `user:${userId}:profile`, `user:${userId}:settings`, `user:${userId}:projects`];
await redis.del(...cacheKeys);
// Emit event for distributed invalidation
cacheInvalidator.emit('user:updated', userId);
return updatedUser;
}
Tag-Based Invalidation
Group related cache entries by tags:
// tag-based-invalidation.ts
class TaggedCache {
constructor(private redis: Redis) {}
async set(key: string, value: any, ttl: number, tags: string[]) {
// Store the value
await this.redis.setex(key, ttl, JSON.stringify(value));
// Associate with tags
const tagPromises = tags.map((tag) => this.redis.sadd(`tag:${tag}`, key));
await Promise.all(tagPromises);
}
async invalidateByTag(tag: string) {
// Get all keys with this tag
const keys = await this.redis.smembers(`tag:${tag}`);
if (keys.length > 0) {
// Delete all tagged keys
await this.redis.del(...keys);
}
// Delete the tag set itself
await this.redis.del(`tag:${tag}`);
}
}
// Usage
const cache = new TaggedCache(redis);
await cache.set('user:123', user, 3600, ['user', 'user_123', 'org_456']);
await cache.set('project:789', project, 3600, ['project', 'user_123', 'org_456']);
// Invalidate all cache entries for organization 456
await cache.invalidateByTag('org_456');
Cache Versioning
Use version numbers in cache keys to invalidate without deletion:
// cache-versioning.ts
let cacheVersion = 1;
function getCacheKey(type: string, id: string): string {
return `v${cacheVersion}:${type}:${id}`;
}
async function invalidateAllCaches() {
// Increment version - old caches become inaccessible
cacheVersion++;
// Store new version in Redis for distributed systems
await redis.set('cache:version', cacheVersion);
}
// On app startup, get current version
const storedVersion = await redis.get('cache:version');
cacheVersion = storedVersion ? parseInt(storedVersion, 10) : 1;
CDN and Browser Caching
HTTP Cache Headers
// express-cache-headers.ts
import crypto from 'node:crypto';
import express from 'express';
const app = express();
// Static assets: aggressive caching
app.use(
'/static',
express.static('public', {
maxAge: '1y', // 1 year
immutable: true,
}),
);
// API responses: conditional caching
app.get('/api/products', (req, res) => {
res.set({
'Cache-Control': 'public, max-age=300', // 5 minutes
ETag: generateETag(products),
Vary: 'Accept-Encoding',
});
res.json(products);
});
// User-specific data: no caching
app.get('/api/user/profile', (req, res) => {
res.set({
'Cache-Control': 'private, no-cache, no-store, must-revalidate',
Pragma: 'no-cache',
Expires: '0',
});
res.json(userProfile);
});
// Conditional requests (ETags)
function generateETag(data: any): string {
const hash = crypto.createHash('md5').update(JSON.stringify(data)).digest('hex');
return `"${hash}"`;
}
app.get('/api/data', (req, res) => {
const data = getData();
const etag = generateETag(data);
// Check if client has current version
if (req.headers['if-none-match'] === etag) {
res.status(304).end(); // Not Modified
return;
}
res.set('ETag', etag);
res.json(data);
});
Cache-Control Directive Reference
| Directive | Meaning | Use Case |
|---|---|---|
| `public` | Can be cached by any cache | Public, non-sensitive content |
| `private` | Cache in browser only, not CDN | User-specific data |
| `no-cache` | Must revalidate on every use | Frequently changing data |
| `no-store` | Never cache | Sensitive data |
| `max-age=300` | Cache for 300 seconds | Moderately fresh data |
| `s-maxage=3600` | CDN cache for 1 hour | Different TTL for CDN |
| `immutable` | Never revalidate | Fingerprinted assets |
| `must-revalidate` | Cache must revalidate when stale | Ensure freshness |
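These directives are usually combined, e.g. a short browser TTL paired with a longer CDN TTL for a public API response. A minimal sketch of a header-building helper (the helper and its option names are illustrative, not a standard API):

```typescript
// Hypothetical helper: build a Cache-Control header value from options.
interface CacheControlOptions {
  visibility?: 'public' | 'private';
  maxAge?: number;   // browser TTL in seconds
  sMaxAge?: number;  // shared (CDN) TTL in seconds
  immutable?: boolean;
}

function cacheControl(opts: CacheControlOptions): string {
  const parts: string[] = [];
  if (opts.visibility) parts.push(opts.visibility);
  if (opts.maxAge !== undefined) parts.push(`max-age=${opts.maxAge}`);
  if (opts.sMaxAge !== undefined) parts.push(`s-maxage=${opts.sMaxAge}`);
  if (opts.immutable) parts.push('immutable');
  return parts.join(', ');
}

// Browsers cache for 1 minute, the CDN for 1 hour:
// res.set('Cache-Control', cacheControl({ visibility: 'public', maxAge: 60, sMaxAge: 3600 }));
```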
Stale-While-Revalidate
Serve stale content while fetching fresh data in background:
// stale-while-revalidate.ts
app.get('/api/slow-endpoint', async (req, res) => {
res.set({
'Cache-Control': 'max-age=60, stale-while-revalidate=300',
});
// Takes 2 seconds to compute
const data = await expensiveComputation();
res.json(data);
});
// Client gets:
// - First request: waits 2 seconds
// - Within 60s: instant (cached)
// - 60s-360s: instant (stale) + background refresh
// - After 360s: waits 2 seconds (stale expired)
Caching Strategy Decision Tree
graph TD
A[Need to Cache?] --> B{Data Changes?}
B -->|Rarely| C[Long TTL<br/>1 hour - 1 day]
B -->|Occasionally| D[Medium TTL<br/>5-30 minutes]
B -->|Frequently| E[Short TTL<br/>30-300 seconds]
B -->|Real-time| F[No Cache or<br/>Stale-While-Revalidate]
C --> G{Shareable?}
D --> G
E --> G
G -->|Yes| H[Redis/CDN]
G -->|No| I[In-Memory/Browser]
H --> J{Invalidation Needed?}
J -->|Yes| K[Event-Based Invalidation]
J -->|No| L[TTL Only]
style C fill:#c5e1a5
style D fill:#fff9c4
style E fill:#ffccbc
style K fill:#bbdefb
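The decision tree above can be encoded as a small helper that maps data characteristics to a suggested strategy. A sketch with illustrative thresholds (the type names and TTL values are assumptions, picked from the ranges in the diagram):

```typescript
// How often does the underlying data change?
type ChangeRate = 'rarely' | 'occasionally' | 'frequently' | 'realtime';

interface CachePlan {
  ttlSeconds: number | null; // null = don't cache (or use stale-while-revalidate)
  layer: 'redis-or-cdn' | 'in-memory-or-browser' | 'none';
  invalidation: 'event-based' | 'ttl-only' | 'none';
}

function planCache(changes: ChangeRate, shareable: boolean, needsInvalidation: boolean): CachePlan {
  // Real-time data: skip caching entirely (or serve stale-while-revalidate).
  if (changes === 'realtime') {
    return { ttlSeconds: null, layer: 'none', invalidation: 'none' };
  }
  // Pick a TTL from the ranges in the decision tree.
  const ttlSeconds =
    changes === 'rarely' ? 3600 :      // 1 hour - 1 day
    changes === 'occasionally' ? 600 : // 5-30 minutes
    60;                                // 30-300 seconds
  return {
    ttlSeconds,
    layer: shareable ? 'redis-or-cdn' : 'in-memory-or-browser',
    invalidation: needsInvalidation ? 'event-based' : 'ttl-only',
  };
}
```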
Performance Impact: Before and After Caching
Real-world example from a typical web application:
| Metric | Before Caching | After Caching | Improvement |
|---|---|---|---|
| Avg Response Time | 850ms | 45ms | 18.9x faster |
| P95 Response Time | 2.3s | 120ms | 19.2x faster |
| Database Queries/sec | 1,250 | 85 | 93% reduction |
| Max Concurrent Users | 500 | 5,000+ | 10x capacity |
| Infrastructure Cost | $2,800/mo | $800/mo | 71% savings |
Common Pitfalls and How to Avoid Them
1. cache.set() Without TTL
// ❌ BAD: No TTL - cache grows forever
await redis.set('user:123', JSON.stringify(user));
// ✅ GOOD: Always set TTL
await redis.setex('user:123', 3600, JSON.stringify(user));
2. Caching Errors
// ❌ BAD: Caching the result unconditionally - failures get cached too
let data = null;
try {
data = await fetchData();
} catch (error) {
// error swallowed...
}
await redis.setex('data', 300, JSON.stringify(data)); // may cache null for 5 minutes!
return data;
// ✅ GOOD: Only cache successful, non-empty results
const data = await fetchData(); // let errors propagate
if (data) {
await redis.setex('data', 300, JSON.stringify(data));
}
return data;
3. Thundering Herd
// ❌ BAD: All requests refresh simultaneously
const data = await redis.get('popular:data');
if (!data) {
// 1000 concurrent requests all fetch from DB
return await expensiveQuery();
}
// ✅ GOOD: Use locking (see stampede prevention above)
return await getWithStampedePrevention('popular:data', expensiveQuery);
Monitoring Cache Effectiveness
// cache-metrics.ts
import { Counter, Histogram } from 'prom-client';
const cacheHits = new Counter({
name: 'cache_hits_total',
help: 'Total number of cache hits',
labelNames: ['cache_type', 'key_prefix'],
});
const cacheMisses = new Counter({
name: 'cache_misses_total',
help: 'Total number of cache misses',
labelNames: ['cache_type', 'key_prefix'],
});
const cacheLatency = new Histogram({
name: 'cache_operation_duration_seconds',
help: 'Cache operation latency',
labelNames: ['operation', 'cache_type'],
buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5],
});
async function getWithMetrics<T>(key: string, loader: () => Promise<T>, cacheType: string = 'redis'): Promise<T> {
const keyPrefix = key.split(':')[0];
const timer = cacheLatency.startTimer({ operation: 'get', cache_type: cacheType });
const cached = await redis.get(key);
timer();
if (cached) {
cacheHits.inc({ cache_type: cacheType, key_prefix: keyPrefix });
return JSON.parse(cached);
}
cacheMisses.inc({ cache_type: cacheType, key_prefix: keyPrefix });
const value = await loader();
await redis.setex(key, 3600, JSON.stringify(value));
return value;
}
Key Metrics to Track:
- Hit Rate: `hits / (hits + misses)` — should be > 80%
- Miss Rate: `misses / (hits + misses)` — should be < 20%
- Eviction Rate: how often entries are evicted because the cache is full
- Average TTL: how long items stay cached
- Cache Latency: p50, p95, p99 response times
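The hit-rate math is trivial to compute from the counters above — a minimal sketch (the `CacheCounters` shape is illustrative; with prom-client you would query these values from Prometheus rather than hold them in memory):

```typescript
// Illustrative counter snapshot, e.g. scraped from your metrics backend.
interface CacheCounters {
  hits: number;
  misses: number;
}

function hitRate({ hits, misses }: CacheCounters): number {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

function isHealthy(counters: CacheCounters): boolean {
  // Rule of thumb from above: aim for a hit rate above 80%.
  return hitRate(counters) > 0.8;
}
```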
Conclusion
Effective caching requires understanding:
- What to cache: High-read, low-write data with acceptable staleness
- Where to cache: Choose the right layer (browser, CDN, app, database)
- How long to cache: Balance freshness vs. performance
- When to invalidate: Event-based, time-based, or tag-based
The most successful caching strategies combine multiple approaches:
- Browser/CDN caching for static assets (aggressive)
- Application caching for computed data (moderate)
- Database query caching as last resort
- Proper invalidation to balance performance and freshness
Start simple with cache-aside and TTL-based expiration, then layer in advanced strategies as needed. Monitor cache effectiveness and iterate based on actual hit rates and performance metrics.
Ready to supercharge your application with intelligent caching strategies? Sign up for ScanlyApp and get comprehensive performance monitoring and caching recommendations integrated into your development workflow.
Related articles: see our 2026 web performance guide, where caching is a key pillar; our guide to testing caching rules and verifying that cache invalidation works correctly; and our article on TTFB, where caching delivers the largest single improvement.
