# Caching

## Why This Matters
Caching is one of the most impactful performance optimizations. Every FAANG interview expects you to discuss caching strategy, placement, and invalidation.
## Cache at Every Layer

```
Client (Browser Cache)
  → CDN (Edge Cache)
  → API Gateway Cache
  → Application Cache (Redis/Memcached)
  → Database Cache (Query Cache, Buffer Pool)
  → Disk (OS Page Cache)
```
## Caching Strategies

### 1. Cache-Aside (Lazy Loading) — Most Common

**Read:**
1. Check cache → HIT? Return cached data
2. MISS? Read from DB
3. Write result to cache
4. Return data

**Write:**
1. Write to DB
2. Invalidate (delete) the cache entry

**Pros:** Only caches what's actually requested; a cache failure doesn't break the system
**Cons:** A cache miss costs three network calls (cache check + DB read + cache write); stale data is possible
**Use when:** Read-heavy workloads where data changes infrequently
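The read and write paths above can be sketched in a few lines. This is a minimal illustration, with plain dicts standing in for a real database and a real cache such as Redis; the key names are made up.

```python
# Cache-aside sketch: the APPLICATION owns both lookups and invalidation.
db = {"user:1": "Alice"}   # stand-in for the database
cache = {}                 # stand-in for Redis/Memcached

def read(key):
    if key in cache:               # 1. check cache
        return cache[key]          #    HIT: return cached data
    value = db.get(key)            # 2. MISS: read from DB
    if value is not None:
        cache[key] = value         # 3. write result to cache
    return value                   # 4. return data

def write(key, value):
    db[key] = value                # 1. write to DB first
    cache.pop(key, None)           # 2. invalidate (delete) the cache entry

print(read("user:1"))   # miss → DB → Alice
print(read("user:1"))   # hit → Alice
write("user:1", "Bob")  # DB updated, cache entry deleted
print(read("user:1"))   # miss → Bob (cache repopulated)
```

Note the write path deletes the cache entry rather than updating it; the reason is the race condition discussed under Cache Invalidation below.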
### 2. Write-Through

**Write:**
1. Write to cache AND DB together
2. Cache always has the latest data

**Read:**
1. Read from cache (usually fresh; fall back to the DB on a miss or eviction)

**Pros:** Cache is always consistent with the DB
**Cons:** Write latency increases (two writes per update); caches data that may never be read
**Use when:** Data must be fresh and you can tolerate the extra write latency
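A minimal sketch of the write path, again with dicts standing in for the cache and DB. In a real system the two writes need careful ordering or a transaction; here they are just sequential, and the key name is illustrative.

```python
# Write-through sketch: every update goes to cache AND DB together.
db, cache = {}, {}

def write_through(key, value):
    cache[key] = value     # write to the cache...
    db[key] = value        # ...and to the DB as part of the same operation

def read(key):
    if key in cache:
        return cache[key]  # normally fresh, since writes go through the cache
    return db.get(key)     # fallback if the entry was evicted

write_through("price:42", 9.99)
print(read("price:42"), db["price:42"])  # 9.99 9.99 — cache and DB agree
```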
### 3. Write-Behind (Write-Back)

**Write:**
1. Write to cache only (return immediately)
2. Asynchronously flush to the DB in batches

**Read:**
1. Always read from cache

**Pros:** Very fast writes; batched DB writes reduce load
**Cons:** Risk of data loss if the cache crashes before a flush; more complex to implement
**Use when:** Write-heavy workloads that can tolerate a small data-loss risk (e.g., analytics, counters)
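The batching behaviour can be sketched as below. A real implementation would run the flush on a background thread or timer; here it is called explicitly, and dicts stand in for the cache and DB.

```python
from collections import deque

# Write-behind sketch: writes touch only the cache and a "dirty" queue;
# flush() later applies them to the DB in one batch.
db, cache = {}, {}
dirty = deque()                    # keys waiting to be flushed

def write(key, value):
    cache[key] = value             # fast path: cache only, return immediately
    dirty.append(key)              # remember the key for the async flush

def flush(batch_size=100):
    batch = {}
    while dirty and len(batch) < batch_size:
        key = dirty.popleft()
        batch[key] = cache[key]    # coalesce repeated writes to the same key
    db.update(batch)               # one batched DB write instead of many

write("views:9", 1)
write("views:9", 2)
flush()
print(db)  # {'views:9': 2} — two cache writes became a single DB write
```

Note the data-loss risk is visible here: anything still in `dirty` when the process dies never reaches the DB.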
### 4. Read-Through

**Read:**
1. Application reads from the cache
2. On a miss, the cache itself fetches from the DB
3. The cache populates itself

**Pros:** Application code is simpler (the cache handles loading)
**Cons:** First request is always slow; the cache library must support DB loading
**Use when:** Using a cache library with read-through support (e.g., Caffeine, Guava)
### 5. Refresh-Ahead

The cache proactively refreshes entries BEFORE they expire, predicting which entries will be needed soon.

**Pros:** Reduces cache-miss latency
**Cons:** Wastes resources on wrong predictions
**Use when:** Access patterns are highly predictable
## Cache Eviction Policies
| Policy | How It Works | Best For |
|---|---|---|
| LRU (Least Recently Used) | Evict least recently accessed item | General purpose — most common |
| LFU (Least Frequently Used) | Evict least frequently accessed item | When frequency matters more than recency |
| FIFO | Evict oldest item | Simple, queue-like |
| TTL (Time to Live) | Evict after fixed time | Time-sensitive data |
| Random | Evict random item | Surprisingly good in some workloads |
**Interview default:** Use LRU with TTL. Mention LFU if the access pattern has "hot" items.
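The interview default above (LRU with TTL) fits in a short class. This is a sketch, not a production cache: it is not thread-safe, and it evicts expired entries lazily on read. `OrderedDict` keeps access order, so the least recently used entry is always at the front.

```python
import time
from collections import OrderedDict

class LRUTTLCache:
    """Minimal LRU cache with a per-entry TTL (illustrative, single-threaded)."""

    def __init__(self, capacity, ttl_seconds):
        self.capacity, self.ttl = capacity, ttl_seconds
        self.data = OrderedDict()             # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        if key not in self.data:
            return None
        value, expires_at = self.data[key]
        if now >= expires_at:                 # TTL elapsed: evict lazily
            del self.data[key]
            return None
        self.data.move_to_end(key)            # mark as most recently used
        return value

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self.data[key] = (value, now + self.ttl)
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)     # evict the LRU entry

cache = LRUTTLCache(capacity=2, ttl_seconds=60)
cache.put("a", 1, now=0)
cache.put("b", 2, now=0)
cache.get("a", now=1)          # touch "a", so "b" becomes the LRU entry
cache.put("c", 3, now=1)       # over capacity → "b" evicted
print(cache.get("b", now=1))   # None (evicted by LRU)
print(cache.get("a", now=61))  # None (expired by TTL: 0 + 60 <= 61)
```

The explicit `now` parameter exists only to make the example deterministic; a real cache would always use the clock.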
## Cache Invalidation

> "There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

### Strategies
1. **TTL-based expiration**
   - Set a TTL when writing to the cache
   - Data expires automatically
   - Simple, but allows stale reads until the TTL elapses
2. **Event-based invalidation**
   - When data changes, publish an event → invalidate the cache
   - More consistent but more complex
   - Use with message queues or CDC (Change Data Capture)
3. **Write-through invalidation**
   - On write: update the DB → delete the cache key
   - The next read repopulates the cache
   - Why delete, not update? A race condition: a stale write can overwrite fresh data
### The Race Condition Problem

```
Thread A reads DB (value = old)
Thread B writes DB (value = new)
Thread B deletes cache
Thread A writes cache (value = old) ← STALE!
```
**Solutions:**
- Use distributed locks (expensive)
- Double-delete with a delay
- Accept brief staleness (most systems do)
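The interleaving above can be replayed deterministically. The two "threads" here are just sequential statements executed in exactly the problematic order, with dicts standing in for the DB and cache:

```python
# Deterministic replay of the stale-cache race: A's read is interleaved
# with B's write + cache delete, and A populates the cache last.
db, cache = {"k": "old"}, {}

value_read_by_A = db["k"]         # Thread A reads DB   (value = old)
db["k"] = "new"                   # Thread B writes DB  (value = new)
cache.pop("k", None)              # Thread B deletes the cache entry
cache["k"] = value_read_by_A      # Thread A writes cache (value = old) ← STALE

print(cache["k"], db["k"])  # old new — cache and DB now disagree
```

This is why cache-aside deletes rather than updates on write: a delete can at worst cause an extra miss, while an update in this ordering pins stale data until the TTL expires.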
## Distributed Caching

### Redis
- In-memory key-value store with rich data structures
- Supports: strings, hashes, lists, sets, sorted sets, streams, bitmaps
- Persistence: RDB snapshots + AOF (append-only file)
- Replication: master-replica with automatic failover (Redis Sentinel)
- Clustering: Redis Cluster for horizontal scaling (hash slots)
- Pub/Sub: built-in message broker
- Lua scripting: atomic operations
- Use cases: caching, sessions, leaderboards, rate limiting, queues
### Memcached

- In-memory key-value store, simpler than Redis
- Multithreaded (Redis executes commands on a single thread per node)
- No data structures beyond strings
- No persistence, no replication (built for pure caching)
- Use cases: simple caching where you don't need Redis features
### Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Rich (lists, sets, hashes, etc.) | Strings only |
| Persistence | Yes (RDB + AOF) | No |
| Replication | Yes (master-replica) | No |
| Clustering | Yes (Redis Cluster) | Client-side sharding |
| Pub/Sub | Yes | No |
| Lua scripting | Yes | No |
| Multi-threaded | No (single-thread per shard) | Yes |
| Memory efficiency | Less (overhead per key) | More (slab allocator) |
**Interview default:** "I'd use Redis" — it handles almost every caching use case.
## CDN Caching

### Cache-Control Headers

```
Cache-Control: public, max-age=31536000, immutable
→ CDN + browser can cache for 1 year

Cache-Control: private, no-cache
→ Only the browser can cache; it must revalidate with the origin each time

Cache-Control: no-store
→ Never cache (sensitive data)
```
### CDN Invalidation

- Purge by URL — invalidate a specific resource
- Purge by tag — invalidate all resources sharing a tag
- Purge all — the nuclear option
- Versioned URLs — `style.abc123.css` (best: no invalidation ever needed)
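The versioned-URL scheme can be sketched by embedding a content hash in the filename, so changed content gets a new URL and old cached copies simply stop being referenced. The function name and 8-character digest length here are illustrative choices, not a standard:

```python
import hashlib

def versioned_name(filename, content: bytes, digest_len=8):
    """Build a cache-busting filename like style.<hash>.css from file content."""
    stem, _, ext = filename.rpartition(".")
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{stem}.{digest}.{ext}"

print(versioned_name("style.css", b"body { color: red }"))
# style.<8 hex chars>.css — any change to the bytes produces a new name
```

Asset bundlers (webpack, Vite, etc.) do exactly this at build time, which is why versioned URLs pair so well with `max-age=31536000, immutable`.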
## Cache Stampede (Thundering Herd)

### Problem

A popular cache key expires → 1,000 requests simultaneously hit the DB → the DB is overwhelmed

### Solutions
- Locking — first request acquires lock, fetches data, others wait
- Refresh-ahead — refresh before expiration
- Probabilistic early expiration — each request has small chance of refreshing before TTL
- Stale-while-revalidate — serve stale data while refreshing in background
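Probabilistic early expiration, sometimes called "XFetch", has a compact core: each read refreshes early with a probability that rises as expiry approaches, so one request refreshes a hot key instead of a herd. This is a sketch of the decision function only; `delta` (recompute cost) and `beta` (eagerness) are the usual tuning knobs, and the explicit `now` parameter is just for determinism.

```python
import math
import random
import time

def should_refresh(expires_at, delta, beta=1.0, now=None):
    """Decide whether this read should refresh the entry before its TTL.

    delta: how long a recompute takes; beta: >1 refreshes more eagerly.
    """
    now = time.monotonic() if now is None else now
    # -log(u) for u in (0, 1] is a non-negative random "head start";
    # the closer `now` is to `expires_at`, the likelier a refresh.
    u = 1.0 - random.random()          # in (0, 1], avoids log(0)
    return now - delta * beta * math.log(u) >= expires_at
```

A caller checks `should_refresh(...)` on every cache hit and, when it returns True, recomputes the value in the background while still serving the cached copy.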
## Cache Sizing & Hit Ratio

### Cache Hit Ratio

```
Hit Ratio = cache hits / (cache hits + cache misses)
```

- **> 95%** — excellent
- **80-95%** — good
- **< 80%** — review your caching strategy
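The formula above as a quick sanity check, including the effect of hit ratio on DB load (the function names are illustrative):

```python
def hit_ratio(hits, misses):
    """Fraction of lookups served from the cache."""
    return hits / (hits + misses)

def db_qps(incoming_qps, ratio):
    """Queries that reach the DB: only the misses get through."""
    return incoming_qps * (1 - ratio)

print(hit_ratio(9500, 500))          # 0.95
print(round(db_qps(10_000, 0.95)))   # 500 — a 20x reduction in DB load
```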
### Sizing Rules of Thumb
- Cache the hot working set — often 20% of data serves 80% of reads (Pareto)
- Start with 10-20% of dataset size
- Monitor hit ratio and adjust
- Consider memory cost vs DB load reduction
## Multi-Level Caching

```
L1: Local in-process cache (HashMap, Caffeine, Guava)
    → ~1μs access, limited by JVM heap
L2: Distributed cache (Redis)
    → ~1ms access, shared across all app servers
L3: CDN edge cache
    → ~10ms access, geographically distributed
L4: Database buffer pool
    → ~10ms per query round trip, served from memory inside the DB server
```

**Pattern:** Check L1 → miss → L2 → miss → DB → populate L2 → populate L1
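The lookup pattern above can be sketched with two levels, where dicts stand in for the in-process cache, Redis, and the DB:

```python
# Two-level lookup: L1 in-process, L2 "distributed", then the DB.
l1, l2, db = {}, {}, {"k": "v"}

def get(key):
    if key in l1:
        return l1[key]         # L1 hit: local memory, ~1μs in a real system
    if key in l2:
        l1[key] = l2[key]      # L2 hit: populate L1 on the way back
        return l1[key]
    value = db[key]            # miss everywhere: hit the DB
    l2[key] = value            # populate L2...
    l1[key] = value            # ...then L1
    return value

print(get("k"))                  # cold read: DB → L2 → L1 → v
print("k" in l1, "k" in l2)      # True True — both levels now populated
```

The main design choice this hides is invalidation: L1 copies on other app servers do not see an L2 delete, so per-entry TTLs on L1 are usually kept short.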
## Interview Tips

- Always add caching to your design (especially for read-heavy systems)
- Specify WHERE the cache sits (client, CDN, app, DB)
- Discuss the caching strategy (cache-aside is a safe default)
- Mention the TTL and invalidation approach
- Address cache stampede for popular items
- Do the math: at 10K incoming QPS, a 95% cache hit ratio means only 500 QPS reach the DB
## Resources
- 📖 DDIA Chapter 3: Storage and Retrieval (buffer pools)
- 📖 "System Design Interview" by Alex Xu — Chapter on Cache
- 🔗 Redis documentation
- 🎥 ByteByteGo — Caching Strategies
- 🔗 Facebook's Memcache paper
- 🔗 AWS ElastiCache Best Practices
Previous: 05 - Load Balancing & Reverse Proxies | Next: 07 - Databases Deep Dive