Caching

Why This Matters

Caching is one of the most impactful performance optimizations. Every FAANG interview expects you to discuss caching strategy, placement, and invalidation.


Cache at Every Layer

Client (Browser Cache)
  → CDN (Edge Cache)
    → API Gateway Cache
      → Application Cache (Redis/Memcached)
        → Database Cache (Query Cache, Buffer Pool)
          → Disk (OS Page Cache)

Caching Strategies

1. Cache-Aside (Lazy Loading) — Most Common

Read:
  1. Check cache → HIT? Return cached data
  2. MISS? Read from DB
  3. Write result to cache
  4. Return data

Write:
  1. Write to DB
  2. Invalidate (delete) cache entry

Pros: Only caches what's actually requested; cache failure doesn't break the system
Cons: Cache miss costs 3 network calls (cache check + DB read + cache write); stale data possible
Use when: Read-heavy, data changes infrequently
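
The read and write flows above can be sketched with plain dicts standing in for the cache and the database (all names, keys, and values are illustrative):

```python
# Stand-ins for a real cache (e.g. Redis) and a database -- plain dicts here.
cache = {}
db = {"user:1": {"name": "Ada"}}

def read(key):
    """Cache-aside read: check cache, fall back to DB, populate cache."""
    if key in cache:
        return cache[key]          # 1. cache HIT -> return cached data
    value = db.get(key)            # 2. MISS -> read from DB
    if value is not None:
        cache[key] = value         # 3. write result to cache
    return value                   # 4. return data

def write(key, value):
    """Cache-aside write: update the DB, then invalidate the cache entry."""
    db[key] = value                # 1. write to DB
    cache.pop(key, None)           # 2. delete (not update) the cache entry

first = read("user:1")             # miss: loads from DB and caches
write("user:1", {"name": "Ada L."})
assert "user:1" not in cache       # entry invalidated; next read repopulates
```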

2. Write-Through

Write:
  1. Write to cache AND DB simultaneously
  2. Cache always has latest data

Read:
  1. Always read from cache (always fresh)

Pros: Cache is always consistent with the DB
Cons: Write latency increases (write to 2 places); caches data that may never be read
Use when: Data must be fresh and you can tolerate write latency
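
A minimal write-through sketch, again with dicts standing in for the two stores (names are illustrative; a real system writes DB first, then cache, or wraps both in one operation):

```python
cache = {}
db = {}

def write_through(key, value):
    # Write to both stores before returning, so the cache is never stale.
    db[key] = value
    cache[key] = value

def read(key):
    # Reads always come from the cache, which mirrors the DB.
    return cache.get(key)

write_through("price:42", 999)
assert read("price:42") == 999 and db["price:42"] == 999
```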

3. Write-Behind (Write-Back)

Write:
  1. Write to cache only (return immediately)
  2. Asynchronously flush to DB in batches

Read:
  1. Always read from cache

Pros: Very fast writes; batched DB writes reduce load
Cons: Risk of data loss if cache crashes before flush; complex implementation
Use when: Write-heavy, can tolerate small data loss risk (e.g., analytics, counters)
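
A minimal write-behind sketch, with a queue standing in for the async flush pipeline (in production the flush runs on a background thread or timer; all names are illustrative):

```python
import queue

cache = {}
db = {}
pending = queue.Queue()        # writes waiting to be flushed to the DB

def write(key, value):
    cache[key] = value         # fast path: write to cache only, return immediately
    pending.put((key, value))  # queue for asynchronous flush

def flush():
    # Drain the queue in a batch. Anything still in `pending` is lost
    # if the cache node dies first -- the data-loss risk noted above.
    while not pending.empty():
        key, value = pending.get()
        db[key] = value

write("views:home", 1001)
assert "views:home" not in db  # DB not yet updated
flush()
assert db["views:home"] == 1001
```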

4. Read-Through

Read:
  1. Application reads from cache
  2. Cache itself fetches from DB on miss
  3. Cache populates itself

Pros: Application code is simpler (the cache handles loading)
Cons: First request is always slow; cache library must support DB loading
Use when: Using a cache library that supports read-through (like Caffeine, Guava)
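
A minimal read-through sketch, modeled loosely on loader-style caches such as Caffeine's LoadingCache (class and method names here are illustrative):

```python
class ReadThroughCache:
    """Read-through: the cache itself knows how to load misses from the DB."""
    def __init__(self, loader):
        self._loader = loader            # e.g. a DB lookup function
        self._store = {}

    def get(self, key):
        if key not in self._store:
            # Cache fetches from the DB on miss and populates itself;
            # the application only ever talks to the cache.
            self._store[key] = self._loader(key)
        return self._store[key]

db = {"cfg:theme": "dark"}
cache = ReadThroughCache(loader=db.get)
assert cache.get("cfg:theme") == "dark"  # first call loads from db
```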

5. Refresh-Ahead

Cache proactively refreshes entries BEFORE they expire
(predicts which entries will be needed soon)

Pros: Reduces cache miss latency
Cons: Wastes resources on unused predictions
Use when: Highly predictable access patterns


Cache Eviction Policies

Policy                      | How It Works                         | Best For
LRU (Least Recently Used)   | Evict least recently accessed item   | General purpose — most common
LFU (Least Frequently Used) | Evict least frequently accessed item | When frequency matters more than recency
FIFO                        | Evict oldest item                    | Simple, queue-like
TTL (Time to Live)          | Evict after fixed time               | Time-sensitive data
Random                      | Evict random item                    | Surprisingly good in some workloads

Interview default: Use LRU with TTL. Mention LFU if the access pattern has "hot" items.
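
The LRU-with-TTL default fits in a few lines with `OrderedDict` (capacity, TTL value, and class name are illustrative):

```python
import time
from collections import OrderedDict

class LRUTTLCache:
    """LRU eviction plus a per-entry TTL -- the interview default above."""
    def __init__(self, capacity, ttl_seconds):
        self.capacity, self.ttl = capacity, ttl_seconds
        self._data = OrderedDict()             # key -> (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() > expires_at:      # TTL expired -> treat as miss
            del self._data[key]
            return None
        self._data.move_to_end(key)            # mark as most recently used
        return value

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = (value, time.monotonic() + self.ttl)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)     # evict least recently used

c = LRUTTLCache(capacity=2, ttl_seconds=60)
c.put("a", 1); c.put("b", 2)
c.get("a")          # touch "a", so "b" is now least recently used
c.put("c", 3)       # over capacity: evicts "b"
assert c.get("b") is None and c.get("a") == 1 and c.get("c") == 3
```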


Cache Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

Strategies

1. TTL-based expiration:

  • Set TTL when writing to cache
  • Data automatically expires
  • Simple but allows stale reads until TTL

2. Event-based invalidation:

  • When data changes, publish event → invalidate cache
  • More consistent but more complex
  • Use with message queues or CDC (Change Data Capture)

3. Write-through invalidation:

  • On write: update DB → delete cache key
  • Next read repopulates cache
  • Why delete, not update? — race condition: stale write can overwrite fresh data

The Race Condition Problem

Thread A reads DB (value = old)
Thread B writes DB (value = new)
Thread B deletes cache
Thread A writes cache (value = old) ← STALE!

Solutions:

  • Use distributed locks (expensive)
  • Double-delete with delay
  • Accept brief staleness (most systems do this)
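
The double-delete mitigation can be sketched as follows (the delay value and helper name are illustrative, and a real system would delete against Redis rather than a dict; tune the delay to your read-path latency):

```python
import threading

cache = {}

def invalidate(key, delay_seconds=0.5):
    """Double-delete: delete now, then again after a short delay, to clean up
    a stale read-then-write that raced with the first delete."""
    cache.pop(key, None)                        # first delete
    timer = threading.Timer(delay_seconds,
                            lambda: cache.pop(key, None))
    timer.start()                               # second, delayed delete
    return timer

cache["user:1"] = "old"
t = invalidate("user:1", delay_seconds=0.01)
cache["user:1"] = "old"   # simulate Thread A writing stale data after delete #1
t.join()                  # wait for the delayed second delete to fire
assert "user:1" not in cache
```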

Distributed Caching

Redis

  • In-memory key-value store with rich data structures
  • Supports: strings, hashes, lists, sets, sorted sets, streams, bitmaps
  • Persistence: RDB snapshots + AOF (append-only file)
  • Replication: master-replica; automatic failover via Redis Sentinel
  • Clustering: Redis Cluster for horizontal scaling (hash slots)
  • Pub/Sub: built-in message broker
  • Lua scripting: atomic operations
  • Use cases: caching, sessions, leaderboards, rate limiting, queues

Memcached

  • In-memory key-value store, simpler than Redis
  • Multithreaded (Redis is single-threaded per node)
  • No data structures beyond strings
  • No persistence, no replication (built for pure caching)
  • Use cases: simple caching where you don't need Redis features

Redis vs Memcached

Feature           | Redis                            | Memcached
Data structures   | Rich (lists, sets, hashes, etc.) | Strings only
Persistence       | Yes (RDB + AOF)                  | No
Replication       | Yes (master-replica)             | No
Clustering        | Yes (Redis Cluster)              | Client-side sharding
Pub/Sub           | Yes                              | No
Lua scripting     | Yes                              | No
Multi-threaded    | No (single-thread per shard)     | Yes
Memory efficiency | Less (overhead per key)          | More (slab allocator)

Interview default: "I'd use Redis" — it handles almost every caching use case.


CDN Caching

Cache-Control Headers

Cache-Control: public, max-age=31536000, immutable
  → CDN + browser can cache for 1 year

Cache-Control: private, no-cache
  → Only browser can cache, must revalidate each time

Cache-Control: no-store
  → Never cache (sensitive data)

CDN Invalidation

  • Purge by URL — invalidate specific resource
  • Purge by tag — invalidate all resources with a tag
  • Purge all — nuclear option
  • Versioned URLs — style.abc123.css (best: no invalidation needed)

Cache Stampede (Thundering Herd)

Problem

Popular cache key expires → 1000 requests simultaneously hit DB → DB overwhelmed

Solutions

  1. Locking — first request acquires lock, fetches data, others wait
  2. Refresh-ahead — refresh before expiration
  3. Probabilistic early expiration — each request has small chance of refreshing before TTL
  4. Stale-while-revalidate — serve stale data while refreshing in background
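
Solution 3, probabilistic early expiration (sometimes called XFetch), fits in one function; `delta` is the cost of recomputing the value and `beta` tunes eagerness (function and parameter names are illustrative):

```python
import math
import random

def should_refresh(now, expiry, delta, beta=1.0):
    """Each request independently decides to refresh early with a probability
    that rises as expiry approaches, so no single moment triggers a stampede."""
    # -log(random()) is a positive random factor; larger delta or beta
    # pushes refreshes earlier before the actual expiry time.
    return now - delta * beta * math.log(random.random()) >= expiry

# Far from expiry: essentially never refreshes. At expiry: always refreshes.
assert should_refresh(now=100.0, expiry=100.0, delta=1.0) is True
```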

Cache Sizing & Hit Ratio

Cache Hit Ratio

Hit Ratio = cache hits / (cache hits + cache misses)
  • > 95% — excellent
  • 80-95% — good
  • < 80% — review caching strategy
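
As a sanity check, the formula in one line (function name is illustrative):

```python
def hit_ratio(hits, misses):
    """Hit ratio = hits / (hits + misses); 0.0 when there is no traffic yet."""
    total = hits + misses
    return hits / total if total else 0.0

assert hit_ratio(9500, 500) == 0.95   # 95%: "excellent" per the table above
```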

Sizing Rules of Thumb

  • Cache the hot working set — often 20% of data serves 80% of reads (Pareto)
  • Start with 10-20% of dataset size
  • Monitor hit ratio and adjust
  • Consider memory cost vs DB load reduction

Multi-Level Caching

L1: Local in-process cache (HashMap, Caffeine, Guava)
    → ~1μs access, limited by JVM heap

L2: Distributed cache (Redis)
    → ~1ms access, shared across all app servers

L3: CDN edge cache
    → ~10ms access, geographically distributed

L4: Database buffer pool
    → ~10ms access, inside DB server

Pattern: Check L1 → miss → L2 → miss → DB → populate L2 → populate L1
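
The L1 → L2 → DB pattern in the last line can be sketched with dicts standing in for each layer (all names and keys are illustrative):

```python
l1 = {}                       # in-process cache (e.g. a Caffeine-style map)
l2 = {}                       # distributed cache (e.g. Redis), dict stand-in
db = {"item:7": "widget"}

def get(key):
    """L1 -> L2 -> DB lookup, populating each level on the way back up."""
    if key in l1:
        return l1[key]
    if key in l2:
        l1[key] = l2[key]     # promote to L1 for faster future reads
        return l1[key]
    value = db[key]           # miss everywhere: hit the DB
    l2[key] = value           # populate L2 ...
    l1[key] = value           # ... then L1
    return value

assert get("item:7") == "widget"
assert "item:7" in l1 and "item:7" in l2   # both levels now populated
```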


Interview Tips

  • Always add caching in your design (especially for read-heavy systems)
  • Specify WHERE the cache sits (client, CDN, app, DB)
  • Discuss the caching strategy (cache-aside is safe default)
  • Mention TTL and invalidation approach
  • Address cache stampede for popular items
  • Calculate: if the system receives 10K read QPS, a 95% cache hit ratio means only 500 QPS reach the DB

Previous: 05 - Load Balancing & Reverse Proxies | Next: 07 - Databases Deep Dive