CAP Theorem & Consistency Models
Why This Matters
The CAP theorem is the most fundamental trade-off in distributed systems. Every FAANG interviewer expects you to understand it and apply it to your design choices.
CAP Theorem
Definition
In a distributed system, when a network partition occurs, you must choose between:
- Consistency — Every read receives the most recent write (or an error)
- Availability — Every request receives a response (possibly stale)
- Partition tolerance — System continues operating despite network splits
The Reality
- Partition tolerance is not optional — networks WILL fail in distributed systems
- So the real choice is: CP or AP during a partition
- When there's no partition (normal operation), you can have both C and A
Network Partition Occurs:
→ CP: Return error or wait (refuse stale reads)
Examples: HBase, MongoDB (strong reads), Google Spanner
→ AP: Return potentially stale data (remain available)
Examples: Cassandra, DynamoDB, CouchDB
Common Misconception
"CAP means you can only pick 2 of 3." More accurately:
- During normal operation: you get all 3
- During partition: you choose C or A
- Most systems are tunable (e.g., Cassandra can be configured CP or AP per query)
PACELC Theorem (Extended CAP)
CAP only describes behavior during partitions. PACELC adds normal operation:
if Partition → choose Availability or Consistency
else (normal) → choose Latency or Consistency
| System | During Partition | Normal Operation |
|---|---|---|
| DynamoDB | AP | EL (eventual, low latency) |
| Cassandra | AP | EL |
| MongoDB | CP | EC (consistent) |
| Google Spanner | CP | EC |
| PostgreSQL (single) | CA (no partition) | EC |
Consistency Models
Strong Consistency (Linearizability)
- Every read returns the most recent write
- As if there's a single copy of data
- All clients see the same order of operations
- Cost: Higher latency, lower availability
Example: After writing x = 5, every subsequent read from ANY node returns 5.
Where used: Bank accounts, inventory counts, leader election
Sequential Consistency
- All operations appear in SOME total order
- Each client's operations appear in their program order
- Different clients may see different interleavings
- Weaker than linearizability (no real-time ordering)
Causal Consistency
- Operations with causal relationship are seen in order
- Concurrent (unrelated) operations may appear in any order
- "If I saw event A, and A caused B, I'll see B after A"
Example: In a chat, if Alice replies to Bob's message, everyone sees Bob's message before Alice's reply. But concurrent messages from unrelated threads can appear in any order.
Eventual Consistency
- If no new writes occur, all replicas will eventually converge
- Reads may return stale data
- No guarantee on how long "eventually" takes
- Lowest latency, highest availability
Example: DNS propagation — update a record, it takes minutes to hours for all resolvers to see it.
Where used: Social media feeds, view counters, recommendation data
Read-Your-Writes Consistency
- A user always sees their own writes
- Other users may see stale data
- Important for user experience
Implementation: Route user's reads to the leader, or track write timestamp and wait for replica to catch up.
Monotonic Read Consistency
- Once a user reads a value, they never see an older value
- Prevents "going back in time" when reading from different replicas
Implementation: Sticky sessions (always read from same replica) or version tracking.
Consistency Levels in Real Systems
Cassandra Consistency Levels
| Level | Reads from / Writes to | Guarantee |
|---|---|---|
| ONE | 1 replica | Fastest, weakest |
| QUORUM | ⌊N/2⌋ + 1 replicas | Strong if R + W > N |
| ALL | All N replicas | Strongest, slowest |
| LOCAL_QUORUM | Quorum within local DC | Strong within DC |
Tunable consistency: R + W > N guarantees overlap:
- N=3, R=2, W=2 → strong consistency
- N=3, R=1, W=1 → eventual consistency (fast!)
- N=3, R=3, W=1 → read-heavy optimized (but writes fast)
DynamoDB
- Eventually consistent reads — default, cheapest, may return stale
- Strongly consistent reads — 2x cost, always returns latest
- Transactional reads/writes — ACID across multiple items
Conflict Resolution
When two replicas accept conflicting writes:
Last Write Wins (LWW)
- Use timestamp to pick latest write
- Simple but can lose data
- Problem: clocks aren't perfectly synchronized
- Used by: Cassandra, DynamoDB
Version Vectors / Vector Clocks
- Track causality: each node maintains a version counter
- Detect conflicts (concurrent writes have incomparable vectors)
- Require application-level merge
- Used by: Riak, DynamoDB (returns all conflicting versions)
Application-Level Merge
- Return all conflicting versions to the application
- Application decides how to merge
- Most flexible, most complex
- Example: Shopping cart → merge = union of items
CRDTs
- Data structures designed to automatically merge without conflicts
- See: 43 - CRDT & Conflict-Free Replication
Linearizability vs Serializability
| Linearizability | Serializability | |
|---|---|---|
| About | Single object, real-time order | Multiple objects, transaction isolation |
| Scope | Distributed consistency | Database isolation level |
| Guarantees | Latest value visible to all | Transactions appear serial |
| Cost | Network coordination | Locking or MVCC |
Strict Serializability = Linearizability + Serializability (strongest guarantee, e.g., Google Spanner)
Practical Trade-Offs for Interviews
| Scenario | Consistency Choice | Why |
|---|---|---|
| Bank balance | Strong | Can't show wrong balance |
| Shopping cart | Eventual | Better to show stale cart than error |
| Social media likes | Eventual | Temporary wrong count is OK |
| Inventory (last item) | Strong | Don't oversell |
| User profile | Read-your-writes | User sees their own changes |
| Search index | Eventual | Small delay in searchability OK |
| Leader election | Strong (linearizable) | Must agree on who is leader |
Resources
- 📖 DDIA Chapter 5: Replication (consistency models)
- 📖 DDIA Chapter 9: Consistency and Consensus
- 🔗 Jepsen — Consistency Models
- 🔗 Martin Kleppmann — Please stop calling databases CP or AP
- 🎥 Gaurav Sen — CAP Theorem
- 🔗 Google Spanner paper
Previous: 09 - Consistent Hashing & Data Partitioning | Next: 11 - Distributed Consensus