CAP Theorem & Consistency Models

Why This Matters

The CAP theorem is the most fundamental trade-off in distributed systems. Every FAANG interviewer expects you to understand it and apply it to your design choices.

CAP Theorem

Definition

In a distributed system, when a network partition occurs, you must choose between:

Consistency — Every read receives the most recent write (or an error)
Availability — Every request receives a response (possibly stale)
Partition tolerance — System continues operating despite network splits

The Reality

Partition tolerance is not optional — networks WILL fail in distributed systems
So the real choice is: CP or AP during a partition
When there's no partition (normal operation), you can have both C and A

Network Partition Occurs:
  → CP: Return error or wait (refuse stale reads)
       Examples: HBase, MongoDB (strong reads), Google Spanner
  → AP: Return potentially stale data (remain available)
       Examples: Cassandra, DynamoDB, CouchDB

Common Misconception

"CAP means you can only pick 2 of 3." More accurately:

During normal operation: you get all 3
During partition: you choose C or A
Most systems are tunable (e.g., Cassandra can be configured CP or AP per query)

PACELC Theorem (Extended CAP)

CAP only describes behavior during partitions. PACELC adds normal operation:

if Partition → choose Availability or Consistency
else (normal) → choose Latency or Consistency

System	During Partition	Normal Operation
DynamoDB	AP	EL (eventual, low latency)
Cassandra	AP	EL
MongoDB	CP	EC (consistent)
Google Spanner	CP	EC
PostgreSQL (single)	CA (no partition)	EC

Consistency Models

Strong Consistency (Linearizability)

Every read returns the most recent write
As if there's a single copy of data
All clients see the same order of operations
Cost: Higher latency, lower availability

Example: After writing x = 5, every subsequent read from ANY node returns 5.

Where used: Bank accounts, inventory counts, leader election

Sequential Consistency

All operations appear in SOME total order
Each client's operations appear in their program order
Different clients may see different interleavings
Weaker than linearizability (no real-time ordering)

Causal Consistency

Operations with causal relationship are seen in order
Concurrent (unrelated) operations may appear in any order
"If I saw event A, and A caused B, I'll see B after A"

Example: In a chat, if Alice replies to Bob's message, everyone sees Bob's message before Alice's reply. But concurrent messages from unrelated threads can appear in any order.

Eventual Consistency

If no new writes occur, all replicas will eventually converge
Reads may return stale data
No guarantee on how long "eventually" takes
Lowest latency, highest availability

Example: DNS propagation — update a record, it takes minutes to hours for all resolvers to see it.

Where used: Social media feeds, view counters, recommendation data

Read-Your-Writes Consistency

A user always sees their own writes
Other users may see stale data
Important for user experience

Implementation: Route user's reads to the leader, or track write timestamp and wait for replica to catch up.

Monotonic Read Consistency

Once a user reads a value, they never see an older value
Prevents "going back in time" when reading from different replicas

Implementation: Sticky sessions (always read from same replica) or version tracking.

Consistency Levels in Real Systems

Cassandra Consistency Levels

Level	Reads from / Writes to	Guarantee
ONE	1 replica	Fastest, weakest
QUORUM	⌊N/2⌋ + 1 replicas	Strong if R + W > N
ALL	All N replicas	Strongest, slowest
LOCAL_QUORUM	Quorum within local DC	Strong within DC

Tunable consistency: R + W > N guarantees overlap:

N=3, R=2, W=2 → strong consistency
N=3, R=1, W=1 → eventual consistency (fast!)
N=3, R=3, W=1 → read-heavy optimized (but writes fast)

DynamoDB

Eventually consistent reads — default, cheapest, may return stale
Strongly consistent reads — 2x cost, always returns latest
Transactional reads/writes — ACID across multiple items

Conflict Resolution

When two replicas accept conflicting writes:

Last Write Wins (LWW)

Use timestamp to pick latest write
Simple but can lose data
Problem: clocks aren't perfectly synchronized
Used by: Cassandra, DynamoDB

Version Vectors / Vector Clocks

Track causality: each node maintains a version counter
Detect conflicts (concurrent writes have incomparable vectors)
Require application-level merge
Used by: Riak, DynamoDB (returns all conflicting versions)

Application-Level Merge

Return all conflicting versions to the application
Application decides how to merge
Most flexible, most complex
Example: Shopping cart → merge = union of items

CRDTs

Data structures designed to automatically merge without conflicts
See: 43 - CRDT & Conflict-Free Replication

Linearizability vs Serializability

	Linearizability	Serializability
About	Single object, real-time order	Multiple objects, transaction isolation
Scope	Distributed consistency	Database isolation level
Guarantees	Latest value visible to all	Transactions appear serial
Cost	Network coordination	Locking or MVCC

Strict Serializability = Linearizability + Serializability (strongest guarantee, e.g., Google Spanner)

Practical Trade-Offs for Interviews

Scenario	Consistency Choice	Why
Bank balance	Strong	Can't show wrong balance
Shopping cart	Eventual	Better to show stale cart than error
Social media likes	Eventual	Temporary wrong count is OK
Inventory (last item)	Strong	Don't oversell
User profile	Read-your-writes	User sees their own changes
Search index	Eventual	Small delay in searchability OK
Leader election	Strong (linearizable)	Must agree on who is leader

Resources

📖 DDIA Chapter 5: Replication (consistency models)
📖 DDIA Chapter 9: Consistency and Consensus
🔗 Jepsen — Consistency Models
🔗 Martin Kleppmann — Please stop calling databases CP or AP
🎥 Gaurav Sen — CAP Theorem
🔗 Google Spanner paper

Previous: 09 - Consistent Hashing & Data Partitioning | Next: 11 - Distributed Consensus