CAP Theorem & Consistency Models

Why This Matters

The CAP theorem is the most fundamental trade-off in distributed systems. Every FAANG interviewer expects you to understand it and apply it to your design choices.


CAP Theorem

Definition

In a distributed system, when a network partition occurs, you must choose between:

  • Consistency — Every read receives the most recent write (or an error)
  • Availability — Every request receives a response (possibly stale)
  • Partition tolerance — System continues operating despite network splits

The Reality

  • Partition tolerance is not optional — networks WILL fail in distributed systems
  • So the real choice is: CP or AP during a partition
  • When there's no partition (normal operation), you can have both C and A
Network Partition Occurs:
  → CP: Return error or wait (refuse stale reads)
       Examples: HBase, MongoDB (strong reads), Google Spanner
  → AP: Return potentially stale data (remain available)
       Examples: Cassandra, DynamoDB, CouchDB

Common Misconception

"CAP means you can only pick 2 of 3." More accurately:

  • During normal operation: you get all 3
  • During partition: you choose C or A
  • Most systems are tunable (e.g., Cassandra can be configured CP or AP per query)

PACELC Theorem (Extended CAP)

CAP only describes behavior during partitions. PACELC adds normal operation:

if Partition → choose Availability or Consistency
else (normal) → choose Latency or Consistency
SystemDuring PartitionNormal Operation
DynamoDBAPEL (eventual, low latency)
CassandraAPEL
MongoDBCPEC (consistent)
Google SpannerCPEC
PostgreSQL (single)CA (no partition)EC

Consistency Models

Strong Consistency (Linearizability)

  • Every read returns the most recent write
  • As if there's a single copy of data
  • All clients see the same order of operations
  • Cost: Higher latency, lower availability

Example: After writing x = 5, every subsequent read from ANY node returns 5.

Where used: Bank accounts, inventory counts, leader election

Sequential Consistency

  • All operations appear in SOME total order
  • Each client's operations appear in their program order
  • Different clients may see different interleavings
  • Weaker than linearizability (no real-time ordering)

Causal Consistency

  • Operations with causal relationship are seen in order
  • Concurrent (unrelated) operations may appear in any order
  • "If I saw event A, and A caused B, I'll see B after A"

Example: In a chat, if Alice replies to Bob's message, everyone sees Bob's message before Alice's reply. But concurrent messages from unrelated threads can appear in any order.

Eventual Consistency

  • If no new writes occur, all replicas will eventually converge
  • Reads may return stale data
  • No guarantee on how long "eventually" takes
  • Lowest latency, highest availability

Example: DNS propagation — update a record, it takes minutes to hours for all resolvers to see it.

Where used: Social media feeds, view counters, recommendation data

Read-Your-Writes Consistency

  • A user always sees their own writes
  • Other users may see stale data
  • Important for user experience

Implementation: Route user's reads to the leader, or track write timestamp and wait for replica to catch up.

Monotonic Read Consistency

  • Once a user reads a value, they never see an older value
  • Prevents "going back in time" when reading from different replicas

Implementation: Sticky sessions (always read from same replica) or version tracking.


Consistency Levels in Real Systems

Cassandra Consistency Levels

LevelReads from / Writes toGuarantee
ONE1 replicaFastest, weakest
QUORUM⌊N/2⌋ + 1 replicasStrong if R + W > N
ALLAll N replicasStrongest, slowest
LOCAL_QUORUMQuorum within local DCStrong within DC

Tunable consistency: R + W > N guarantees overlap:

  • N=3, R=2, W=2 → strong consistency
  • N=3, R=1, W=1 → eventual consistency (fast!)
  • N=3, R=3, W=1 → read-heavy optimized (but writes fast)

DynamoDB

  • Eventually consistent reads — default, cheapest, may return stale
  • Strongly consistent reads — 2x cost, always returns latest
  • Transactional reads/writes — ACID across multiple items

Conflict Resolution

When two replicas accept conflicting writes:

Last Write Wins (LWW)

  • Use timestamp to pick latest write
  • Simple but can lose data
  • Problem: clocks aren't perfectly synchronized
  • Used by: Cassandra, DynamoDB

Version Vectors / Vector Clocks

  • Track causality: each node maintains a version counter
  • Detect conflicts (concurrent writes have incomparable vectors)
  • Require application-level merge
  • Used by: Riak, DynamoDB (returns all conflicting versions)

Application-Level Merge

  • Return all conflicting versions to the application
  • Application decides how to merge
  • Most flexible, most complex
  • Example: Shopping cart → merge = union of items

CRDTs

  • Data structures designed to automatically merge without conflicts
  • See: 43 - CRDT & Conflict-Free Replication

Linearizability vs Serializability

LinearizabilitySerializability
AboutSingle object, real-time orderMultiple objects, transaction isolation
ScopeDistributed consistencyDatabase isolation level
GuaranteesLatest value visible to allTransactions appear serial
CostNetwork coordinationLocking or MVCC

Strict Serializability = Linearizability + Serializability (strongest guarantee, e.g., Google Spanner)


Practical Trade-Offs for Interviews

ScenarioConsistency ChoiceWhy
Bank balanceStrongCan't show wrong balance
Shopping cartEventualBetter to show stale cart than error
Social media likesEventualTemporary wrong count is OK
Inventory (last item)StrongDon't oversell
User profileRead-your-writesUser sees their own changes
Search indexEventualSmall delay in searchability OK
Leader electionStrong (linearizable)Must agree on who is leader

Resources


Previous: 09 - Consistent Hashing & Data Partitioning | Next: 11 - Distributed Consensus