# Databases Deep Dive
## Why This Matters
At FAANG scale, a single database won't cut it. You need to know how to scale reads, writes, and storage independently.
## Indexing
### What is an Index?
A data structure that speeds up reads at the cost of slower writes and extra storage.
### B-Tree Index (Default)
```
            [50]
           /    \
     [20,30]    [70,80]
    /  |  \     /  |  \
[10] [25] [35] [60] [75] [90]
```
- Balanced tree, O(log n) lookups
- Good for: range queries, equality, sorting
- Used by: PostgreSQL, MySQL, most RDBMS
### Hash Index
- Hash(key) → location
- O(1) exact match lookups
- Bad for: range queries, sorting
- Used by: Memcached, some DB internal structures
### Composite Index
```sql
CREATE INDEX idx_user_city_age ON users(city, age);
-- Fast: WHERE city = 'Paris' AND age > 25
-- Fast: WHERE city = 'Paris' (left prefix)
-- Slow: WHERE age > 25 (skips left column)
```
Leftmost prefix rule: composite index on (A, B, C) can answer queries on (A), (A, B), (A, B, C) — but not (B) or (C) alone.
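The leftmost-prefix rule falls out of how a composite B-tree orders its entries. A rough sketch using a plain sorted list of (city, age) tuples (illustrative only, not real index internals):

```python
from bisect import bisect_left

# A composite index on (city, age) is conceptually a list of
# (city, age) tuples kept in sorted order.
index = sorted([
    ("Paris", 25), ("Paris", 31), ("Lyon", 40),
    ("Paris", 19), ("Lyon", 22), ("Nice", 33),
])

# Fast: WHERE city = 'Paris' AND age >= 25 (binary search on the full prefix)
start = bisect_left(index, ("Paris", 25))
paris_25_plus = [row for row in index[start:] if row[0] == "Paris"]

# Fast: WHERE city = 'Paris' (still a prefix of the sort order)
start = bisect_left(index, ("Paris",))
paris = [row for row in index[start:] if row[0] == "Paris"]

# Slow: WHERE age > 25 (ages are scattered; only a full scan works)
age_over_25 = [row for row in index if row[1] > 25]
```

Equality or a range on the leading column narrows the search to one contiguous run; a predicate on a trailing column alone cannot, which is exactly why (B) and (C) queries fall back to scans.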
### Other Index Types
| Type | Use Case |
|---|---|
| Full-text index | Text search (instead of slow LIKE '%word%' scans) |
| Spatial index (R-tree) | Geographic queries (nearby points) |
| Bitmap index | Low-cardinality columns (gender, status) |
| Covering index | Index contains all query columns (index-only scan) |
| Partial index | Index only rows matching a condition |
### Indexing Best Practices
- Index columns used in WHERE, JOIN, ORDER BY
- Don't over-index — each index slows writes
- Monitor slow queries and add indexes accordingly
- Consider composite indexes for multi-column queries
- Use EXPLAIN/EXPLAIN ANALYZE to verify index usage
## Replication
### Single-Leader (Master-Slave)
```
Writes → Leader → Replication → Follower 1 (reads)
                              → Follower 2 (reads)
                              → Follower 3 (reads)
```
- One leader accepts all writes
- Followers replicate from leader, serve reads
- Replication lag — followers may serve stale data
- Failover — if leader dies, promote a follower
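A minimal sketch of the pull-based leader/follower flow above (class and method names are hypothetical, not any real database's API):

```python
class Leader:
    def __init__(self):
        self.log = []      # ordered write log, replayed by followers
        self.data = {}

    def write(self, key, value):
        self.log.append((key, value))
        self.data[key] = value

class Follower:
    def __init__(self, leader):
        self.leader = leader
        self.applied = 0   # log position replicated so far
        self.data = {}

    def replicate(self):   # pull and apply new entries from the leader's log
        for key, value in self.leader.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.leader.log)

leader = Leader()
follower = Follower(leader)
leader.write("user:1", "alice")
stale = follower.data.get("user:1")   # None: replication lag, write not applied yet
follower.replicate()
fresh = follower.data.get("user:1")   # "alice" after catching up
```

The gap between `write` and `replicate` is exactly the replication-lag window; failover means promoting the follower with the highest `applied` position.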
Synchronous vs Asynchronous replication:
| Aspect | Synchronous | Asynchronous |
|---|---|---|
| Write confirmed after | Leader + replica acknowledge | Leader acknowledges only |
| Durability | Higher (data on 2+ nodes) | Lower (could lose recent writes) |
| Latency | Higher (wait for replica) | Lower |
| Common in | Financial systems, Spanner | Most systems (MySQL, PG default) |
### Multi-Leader
```
Leader A (Region US) ↔ Leader B (Region EU)
```
- Each region has its own leader
- Write to any leader, replicate to others
- Problem: write conflicts when same data modified in two regions
- Conflict resolution: Last Write Wins (LWW), merge, custom logic
- Used by: CouchDB, DynamoDB Global Tables (note: Cassandra is leaderless, not multi-leader)
### Leaderless (Dynamo-style)
```
Client → Write to all N replicas, wait for W acks
Client → Read from R replicas, resolve conflicts
```
- No single leader — any node accepts writes
- Quorum: W + R > N ensures overlap
- N=3, W=2, R=2 → every read overlaps at least one replica holding the latest write
- Anti-entropy: background repair of inconsistencies
- Used by: Cassandra, Riak, DynamoDB
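The quorum condition can be checked exhaustively for small N; a sketch showing why W + R > N guarantees overlap:

```python
from itertools import combinations

def quorum_overlaps(n, w, r):
    """True if every possible W-replica write set intersects
    every possible R-replica read set, out of n replicas."""
    replicas = range(n)
    return all(set(ws) & set(rs)
               for ws in combinations(replicas, w)
               for rs in combinations(replicas, r))

ok = quorum_overlaps(3, 2, 2)   # True: W + R = 4 > N = 3, sets must overlap
bad = quorum_overlaps(3, 1, 1)  # False: W + R = 2 <= N, a read can miss the write
```

This is the pigeonhole argument behind Dynamo-style quorums: if W + R > N, the write set and read set together exceed N replicas, so they must share at least one.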
## Sharding (Partitioning)
### Why Shard?
When data exceeds single machine capacity (storage/throughput), split it across multiple machines.
### Horizontal Sharding Strategies
1. Range-based sharding:
   ```
   Shard A: users A-M
   Shard B: users N-Z
   ```
   - Good for range queries
   - Risk: hotspots (if some ranges get more traffic)
2. Hash-based sharding:
   ```
   Shard = hash(user_id) % num_shards
   ```
   - Even distribution
   - Can't do range queries efficiently
   - Adding/removing shards requires rehashing (use consistent hashing)
3. Directory-based sharding:
   ```
   Lookup service: user_id → shard_id
   ```
   - Most flexible
   - Lookup service is a single point of failure
4. Geographic sharding:
   ```
   EU data → EU shard
   US data → US shard
   ```
   - Data locality for compliance (GDPR) and latency
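The consistent-hashing fix for hash-based sharding can be sketched as a ring: keys and shards hash onto the same space, and a key belongs to the first shard clockwise. MD5 is used only for illustration; real systems also place many virtual nodes per shard for smoother balance:

```python
import hashlib
from bisect import bisect

def ring_hash(s):
    # Any stable hash works; MD5 here purely for illustration.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, shards):
        self.points = sorted((ring_hash(s), s) for s in shards)

    def shard_for(self, key):
        hashes = [h for h, _ in self.points]
        i = bisect(hashes, ring_hash(key)) % len(self.points)  # first point clockwise
        return self.points[i][1]

ring = Ring(["shard-a", "shard-b", "shard-c"])
before = {k: ring.shard_for(k) for k in ("user:1", "user:2", "user:3")}

# Add a shard: only keys falling on the new shard's arc move,
# unlike hash(key) % num_shards, which reshuffles almost everything.
ring = Ring(["shard-a", "shard-b", "shard-c", "shard-d"])
after = {k: ring.shard_for(k) for k in before}
moved = [k for k in before if before[k] != after[k]]
```

Every key either keeps its old shard or moves to the newly added one; no key is reshuffled between the pre-existing shards.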
### Sharding Challenges
| Challenge | Description | Solution |
|---|---|---|
| Cross-shard queries | JOINs across shards are expensive | Denormalize, application-level joins |
| Cross-shard transactions | Distributed transactions are hard | Saga pattern, eventual consistency |
| Hotspots | One shard gets disproportionate traffic | Better shard key, further splitting |
| Rebalancing | Adding/removing shards | Consistent hashing, virtual shards |
| Referential integrity | Foreign keys across shards | Application-level enforcement |
| Secondary indexes | Index spans multiple shards | Local indexes + scatter-gather |
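The local-index + scatter-gather row above, sketched with plain dicts standing in for each shard's local secondary index (all structures hypothetical):

```python
# Each shard indexes only its own rows, so a query on the indexed
# column must be sent to every shard and the partial results merged.
shards = [
    {"index_by_city": {"Paris": [{"id": 1, "city": "Paris"}]}},
    {"index_by_city": {"Paris": [{"id": 7, "city": "Paris"}],
                       "Lyon":  [{"id": 4, "city": "Lyon"}]}},
]

def query_by_city(city):
    results = []
    for shard in shards:                            # scatter: ask every shard
        results.extend(shard["index_by_city"].get(city, []))
    return sorted(results, key=lambda r: r["id"])   # gather: merge partial results

paris_ids = [r["id"] for r in query_by_city("Paris")]
lyon_ids = [r["id"] for r in query_by_city("Lyon")]
```

The cost is one round trip per shard on every secondary-index query, which is why queries on the shard key itself (routable to a single shard) stay much cheaper.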
### Choosing a Shard Key
Good shard key properties:
- High cardinality (many unique values)
- Even distribution of data and traffic
- Used in most queries (avoids cross-shard queries)
- Doesn't create hotspots
Examples:
- user_id — good for user-centric apps
- order_id — good for order systems
- timestamp — BAD (all writes go to latest shard)
- country_code — BAD (US shard much larger than others)
## Read Replicas in Practice
### Scaling Reads
```
Write path: App → Leader
Read path:  App → Load Balancer → Replica 1 / Replica 2 / Replica 3
```
### Replication Lag Issues
- Reading your own writes: a user writes, then reads from a replica that hasn't replicated the write yet → sees stale data
  - Solution: read from the leader for recently-written data
- Monotonic reads: a user sees new data, then stale data on the next read (a different, more lagged replica)
  - Solution: sticky sessions (route each user to the same replica)
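A sketch of the read-your-own-writes fix: route a user's reads to the leader for a short window after each of their writes. The window constant and all names are hypothetical; in practice the window should exceed your observed replication lag:

```python
import time

RECENT_WRITE_WINDOW = 5.0   # seconds; tune to max observed replication lag
last_write_at = {}          # user_id -> timestamp of that user's last write

def choose_replica(user_id, now=None):
    now = time.time() if now is None else now
    if now - last_write_at.get(user_id, 0) < RECENT_WRITE_WINDOW:
        return "leader"         # their write may not have replicated yet
    return "read-replica"       # safe to serve from a (possibly lagging) replica

last_write_at["alice"] = 100.0
r1 = choose_replica("alice", now=101.0)  # wrote 1s ago -> leader
r2 = choose_replica("alice", now=110.0)  # lag window passed -> replica
r3 = choose_replica("bob", now=101.0)    # no recent write -> replica
```

Tracking the last-write timestamp per user (e.g. in a session cookie or cache) keeps almost all reads on the replicas while still hiding lag from the writer.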
## Connection Pooling
### Problem
DB connections are expensive (TCP + auth + session setup). Creating one per request = slow.
### Solution
```
App Server → Connection Pool (10-50 connections) → Database
```
- Pool size starting point: CPU cores × 2 + disk spindles (rule of thumb from HikariCP's pool-sizing guide)
- Popular: HikariCP (Java), pgBouncer (PostgreSQL), ProxySQL (MySQL)
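A toy pool illustrating the idea (not HikariCP or pgBouncer, just a queue of pre-opened connections):

```python
import queue

class ConnectionPool:
    def __init__(self, size, connect):
        self._pool = queue.Queue()
        for _ in range(size):        # pay the connection cost once, up front
            self._pool.put(connect())

    def acquire(self, timeout=1.0):
        return self._pool.get(timeout=timeout)  # blocks if pool is exhausted

    def release(self, conn):
        self._pool.put(conn)

# Stand-in for an expensive TCP + auth + session handshake.
counter = {"opened": 0}
def fake_connect():
    counter["opened"] += 1
    return f"conn-{counter['opened']}"

pool = ConnectionPool(size=3, connect=fake_connect)
for _ in range(100):                 # 100 "requests" reuse the same 3 connections
    conn = pool.acquire()
    pool.release(conn)
```

Despite serving 100 requests, only 3 connections are ever opened; a real pool adds health checks, max lifetime, and leak detection on top of this core loop.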
## Database Scaling Summary
| Technique | Scales | Complexity |
|---|---|---|
| Read replicas | Reads | Low |
| Caching (Redis) | Reads | Low-Medium |
| Vertical scaling | Everything (temporarily) | None |
| Sharding | Reads + Writes + Storage | High |
| CQRS | Reads (separate read model) | Medium |
| Denormalization | Reads | Medium |
## Resources
- 📖 DDIA Chapter 5: Replication
- 📖 DDIA Chapter 6: Partitioning
- 📖 "Database Internals" by Alex Petrov
- 🔗 Use The Index, Luke
- 🎥 Gaurav Sen — Sharding
- 🔗 Uber's Schemaless (sharding at scale)
- 🔗 Amazon DynamoDB paper
Previous: 06 - Caching | Next: 08 - Message Queues & Event Streaming