31 - Design Chat System

Series: System Design & Distributed Systems | Previous: 30 - Design Notification System | Next: 32 - Design News Feed


1. Requirements

Functional Requirements

Feature            | Details
-------------------|--------------------------------------
1:1 chat           | Real-time messaging between two users
Group chat         | Up to 500 members per group
Online presence    | Show online/offline/last-seen status
Read receipts      | Delivered and read indicators
Media sharing      | Images, video, files up to 100MB
Message history    | Persistent, searchable
Push notifications | For offline users

Non-Functional Requirements

  • Latency: < 100ms message delivery for online users
  • Availability: 99.99% uptime
  • Ordering: Messages appear in correct order per conversation
  • Durability: Zero message loss
  • Scale: 500M DAU, 20B messages/day (WhatsApp-like scale)

2. Capacity Estimation

DAU:               500M
Avg messages/user: 40/day
Total messages:    20B/day  ~  230K msg/sec
Avg message size:  100 bytes text
Storage/day:       20B x 100B = 2TB text
Media messages:    ~5% = 1B/day x 200KB avg = 200TB/day

Resource           | Estimate
-------------------|---------------------------
Write QPS          | ~230K msg/sec
Peak QPS           | ~460K msg/sec (2x average)
Text storage/year  | ~730TB
Media storage/year | ~73PB
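
The arithmetic behind these estimates can be checked with a short script. This is a sketch only: the 2x peak factor and the use of decimal units (1 TB = 10^12 bytes) are assumptions, and the constant names are illustrative.

```python
# Back-of-envelope capacity check using the figures from the estimate above.
DAU = 500_000_000
MSGS_PER_USER_PER_DAY = 40
AVG_TEXT_BYTES = 100
MEDIA_PERCENT = 5               # ~5% of messages carry media
AVG_MEDIA_BYTES = 200_000       # 200KB, decimal units (assumption)
PEAK_FACTOR = 2                 # assumption: peak traffic is 2x the average
SECONDS_PER_DAY = 86_400
TB = 10**12
PB = 10**15

messages_per_day = DAU * MSGS_PER_USER_PER_DAY
avg_qps = messages_per_day / SECONDS_PER_DAY
peak_qps = avg_qps * PEAK_FACTOR

text_bytes_per_day = messages_per_day * AVG_TEXT_BYTES
media_msgs_per_day = messages_per_day * MEDIA_PERCENT // 100
media_bytes_per_day = media_msgs_per_day * AVG_MEDIA_BYTES

print(f"messages/day:       {messages_per_day:,}")
print(f"average QPS:        {avg_qps:,.0f}")
print(f"peak QPS:           {peak_qps:,.0f}")
print(f"text storage/day:   {text_bytes_per_day / TB:.0f} TB")
print(f"text storage/year:  {text_bytes_per_day * 365 / TB:.0f} TB")
print(f"media storage/day:  {media_bytes_per_day / TB:.0f} TB")
print(f"media storage/year: {media_bytes_per_day * 365 / PB:.0f} PB")
```

Media dominates storage by two orders of magnitude, which is why media goes to an object store rather than the message database.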

3. High-Level Architecture

                        +-------------------+
                        |   Load Balancer   |
                        +--------+----------+
                                 |
                  +--------------+--------------+
                  |              |              |
           +------+------+ +----+-----+ +-----+------+
           | API Gateway | | API GW 2 | | API GW N   |
           +------+------+ +----------+ +------------+
                  |
     +------------+-------------+
     |            |             |
+----+----+ +----+-----+ +----+------+
|  Chat   | | Presence | |  Media    |
| Service | | Service  | |  Service  |
+----+----+ +----+-----+ +----+------+
     |            |             |
     |    +-------+-------+    |
     |    |  User Session  |   |
     |    |   Registry     |   |
     |    +----------------+   |
     |                         |
+----+-------------------------+----+
|         Message Queue             |
|     (Kafka / RabbitMQ)            |
+----+-----+----+---+----+---------+
     |     |    |   |    |
  +--+--+  | +--+--+|  +-+--------+
  |Store|  | |Push  ||  | Fan-out  |
  |Svc  |  | |Notif ||  | Service  |
  +--+--+  | +------+|  +----------+
     |     |          |
+----+-----+----------+----+
|    Message Database       |     +------------------+
|  (Cassandra / HBase)     |     |   Object Store   |
+---------------------------+     |   (S3 / GCS)     |
                                  +------------------+

4. Communication Protocol: WebSocket

Why WebSocket over HTTP?

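Feature             | HTTP Polling         | Long Polling | WebSocket
--------------------|----------------------|--------------|----------------
Latency             | High (poll interval) | Medium       | Low (real-time)
Server load         | Very high            | Medium       | Low
Bidirectional       | No                   | No           | Yes
Connection overhead | Per request          | Per timeout  | Once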

Connection Flow

Client                          Server
  |--- HTTP Upgrade Request ------>|
  |<--- 101 Switching Protocols ---|
  |                                |
  |<====== Bidirectional WS ======>|
  |  send message                  |
  |  receive message               |
  |  presence updates              |
  |  typing indicators             |
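
To make the upgrade step concrete, here is a minimal sketch of how a server derives the Sec-WebSocket-Accept header it returns with the 101 response, following RFC 6455. The function name is illustrative; a real server would use a WebSocket library rather than hand-rolling the handshake.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key/accept pair taken from RFC 6455, section 1.3.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client verifies this value before treating the connection as upgraded, which guards against intermediaries that do not understand WebSocket.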

Interview Tip: Mention that HTTP is still used for non-real-time operations such as login, profile updates, and media uploads. The WebSocket connection carries only real-time traffic: chat messages, presence updates, and typing indicators.


5. Message Flow: Send, Store, Deliver, Acknowledge

Sender             Chat Server            Database          Receiver
  |                     |                     |                |
  |-- 1. Send msg ----->|                     |                |
  |                     |-- 2. Store msg ---->|                |
  |                     |<-- 3. ACK stored ---|                |
  |<-- 4. ACK sent -----|                     |                |
  |                     |                     |                |
  |                     |-- 5. Route to receiver's server ---->|
  |                     |                     |                |
  |                     |                 6. Is receiver online?
  |                     |                     |                |
  |                     |  [ONLINE] --------->|-- 7. Push WS ->|
  |                     |                     |                |
  |                     |  [OFFLINE] -------->| 8. Push Notif  |
  |                     |                     |                |
  |                     |<-------- 9. Delivered ACK -----------|
  |                     |                     |                |
  |<-- 10. Delivered ---|                     |                |
  |                     |<-------- 11. Read ACK ---------------|
  |<-- 12. Read --------|                     |                |

Message States

SENT --> DELIVERED --> READ
  |          |
  +-> FAILED +-> EXPIRED (for ephemeral)
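
The state diagram above can be enforced as a small transition table. This is a sketch under the assumption that READ, FAILED, and EXPIRED are terminal; the names `MessageState`, `ALLOWED_TRANSITIONS`, and `advance` are illustrative.

```python
from enum import Enum


class MessageState(Enum):
    SENT = "sent"
    DELIVERED = "delivered"
    READ = "read"
    FAILED = "failed"
    EXPIRED = "expired"   # ephemeral messages only


# Edges copied from the diagram; terminal states have no outgoing edges.
ALLOWED_TRANSITIONS = {
    MessageState.SENT: {MessageState.DELIVERED, MessageState.FAILED},
    MessageState.DELIVERED: {MessageState.READ, MessageState.EXPIRED},
    MessageState.READ: set(),
    MessageState.FAILED: set(),
    MessageState.EXPIRED: set(),
}


def advance(current: MessageState, target: MessageState) -> MessageState:
    """Return the new state, or raise if the diagram has no such edge."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


state = MessageState.SENT
state = advance(state, MessageState.DELIVERED)   # delivered ACK (step 9)
state = advance(state, MessageState.READ)        # read ACK (step 11)
```

Rejecting illegal transitions keeps a late or duplicated ACK (for example, a delivered ACK arriving after a read ACK) from moving a message backwards.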

6. Chat Server Architecture

User Session Registry

Tracks which chat server each user is connected to.

+---------------------------+
| User Session Registry     |
| (Redis Cluster)           |
+---------------------------+
| user_id | server_id | ws  |
|---------|-----------|-----|
| U1      | CS-3      | ws7 |
| U2      | CS-1      | ws2 |
| U3      | CS-3      | ws9 |
+---------------------------+

When a message arrives for U2, the system looks up the registry to find U2 is on CS-1, then routes the message there.
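
A minimal sketch of that lookup, with an in-memory dict standing in for the Redis cluster. The class, its methods, and the "push-notification" fallback string are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple


class SessionRegistry:
    """In-memory stand-in for the Redis-backed registry: user_id -> (server_id, ws_id)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Tuple[str, str]] = {}

    def connect(self, user_id: str, server_id: str, ws_id: str) -> None:
        self._sessions[user_id] = (server_id, ws_id)

    def disconnect(self, user_id: str) -> None:
        self._sessions.pop(user_id, None)

    def lookup(self, user_id: str) -> Optional[Tuple[str, str]]:
        return self._sessions.get(user_id)


def route(registry: SessionRegistry, recipient_id: str) -> str:
    """Return where a message for recipient_id should go."""
    session = registry.lookup(recipient_id)
    if session is None:
        return "push-notification"      # recipient has no live connection
    server_id, ws_id = session
    return f"{server_id}/{ws_id}"


registry = SessionRegistry()
registry.connect("U1", "CS-3", "ws7")
registry.connect("U2", "CS-1", "ws2")
print(route(registry, "U2"))   # CS-1/ws2
print(route(registry, "U9"))   # push-notification
```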

Scaling Chat Servers

  • Each server holds ~50K-100K concurrent WebSocket connections
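  • Location-independent routing: any server can accept any user's connection, because the user-to-server mapping lives in the session registry rather than on the server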
  • Session registry in Redis provides location lookup
  • Consistent hashing or random assignment for initial connection
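
For the consistent-hashing option, a minimal ring with virtual nodes might look like the sketch below. The class name, the choice of SHA-256, and the count of 100 virtual nodes per server are assumptions.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class HashRing:
    """Consistent-hash ring mapping user IDs to chat servers."""

    def __init__(self, servers: Iterable[str], vnodes: int = 100) -> None:
        ring: List[Tuple[int, str]] = []
        for server in servers:
            # Several virtual nodes per server smooth out the load distribution.
            for i in range(vnodes):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._ring = ring
        self._keys = [h for h, _ in ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

    def server_for(self, user_id: str) -> str:
        # First virtual node clockwise from the user's hash, wrapping around.
        idx = bisect.bisect_right(self._keys, self._hash(user_id)) % len(self._keys)
        return self._ring[idx][1]


ring = HashRing(["CS-1", "CS-2", "CS-3"])
print(ring.server_for("U1"))
```

The point of the ring is that removing one server only reassigns the users who were on it; everyone else keeps their server, so a failure does not trigger a mass reconnect.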

7. Message Storage

Option A: Per-User Inbox (Write-Heavy)

inbox_user_123:
  { msg_id: M1, from: U456, text: "hello", ts: 1700000001 }
  { msg_id: M2, from: U789, text: "hey",   ts: 1700000005 }
  • Pros: Fast reads for a user's messages
  • Cons: Group messages duplicated N times (one per member)

Option B: Message Table + Conversation Index (Balanced)

messages table:
  msg_id | conversation_id | sender_id | content | timestamp
  -------|-----------------|-----------|---------|----------
  M1     | conv_AB         | A         | hello   | t1
  M2     | conv_AB         | B         | hi      | t2

conversation_index table:
  user_id | conversation_id | last_read_msg_id | unread_count
  --------|-----------------|------------------|------------
  A       | conv_AB         | M1               | 1
  B       | conv_AB         | M2               | 0
  • Pros: No duplication, efficient for group chat
  • Cons: Requires join or secondary index for user's view

Recommended: Hybrid

  • Use a wide-column store (Cassandra/HBase) partitioned by conversation_id
  • Partition key: conversation_id, clustering key: timestamp
  • Separate sync table per user for unread tracking
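
The hybrid layout can be modeled in memory to show the access pattern. This is a toy sketch, not a Cassandra client: the class and method names are hypothetical, timestamps are assumed to be integers, and unread counts ignore the user's own messages.

```python
import bisect
from collections import defaultdict


class MessageStore:
    """One partition per conversation_id, rows sorted by timestamp (clustering key),
    plus a per-user sync table holding the last-read timestamp."""

    def __init__(self):
        self._partitions = defaultdict(list)   # conversation_id -> [(ts, msg_id, sender_id, content)]
        self._last_read = defaultdict(int)     # (user_id, conversation_id) -> ts

    def append(self, conversation_id, ts, msg_id, sender_id, content):
        # insort keeps each partition ordered by the clustering key.
        bisect.insort(self._partitions[conversation_id], (ts, msg_id, sender_id, content))

    def history(self, conversation_id, since_ts=0):
        rows = self._partitions.get(conversation_id, [])
        # (since_ts,) sorts before any row with the same timestamp.
        return rows[bisect.bisect_left(rows, (since_ts,)):]

    def mark_read(self, user_id, conversation_id, ts):
        key = (user_id, conversation_id)
        self._last_read[key] = max(self._last_read[key], ts)

    def unread_count(self, user_id, conversation_id):
        last = self._last_read[(user_id, conversation_id)]
        newer = self.history(conversation_id, since_ts=last + 1)
        return sum(1 for _, _, sender, _ in newer if sender != user_id)


store = MessageStore()
store.append("conv_AB", 2, "M2", "B", "hi")
store.append("conv_AB", 1, "M1", "A", "hello")   # arrives late, still sorted
print([row[1] for row in store.history("conv_AB")])   # ['M1', 'M2']
print(store.unread_count("A", "conv_AB"))             # 1
```

Reading a conversation touches one partition in clustering order, while unread tracking never rewrites message rows.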

Interview Tip: WhatsApp uses custom Erlang + Mnesia. For interviews, Cassandra is the standard answer -- it handles high write throughput and time-series data well.


8. Group Chat Fan-Out

Small Groups (up to 500 members)

Sender --> Chat Server --> Fan-out Service
                              |
                  +-----------+-----------+
                  |           |           |
              Member 1    Member 2    Member N
              (online:    (online:    (offline:
               WS push)   WS push)   push notif)
  • Store message once in group conversation
  • Fan-out delivery to each member via their chat server
  • Push notification for offline members

Why Not Fan-Out on Write for Large Groups?

With up to 500 members per group, fan-out on write would store up to 500 inbox copies of every message, multiplying write load and storage. Instead:

  • Store once, fan-out on delivery only
  • Offline members pull on reconnect (lazy loading)
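
A sketch of fan-out on delivery: the message is stored once (not shown), and the fan-out service only decides how each member is notified. The function name and the shape of `online_sessions` (user_id mapped to chat server ID) are assumptions.

```python
from typing import Dict, List, Tuple


def plan_fan_out(
    msg_id: str,
    sender_id: str,
    members: List[str],
    online_sessions: Dict[str, str],
) -> Tuple[List[Tuple[str, str, str]], List[str]]:
    """Split group members into WebSocket pushes and push notifications.

    Returns (ws_pushes, offline_members), where each WebSocket push is a
    (server_id, user_id, msg_id) tuple.
    """
    ws_pushes: List[Tuple[str, str, str]] = []
    offline_members: List[str] = []
    for member in members:
        if member == sender_id:
            continue                      # the sender already has the message
        server_id = online_sessions.get(member)
        if server_id is not None:
            ws_pushes.append((server_id, member, msg_id))
        else:
            offline_members.append(member)   # will pull from the DB on reconnect
    return ws_pushes, offline_members


pushes, offline = plan_fan_out(
    "M42", "U1", ["U1", "U2", "U3", "U4"], {"U1": "CS-3", "U2": "CS-1", "U3": "CS-3"}
)
print(pushes)    # [('CS-1', 'U2', 'M42'), ('CS-3', 'U3', 'M42')]
print(offline)   # ['U4']
```

Only the message ID travels through the fan-out path, so the cost per member is a small routing record rather than a stored copy.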

9. Online Presence (Heartbeat)

Client --> Heartbeat every 5s --> Presence Service --> Redis
                                       |
                              +--------+--------+
                              |                 |
                        Set TTL = 30s      Pub/Sub to
                        in Redis           friends' clients

Presence States

State     | Condition
----------|---------------------------------------------------
Online    | Heartbeat received within TTL
Offline   | No heartbeat past TTL expiry
Away      | No app interaction > 5 min, heartbeat still active
Last seen | Timestamp of last heartbeat before going offline
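
The state rules above can be sketched with a small tracker. This is an in-memory illustration: in the design described here Redis key expiry would enforce the TTL, and the class name, the `interacted` flag, and the injectable clock are assumptions.

```python
import time

HEARTBEAT_TTL_S = 30      # TTL from the heartbeat diagram
AWAY_AFTER_S = 5 * 60     # "Away" threshold from the table


class PresenceTracker:
    def __init__(self, clock=time.monotonic):
        self._clock = clock               # injectable so the logic can be tested
        self._last_heartbeat = {}
        self._last_interaction = {}

    def heartbeat(self, user_id, interacted=False):
        """Record a heartbeat; interacted=True means the user touched the app."""
        now = self._clock()
        self._last_heartbeat[user_id] = now
        if interacted or user_id not in self._last_interaction:
            self._last_interaction[user_id] = now

    def status(self, user_id):
        now = self._clock()
        last = self._last_heartbeat.get(user_id)
        if last is None or now - last > HEARTBEAT_TTL_S:
            return "offline"
        if now - self._last_interaction[user_id] > AWAY_AFTER_S:
            return "away"
        return "online"

    def last_seen(self, user_id):
        """Timestamp of the most recent heartbeat, or None if never seen."""
        return self._last_heartbeat.get(user_id)


tracker = PresenceTracker()
tracker.heartbeat("U1", interacted=True)
print(tracker.status("U1"))   # online
```

With a 5s heartbeat and a 30s TTL, a client can miss several consecutive heartbeats before being marked offline, which avoids status flapping on a weak network.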

Optimization for Large Friend Lists

  • Don't push presence to all friends immediately
  • Use lazy evaluation: check presence when user opens a chat
  • For group chats: only fetch presence of visible members

Interview Tip: Facebook designed a presence system that batches updates and uses a fan-out budget -- each user's presence change fans out to at most N friends.


10. End-to-End Encryption (E2EE)

Alice                     Server                     Bob
  |                         |                          |
  |  Generate key pair      |     Generate key pair    |
  |  (public + private)     |     (public + private)   |
  |                         |                          |
  |-- Upload public key --->|<--- Upload public key ---|
  |                         |                          |
  |<-- Bob's public key ----|---- Alice's public key ->|
  |                         |                          |
  | Encrypt with            |          Decrypt with    |
  | Bob's public key        |          Bob's private   |
  |-- Encrypted msg ------->|--------- Encrypted msg ->|
  |                         |                          |
  | Server CANNOT read      |                          |
  • Signal Protocol (used by WhatsApp, Signal)
  • Double Ratchet Algorithm for forward secrecy
  • Server stores only ciphertext
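
To illustrate where forward secrecy comes from, here is a sketch of only the symmetric-key chain step of a Double Ratchet-style design, using HMAC-SHA256 with the constants 0x01 and 0x02 as the Double Ratchet specification suggests. It assumes both parties already share a 32-byte secret and omits key agreement, the Diffie-Hellman ratchet, and the encryption itself. It is a teaching aid, not a usable protocol.

```python
import hashlib
import hmac
from typing import List, Tuple


def ratchet_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Derive (next_chain_key, message_key) from the current chain key."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(chain_key: bytes, count: int) -> List[bytes]:
    """Advance the chain `count` times, yielding one key per message."""
    keys = []
    for _ in range(count):
        chain_key, message_key = ratchet_step(chain_key)
        keys.append(message_key)
    return keys


# Hypothetical shared secret; in practice it comes from a key agreement step.
shared_secret = bytes(32)
alice_keys = derive_message_keys(shared_secret, 3)
bob_keys = derive_message_keys(shared_secret, 3)
print(alice_keys == bob_keys)        # True: both sides derive the same keys
print(len(set(alice_keys)) == 3)     # True: every message gets a fresh key
```

Because HMAC is one-way, a device that is compromised after deleting old chain keys cannot recompute the keys of earlier messages.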

11. Message Ordering

Challenge

Messages from different senders may arrive out of order due to network delays.

Solution: Per-Conversation Sequence Numbers

conversation_id | sequence_num | msg_id | sender | timestamp
----------------|--------------|--------|--------|----------
conv_AB         | 1            | M1     | A      | t1
conv_AB         | 2            | M2     | B      | t2
conv_AB         | 3            | M3     | A      | t3
  • Each conversation has a monotonically increasing sequence counter
  • Chat server assigns sequence number atomically
  • Client sorts by sequence number, not timestamp
  • For 1:1 chats, server-assigned timestamps are usually sufficient, provided server clocks are kept closely synchronized
  • For group chats, use a centralized sequencer per conversation
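
A minimal sketch of a per-conversation sequencer, using an in-process lock for atomicity. In a distributed deployment the counter would live in a shared store (an atomic increment such as Redis INCR is one option); the class and method names are illustrative.

```python
import threading
from collections import defaultdict


class ConversationSequencer:
    """Hands out monotonically increasing sequence numbers per conversation."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counters = defaultdict(int)

    def next_seq(self, conversation_id: str) -> int:
        # The lock makes read-increment-return atomic across threads.
        with self._lock:
            self._counters[conversation_id] += 1
            return self._counters[conversation_id]


sequencer = ConversationSequencer()
# Messages arrive at the server in this order, whatever the senders' clocks say.
received = [
    {"msg_id": "M1", "sender": "A"},
    {"msg_id": "M2", "sender": "B"},
    {"msg_id": "M3", "sender": "A"},
]
for msg in received:
    msg["seq"] = sequencer.next_seq("conv_AB")

# A client that receives them shuffled restores order by sequence number.
shuffled = [received[2], received[0], received[1]]
ordered = sorted(shuffled, key=lambda m: m["seq"])
print([m["msg_id"] for m in ordered])   # ['M1', 'M2', 'M3']
```

Gaps in the sequence also give the client a cheap way to detect a missing message and request it.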

12. Push Notifications for Offline Users

Chat Server --> Message Queue --> Push Notification Service
                                       |
                          +------------+------------+
                          |            |            |
                        APNs         FCM        Web Push
                       (iOS)      (Android)    (Browser)

Flow

  1. Chat server checks user session registry
  2. User not connected --> enqueue to push notification service
  3. Push service formats payload per platform
  4. Sends via APNs (Apple), FCM (Google), or Web Push
  5. On reconnect, client fetches undelivered messages from DB

Optimization

  • Batch notifications: "You have 5 new messages from Alice"
  • Rate limit: don't buzz for every message in an active group
  • Silent push for data sync, audible push for direct messages
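
The batching idea can be sketched as a function that collapses one recipient's pending messages into one notification per sender. The function name and the (sender, text) input shape are assumptions.

```python
from typing import List, Tuple


def batch_notifications(pending: List[Tuple[str, str]]) -> List[str]:
    """Collapse pending (sender_name, text) pairs into one line per sender."""
    grouped = {}
    for sender, text in pending:
        grouped.setdefault(sender, []).append(text)   # dicts keep first-seen order

    notifications = []
    for sender, texts in grouped.items():
        if len(texts) == 1:
            notifications.append(f"{sender}: {texts[0]}")
        else:
            notifications.append(f"You have {len(texts)} new messages from {sender}")
    return notifications


pending = [("Alice", "hi")] * 5 + [("Bob", "lunch?")]
print(batch_notifications(pending))
# ['You have 5 new messages from Alice', 'Bob: lunch?']
```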

13. Media Sharing

Sender                  Media Service              S3/Blob Store
  |                          |                          |
  |-- Upload media --------->|                          |
  |                          |-- Store blob ----------->|
  |                          |<-- Return CDN URL -------|
  |<-- Media URL + metadata--|                          |
  |                          |                          |
  |-- Send chat message with media URL to Chat Server --|
  |       (normal message flow with media_url field)    |

Media Handling

Step       | Detail
-----------|-------------------------------------------------------
Upload     | Client uploads to Media Service via HTTP (not WebSocket)
Processing | Compress, generate thumbnail, virus scan
Storage    | S3 with CDN (CloudFront) for fast delivery
Message    | Chat message contains media_url, not the binary
Download   | Receiver fetches from CDN using the URL
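
A sketch of the chat message that results from this flow: it carries a reference to the uploaded media, never the bytes. The field names, the example URL, and the reading of 100MB as 100 x 10^6 bytes are assumptions.

```python
MAX_MEDIA_BYTES = 100 * 10**6   # 100MB limit from the requirements (decimal MB assumed)


def build_media_message(conversation_id, sender_id, media_url, size_bytes, content_type):
    """Build the chat payload sent over WebSocket after the HTTP upload succeeds."""
    if size_bytes > MAX_MEDIA_BYTES:
        raise ValueError(f"media is {size_bytes} bytes; limit is {MAX_MEDIA_BYTES}")
    return {
        "conversation_id": conversation_id,
        "sender_id": sender_id,
        "type": "media",
        "media_url": media_url,          # CDN URL returned by the Media Service
        "media_size": size_bytes,
        "content_type": content_type,
    }


message = build_media_message(
    "conv_AB", "A", "https://cdn.example.com/media/abc123.jpg", 2_400_000, "image/jpeg"
)
print(message["media_url"])
```

Keeping the binary out of the message means the WebSocket path handles small, uniform payloads regardless of how large the attachment is.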

Optimization

  • Client-side compression before upload
  • Resumable uploads for large files (tus protocol)
  • Progressive image loading (blur placeholder -> full image)

14. Message Search

Architecture

Messages DB --> Change Data Capture --> Elasticsearch
                (Debezium / CDC)

Search Index Design

  • Index by: conversation_id, sender_id, content, timestamp
  • Full-text search on message content
  • Filter by conversation, date range, sender
  • E2EE caveat: server cannot index encrypted messages -- search happens client-side for E2EE chats

Search Flow

  1. User types query in search bar
  2. API call to Search Service
  3. Query Elasticsearch with filters
  4. Return matching messages with conversation context
  5. Client navigates to the message in conversation view
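
Step 3 of the flow might build a request body like the one in this sketch, using the Elasticsearch query DSL (a bool query with a full-text match plus term and range filters). The function name is hypothetical, and the field names are assumed to match the index design above.

```python
def build_search_query(text, conversation_id=None, sender_id=None, start_ts=None, end_ts=None):
    """Build an Elasticsearch request body for a message search."""
    filters = []
    if conversation_id is not None:
        filters.append({"term": {"conversation_id": conversation_id}})
    if sender_id is not None:
        filters.append({"term": {"sender_id": sender_id}})
    if start_ts is not None or end_ts is not None:
        bounds = {}
        if start_ts is not None:
            bounds["gte"] = start_ts
        if end_ts is not None:
            bounds["lte"] = end_ts
        filters.append({"range": {"timestamp": bounds}})

    return {
        "query": {
            "bool": {
                "must": [{"match": {"content": text}}],   # scored full-text match
                "filter": filters,                        # exact, unscored filters
            }
        },
        "sort": [{"timestamp": {"order": "desc"}}],       # newest first
    }


body = build_search_query("dinner plans", conversation_id="conv_AB", start_ts=1700000000)
print(body["query"]["bool"]["filter"])
```

Putting conversation, sender, and date constraints in the filter clause keeps them out of relevance scoring and lets Elasticsearch cache them.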

15. Complete System Diagram

+-------+    +-------+    +-------+
|Client |    |Client |    |Client |
|  (WS) |    |  (WS) |    |  (WS) |
+---+---+    +---+---+    +---+---+
    |            |            |
    +------+-----+-----+-----+
           |           |
    +------+------+    |
    |   L4 Load   |    |
    |  Balancer   |    |
    +------+------+    |
           |           |
  +--------+--------+  |      +------------------+
  | Chat Server     |  |      | API Gateway      |
  | Cluster (WS)    |  |      | (HTTP REST)      |
  +---+----+----+---+  |      +---+----+---------+
      |    |    |       |          |    |
      |    |    +-------+----------+    |
      |    |            |               |
 +----+----+---+  +-----+-------+  +---+----------+
 | User Session|  | Presence    |  | Media        |
 | Registry    |  | Service     |  | Service      |
 | (Redis)     |  | (Redis+Pub) |  | (HTTP upload)|
 +---------+---+  +------+------+  +---+----------+
           |             |             |
     +-----+----+   +---+---+    +----+-----+
     | Kafka /  |   | Redis |    | S3 + CDN |
     | Msg Queue|   +-------+    +----------+
     +---+------+
         |
  +------+------+------+
  |      |      |      |
+-+-+ +--+-+ +--+-+ +--+--+
|DB | |Push| |Fan | |Search|
|Svc| |Ntf | |Out | |  Svc |
+---+ +----+ +----+ +--+--+
  |                     |
+-+-----------+    +----+--------+
| Cassandra / |    |Elasticsearch|
| HBase       |    +-------------+
+-------------+

16. Interview Tips

  1. Start with requirements: Clarify 1:1 vs group, scale, E2EE needs
  2. WebSocket is the answer for real-time: but explain HTTP for other operations
  3. Message ordering: Per-conversation sequence numbers, not global clocks
  4. Storage choice matters: Cassandra/HBase for high-write chat workloads, not MySQL
  5. Fan-out strategy: Distinguish between small groups (fan-out on delivery) and large groups (lazy pull)
  6. Don't forget: Read receipts, typing indicators, presence -- these are differentiators
  7. E2EE is a trade-off: Server-side search becomes impossible

17. Resources

  • Alex Xu - System Design Interview Vol. 1, Chapter 12: Design a Chat System
  • WhatsApp Architecture: Erlang + Mnesia + FreeBSD tuning
  • Facebook Messenger Architecture (2015 engineering blog)
  • Signal Protocol documentation (signal.org/docs)
  • Discord Engineering Blog: How Discord Stores Billions of Messages
  • Martin Kleppmann - Designing Data-Intensive Applications, Chapter 11 (Stream Processing)
