37 - Design Dropbox or Google Drive

Series: System Design & Distributed Systems Previous: 36 - Design Twitter | Next: 38 - Design Web Crawler


1. Requirements

Functional Requirements

| Feature | Details |
|---|---|
| Upload files | Any file type, up to 10GB |
| Download files | Retrieve files from any device |
| Sync across devices | Automatic sync when files change |
| Share files/folders | Share via link or direct access |
| File versioning | Access previous versions, rollback |
| Offline support | Edit locally, sync when online |
| Notifications | Alert collaborators on changes |

Non-Functional Requirements

  • Reliability: Zero data loss (durability > 99.999999999%, i.e. 11 nines)
  • Availability: 99.99%
  • Scale: 500M users, 1B files/day uploaded
  • Bandwidth: Minimize data transfer (delta sync)
  • Latency: Sync within seconds for small files
  • Consistency: No silent data overwrites (conflict detection)

2. Capacity Estimation

Registered users:       500M
DAU:                    100M
Avg files per user:     200
Total files:            100B
Avg file size:          500KB
Total storage:          100B x 500KB = 50PB
Uploads/day:            1B files = ~12K uploads/sec
Avg upload size:        500KB
Upload bandwidth:       12K/sec x 500KB = 6GB/sec = 48Gbps
Sync notifications:     ~10x uploads = 120K events/sec
| Resource | Estimate |
|---|---|
| Upload QPS | ~12K/sec |
| Storage total | ~50PB |
| Storage growth/day | ~500TB |
| Sync events/sec | ~120K |
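
The estimates above can be sanity-checked with quick arithmetic (decimal units; the table rounds 11.6K up to 12K and 46Gbps up to 48Gbps):

```python
users = 500_000_000
files_per_user = 200
avg_file_size = 500_000                        # 500KB, decimal units

total_files = users * files_per_user           # 100B files
total_storage = total_files * avg_file_size    # 5e16 bytes = 50PB

uploads_per_day = 1_000_000_000
upload_qps = uploads_per_day / 86_400          # ~11.6K uploads/sec
upload_gbps = upload_qps * avg_file_size * 8 / 1e9  # ~46Gbps
```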

3. High-Level Architecture

+--------+                              +--------+
| Desktop|                              | Mobile |
| Client |                              | Client |
+---+----+                              +---+----+
    |                                       |
    +----------+------- API GW ----+--------+
               |                   |
        +------+------+    +------+------+
        | Upload/     |    | Sync        |
        | Download    |    | Service     |
        | Service     |    | (WebSocket/ |
        +------+------+    |  Long Poll) |
               |           +------+------+
               |                  |
        +------+------+    +------+------+
        | Block       |    | Notification|
        | Server      |    | Service     |
        +------+------+    +------+------+
               |                  |
    +----------+----------+       |
    |          |          |       |
+---+---+ +---+---+ +----+---+   |
| Block | | Meta  | | Version |   |
| Store | | DB    | | Store   |   |
| (S3)  | |(MySQL)| |(MySQL)  |   |
+-------+ +-------+ +---------+   |
                                   |
                           +-------+------+
                           | Message      |
                           | Queue(Kafka) |
                           +--------------+

4. File Chunking

Why Chunk Files?

Problem: Uploading a 1GB file fails at 900MB.
         Must restart entire upload.

Solution: Split into 4MB chunks.
          Upload chunks independently.
          Resume from last successful chunk.

Chunking Process

Original File (16MB)
  |
  v
Split into 4MB chunks:
  +--------+--------+--------+--------+
  | Chunk1 | Chunk2 | Chunk3 | Chunk4 |
  | 4MB    | 4MB    | 4MB    | 4MB    |
  | SHA256 | SHA256 | SHA256 | SHA256 |
  | =abc1  | =def2  | =ghi3  | =jkl4  |
  +--------+--------+--------+--------+

Each chunk:
  - Has a unique hash (SHA-256 of content)
  - Is uploaded independently
  - Can be retried on failure
  - Is deduplicated across all users
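
A minimal sketch of this chunking step in Python. The `CHUNK_SIZE` constant and the dict layout mirror the metadata example below; everything else is illustrative, not a production implementation:

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB, the standard chunk size

def chunk_file(path):
    """Split a file into fixed-size chunks, hashing each with SHA-256.

    Each chunk entry is independently uploadable/retryable; the hash
    doubles as the chunk's identity for dedup.
    """
    chunks = []
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            chunks.append({
                "index": index,
                "hash": hashlib.sha256(data).hexdigest(),
                "size": len(data),
            })
            index += 1
    return chunks
```

Note the last chunk may be smaller than 4MB; only its actual size is recorded.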

Chunk Metadata

```json
{
  "file_id": "f_12345",
  "file_name": "report.pdf",
  "file_size": 16777216,
  "chunks": [
    { "index": 0, "hash": "abc1...", "size": 4194304 },
    { "index": 1, "hash": "def2...", "size": 4194304 },
    { "index": 2, "hash": "ghi3...", "size": 4194304 },
    { "index": 3, "hash": "jkl4...", "size": 4194304 }
  ],
  "version": 3,
  "modified_at": "2024-01-15T14:30:00Z"
}
```

Interview Tip: The chunk size is a trade-off. Smaller chunks = better dedup and resumability but more metadata overhead. 4MB is the standard answer (Dropbox uses 4MB).


5. Deduplication

Content-Based Hashing

User A uploads file: "quarterly_report.pdf"
  SHA-256 of chunks: [abc1, def2, ghi3, jkl4]

User B uploads identical file: "Q4_report.pdf"
  SHA-256 of chunks: [abc1, def2, ghi3, jkl4]

Block store already has abc1, def2, ghi3, jkl4
  --> Skip upload for User B, just add metadata reference!

Dedup Flow

Client                  Block Server              Block Store (S3)
  |                          |                          |
  |-- "I have chunk abc1" -->|                          |
  |                          |-- Check: does abc1 exist?|
  |                          |<-- YES, already stored --|
  |<-- "Skip, already have"--|                          |
  |                          |                          |
  |-- "I have chunk xyz9" -->|                          |
  |                          |-- Check: does xyz9 exist?|
  |                          |<-- NO                    |
  |-- Upload chunk xyz9 ---->|                          |
  |                          |-- Store xyz9 ----------->|
  |<-- "Stored" -------------|                          |
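
The Block Server's side of this exchange reduces to a set-membership filter. A sketch (here `stored_hashes` stands in for a lookup against the `chunks` table; in production this would be a batched index query, not an in-memory set):

```python
def chunks_to_upload(client_hashes, stored_hashes):
    """Return only the chunk hashes the block store doesn't have yet.

    The client then uploads just these; every other chunk becomes a
    metadata reference to the existing copy (global dedup).
    """
    return [h for h in client_hashes if h not in stored_hashes]
```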

Dedup Savings

  • Dropbox reported 60%+ storage savings from dedup
  • Especially effective for:
    • Multiple users sharing same documents
    • Small edits to large files (most chunks unchanged)
    • Common libraries, installers, media files

6. Sync Conflict Resolution

Conflict Detection

User A (Laptop)              Server              User B (Desktop)
  |                            |                        |
  |-- Edit file.txt v3 ------>|                        |
  |   (base: v2)              |-- v3 stored            |
  |                            |                        |
  |                            |<-- Edit file.txt ------|
  |                            |    (base: v2)          |
  |                            |                        |
  |                            |  CONFLICT: two edits   |
  |                            |  based on same v2      |

Resolution Strategies

| Strategy | Description | Used By |
|---|---|---|
| Last writer wins | Latest timestamp overwrites | Simple but lossy |
| Keep both | Save conflicted copy with suffix | Dropbox |
| Merge | Attempt automatic merge (text files) | Google Docs |
| Lock | Prevent concurrent edits | Enterprise systems |

Dropbox Approach

1. Server accepts first write (User A's edit) as v3
2. Server detects conflict for User B's edit (also based on v2)
3. Server saves User B's version as "file (User B's conflicted copy).txt"
4. Both users are notified of the conflict
5. Manual resolution by users

Conflict-Free for Most Cases

Optimistic locking with version numbers:

  Client sends: { file_id: "f_123", base_version: 2, new_chunks: [...] }

  Server checks:
    current_version == 2? --> Accept, create v3
    current_version > 2?  --> Conflict! Reject, client must pull latest first
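
This optimistic-locking check can be sketched with a small in-memory version store (a stand-in for the metadata DB; the response shapes are illustrative):

```python
class FileVersionStore:
    """Minimal server-side version check: accept a write only if the
    client's base_version matches the server's current version."""

    def __init__(self):
        self.versions = {}  # file_id -> current version (0 = not yet created)

    def commit(self, file_id, base_version):
        current = self.versions.get(file_id, 0)
        if base_version != current:
            # Client edited a stale version; it must pull the latest first
            # (or the server saves its upload as a conflicted copy).
            return {"status": "conflict", "current_version": current}
        self.versions[file_id] = current + 1
        return {"status": "accepted", "new_version": current + 1}
```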

7. File Metadata Database

Schema

```sql
CREATE TABLE users (
  user_id     BIGINT PRIMARY KEY,
  email       VARCHAR(255) UNIQUE,
  quota_bytes BIGINT DEFAULT 2147483648,  -- 2GB free
  used_bytes  BIGINT DEFAULT 0
);

CREATE TABLE files (
  file_id     BIGINT PRIMARY KEY,
  user_id     BIGINT REFERENCES users,
  parent_id   BIGINT REFERENCES files(file_id),  -- folder hierarchy
  name        VARCHAR(255),
  is_folder   BOOLEAN,
  size_bytes  BIGINT,
  mime_type   VARCHAR(100),
  current_ver INT,
  created_at  TIMESTAMP,
  modified_at TIMESTAMP,
  deleted_at  TIMESTAMP NULL,  -- soft delete (trash)
  INDEX idx_parent (user_id, parent_id, name)
);

CREATE TABLE file_versions (
  file_id      BIGINT,
  version      INT,
  chunk_hashes TEXT[],  -- ordered list of chunk hashes
  size_bytes   BIGINT,
  created_by   BIGINT,
  created_at   TIMESTAMP,
  PRIMARY KEY (file_id, version)
);

CREATE TABLE chunks (
  chunk_hash CHAR(64) PRIMARY KEY,  -- SHA-256
  s3_path    VARCHAR(500),
  size_bytes INT,
  ref_count  INT  -- number of file_versions referencing this chunk
);
```

Why Not Store Files in Database?

  • Databases optimized for structured data, not BLOBs
  • Object stores (S3) provide 11 nines durability
  • CDN integration works naturally with object storage
  • Cost: S3 = $0.023/GB/month vs EBS = $0.10/GB/month

8. Block Storage vs Object Storage

| Feature | Block Storage (EBS) | Object Storage (S3) |
|---|---|---|
| Access pattern | Random read/write | Whole-object read/write |
| Latency | Low (< 1ms) | Medium (10-100ms) |
| Durability | 99.999% | 99.999999999% (11 nines) |
| Cost | $0.10/GB/month | $0.023/GB/month |
| Scalability | Limited to volume size | Unlimited |
| Use case | Databases, OS disks | File storage, backups |

Decision: Use S3 for chunk storage (durable, cheap, scalable).


9. Notification Service for Changes

Long Polling vs WebSocket

Long Polling:
  Client --> Server: "Any changes since version 42?"
  Server holds connection open until:
    a) Change detected --> respond with changes
    b) Timeout (60s) --> respond with "no changes", client reconnects

WebSocket:
  Client <--> Server: persistent bidirectional connection
  Server pushes changes instantly

| Method | Pros | Cons |
|---|---|---|
| Long polling | Simple, firewall-friendly | Higher latency, more connections |
| WebSocket | Real-time, efficient | Stateful, harder to scale |
| SSE | Simple push, HTTP-based | Unidirectional |

Notification Flow

User A edits file --> Upload Service --> Metadata DB updated
                                              |
                                         Kafka event:
                                         { file_id, user_id, action: "modified" }
                                              |
                                              v
                                    +-------- +--------+
                                    |                  |
                              Sync Service        Push Notification
                              (WebSocket to        (APNs/FCM for
                               online clients)      offline clients)
                                    |
                              +-----+-----+
                              |           |
                         User B       User C
                         (online:     (offline:
                          WS push)    push notif)

Sync Protocol

Client                          Sync Service
  |                                  |
  |-- "My state: {file_id: v3}" --->|
  |                                  |
  |                            Compare with server state:
  |                            file_id is at v5 on server
  |                                  |
  |<-- "Update: file_id v3->v5" ----|
  |    { changed_chunks: [2, 4] }   |
  |                                  |
  |-- Download chunks 2, 4 -------->|
  |<-- Chunk data ------------------|
  |                                  |
  |-- "Ack: file_id now at v5" ---->|
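
The "compare with server state" step reduces to a diff over per-file version maps. A simplified sketch (real sync engines track per-namespace journals and cursors rather than full maps):

```python
def diff_state(client_state, server_state):
    """Compare per-file version maps.

    Returns (pull, push):
      pull - files where the server is newer, client must download
      push - files where the client is newer, client should upload
    """
    pull = {f: v for f, v in server_state.items()
            if v > client_state.get(f, 0)}
    push = {f: v for f, v in client_state.items()
            if v > server_state.get(f, 0)}
    return pull, push
```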

10. Bandwidth Optimization: Delta Sync

Problem

User edits 1 line in a 100MB file. Re-uploading 100MB wastes bandwidth.

Solution: Only Upload Changed Chunks

Version 2: [chunk_A, chunk_B, chunk_C, chunk_D]
                                  ^
                                  | (user edited this region)
Version 3: [chunk_A, chunk_B, chunk_C', chunk_D]

Only chunk_C' is new. Upload only chunk_C' (4MB instead of 100MB).
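
Finding the changed chunks is a positional comparison of the two hash lists (a sketch; it treats appended chunks as changed too):

```python
def changed_chunks(old_hashes, new_hashes):
    """Return indices of chunks whose hash differs from the previous
    version, or that are entirely new. Only these are re-uploaded."""
    changed = []
    for i, h in enumerate(new_hashes):
        if i >= len(old_hashes) or old_hashes[i] != h:
            changed.append(i)
    return changed
```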

rsync-Style Delta Compression (Advanced)

For changes within a single chunk:

Old chunk (4MB): [......AAAA...BBB......CCC.......]
New chunk (4MB): [......AAAA...BBX......CCC.......]
                                  ^ (1 byte changed)

Delta: { offset: 1234, old: "B", new: "X" }
Upload delta (tiny) instead of full 4MB chunk

Dropbox uses this approach for text-heavy files. The client computes a rolling hash (similar to rsync) to find changed regions within chunks.
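
An illustrative sketch of an rsync-style weak rolling checksum (the constants and function names here are made up for this example, not Dropbox's actual code). The point is the O(1) `roll` update: sliding the window one byte avoids re-hashing the whole block:

```python
MOD = 1 << 16

def weak_hash(block):
    """rsync-style weak checksum: two 16-bit sums packed into 32 bits."""
    a = sum(block) % MOD
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % MOD
    return (b << 16) | a

def roll(h, old_byte, new_byte, block_len):
    """Slide the window one byte to the right in O(1):
    drop old_byte from the front, add new_byte at the back."""
    a = h & 0xFFFF
    b = h >> 16
    a = (a - old_byte + new_byte) % MOD
    b = (b - block_len * old_byte + a) % MOD
    return (b << 16) | a
```

Matching weak hashes are then confirmed with a strong hash (e.g. SHA-256) before a region is treated as unchanged, since weak checksums can collide.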


11. Versioning and History

Version Chain

file.txt:
  v1 (Jan 1)  --> chunks: [A, B, C]
  v2 (Jan 5)  --> chunks: [A, B', C]     (B changed)
  v3 (Jan 10) --> chunks: [A, B', C, D]  (D added)
  v4 (Jan 15) --> chunks: [A, B'', C, D] (B changed again)

Storage Efficiency

  • Chunks are immutable and shared across versions
  • v1 and v2 share chunks A and C (no duplication)
  • Only changed/new chunks consume additional storage
  • Old versions can be garbage collected after retention period

Restore Flow

User requests: "Restore file.txt to v2"
  1. Look up file_versions for v2: chunks = [A, B', C]
  2. All chunks already in block store
  3. Update files.current_ver = 5 (new version pointing to v2 chunks)
  4. Sync new version to all devices
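
Because chunks are immutable, restore is pure metadata work. A sketch over an in-memory version map (standing in for the `file_versions` table):

```python
def restore_version(file_versions, current_ver, target_ver):
    """Restore by creating a NEW version that points at the old chunk
    list. No chunk data is copied and history is preserved (the user
    can still 'undo the restore' by restoring a later version)."""
    chunks = file_versions[target_ver]
    new_ver = current_ver + 1
    file_versions[new_ver] = chunks
    return new_ver
```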

12. Sharing Model

ACL-Based Sharing

```sql
CREATE TABLE file_shares (
  file_id     BIGINT,
  shared_with BIGINT,  -- user_id or group_id
  permission  ENUM('viewer', 'commenter', 'editor', 'owner'),
  created_at  TIMESTAMP,
  expires_at  TIMESTAMP NULL,
  PRIMARY KEY (file_id, shared_with)
);
```

Link Sharing

```sql
CREATE TABLE share_links (
  link_id    CHAR(32) PRIMARY KEY,  -- random token
  file_id    BIGINT,
  permission ENUM('viewer', 'editor'),
  password   VARCHAR(255) NULL,  -- optional password protection
  expires_at TIMESTAMP NULL,
  created_by BIGINT,
  view_count INT DEFAULT 0
);
```

Permission Inheritance

Folder A (shared with Bob: editor)
  |-- File 1.txt    (Bob inherits editor access)
  |-- Subfolder B   (Bob inherits editor access)
       |-- File 2.txt (Bob inherits editor access)
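
Inherited permissions can be resolved by walking up the folder tree until a direct share entry is found (a sketch; `parents` and `direct_shares` stand in for the `files.parent_id` column and the `file_shares` table):

```python
def effective_permission(file_id, parents, direct_shares, user_id):
    """Walk from the file up through its ancestor folders and return
    the first share entry found for user_id, or None if there is none.

    parents:       file_id -> parent_id (None at the root)
    direct_shares: (file_id, user_id) -> permission string
    """
    node = file_id
    while node is not None:
        perm = direct_shares.get((node, user_id))
        if perm is not None:
            return perm  # nearest ancestor's share wins
        node = parents.get(node)
    return None  # no access
```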

13. Offline Support

Client-Side Architecture

+----------------------------------+
|         Desktop Client           |
+----------------------------------+
| File Watcher (inotify/FSEvents)  |
|    - Detect local file changes   |
+----------------------------------+
| Local Metadata DB (SQLite)       |
|    - Track file states & versions|
+----------------------------------+
| Chunk Engine                     |
|    - Split, hash, delta compute  |
+----------------------------------+
| Sync Queue                       |
|    - Queue changes while offline |
|    - Replay when back online     |
+----------------------------------+
| Network Layer                    |
|    - WebSocket for notifications |
|    - HTTPS for upload/download   |
+----------------------------------+

Offline Workflow

1. User edits file offline
2. File watcher detects change
3. Chunk engine computes new chunks
4. Change queued in local sync queue
5. When network returns:
   a. Pull remote changes first (server state)
   b. Detect conflicts if any
   c. Push local changes
   d. Resolve conflicts

14. Complete System Diagram

+----------+    +----------+    +----------+
| Desktop  |    | Mobile   |    | Web      |
| Client   |    | Client   |    | Client   |
+----+-----+    +----+-----+    +----+-----+
     |               |               |
     +-------+-------+-------+-------+
             |               |
      +------+------+  +----+--------+
      | API Gateway |  | CDN         |
      | (Auth, Rate |  | (static,    |
      |  Limit)     |  |  thumbnails)|
      +------+------+  +-------------+
             |
    +--------+--------+---------+
    |        |        |         |
+---+---+ +--+---+ +--+----+ +-+-------+
|Upload | |Down- | |Sync   | |Share    |
|Svc    | |load  | |Svc    | |Svc      |
|       | |Svc   | |(WS)   | |         |
+---+---+ +--+---+ +--+----+ +---------+
    |         |        |
    v         v        v
+---+---------+--------+---+
|       Block Server       |
| (chunk dedup, routing)   |
+---+----------------------+
    |
+---+---+---+
|       |   |
v       v   v
+---+ +---+ +----------+
|S3 | |S3 | |S3        |
|R1 | |R2 | |(archive) |
+---+ +---+ +----------+

+------------------+    +------------------+
| Metadata DB      |    | Kafka            |
| (MySQL, sharded  |    | (sync events,    |
|  by user_id)     |    |  notifications)  |
+------------------+    +--------+---------+
                                 |
                        +--------+---------+
                        | Notification     |
                        | Service          |
                        | (WS + Push)      |
                        +------------------+

15. Trade-Off Summary

| Decision | Option A | Option B | Recommendation |
|---|---|---|---|
| Chunk size | 1MB (more dedup) | 4MB (less overhead) | 4MB standard |
| Dedup scope | Per-user | Global (all users) | Global (60%+ savings) |
| Conflict resolution | Last write wins | Keep both copies | Keep both (Dropbox) |
| Sync notification | Long polling | WebSocket | WebSocket for desktop, long poll for web |
| Storage | Block store | Object store (S3) | S3 for chunks |
| Delta sync | Full chunk re-upload | rsync delta | Delta for text, full for binary |

16. Interview Tips

  1. Chunking is the foundation: Everything else builds on splitting files into chunks
  2. Content-based hashing enables dedup: SHA-256 hash as chunk identifier
  3. Delta sync saves bandwidth: Only upload changed chunks, not the whole file
  4. Conflict resolution is essential: Show you understand optimistic locking + conflict copies
  5. Separate metadata from data: MySQL for metadata, S3 for chunk storage
  6. Notification model matters: WebSocket for real-time sync, long polling as fallback
  7. Versioning is cheap: Chunks are immutable, versions just reference different chunk lists
  8. Offline-first design: The client is a full local system with SQLite and sync queue

17. Resources

  • Alex Xu - System Design Interview Vol. 1, Chapter 15: Design Google Drive
  • Dropbox Engineering Blog: "How We've Scaled Dropbox" (2012)
  • "Building Dropbox's Sync Engine" (Dropbox Tech Blog)
  • "Delta Sync at Dropbox" - rsync-based approach
  • Google Drive Architecture (GCP documentation)
  • Martin Kleppmann - Designing Data-Intensive Applications, Chapter 5 (Replication)
  • tus protocol (tus.io) - Resumable uploads standard
