37 - Design Dropbox or Google Drive

Series: System Design & Distributed Systems Previous: 36 - Design Twitter | Next: 38 - Design Web Crawler


1. Requirements

Functional Requirements

| Feature | Details |
|---|---|
| Upload files | Any file type, up to 10GB |
| Download files | Retrieve files from any device |
| Sync across devices | Automatic sync when files change |
| Share files/folders | Share via link or direct access |
| File versioning | Access previous versions, rollback |
| Offline support | Edit locally, sync when online |
| Notifications | Alert collaborators on changes |

Non-Functional Requirements

  • Reliability: Zero data loss (durability > 99.999999999%, i.e. 11 nines)
  • Availability: 99.99%
  • Scale: 500M users, 1B files/day uploaded
  • Bandwidth: Minimize data transfer (delta sync)
  • Latency: Sync within seconds for small files
  • Consistency: No silent data overwrites (conflict detection)

2. Capacity Estimation

Registered users:       500M
DAU:                    100M
Avg files per user:     200
Total files:            100B
Avg file size:          500KB
Total storage:          100B x 500KB = 50PB
Uploads/day:            1B files = ~12K uploads/sec
Avg upload size:        500KB
Upload bandwidth:       12K/sec x 500KB = 6GB/sec = 48Gbps
Sync notifications:     ~10x uploads = 120K events/sec
| Resource | Estimate |
|---|---|
| Upload QPS | ~12K/sec |
| Storage total | ~50PB |
| Storage growth/day | ~500TB |
| Sync events/sec | ~120K |
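
The estimates above can be sanity-checked with quick arithmetic (decimal units; the table rounds 11.6K up to 12K and 46Gbps up to 48Gbps):

```python
users = 500_000_000
files_per_user = 200
avg_file_size = 500_000                        # 500KB, decimal units

total_files = users * files_per_user           # 100B files
total_storage = total_files * avg_file_size    # 5e16 bytes = 50PB

uploads_per_day = 1_000_000_000
upload_qps = uploads_per_day / 86_400          # ~11.6K uploads/sec
upload_gbps = upload_qps * avg_file_size * 8 / 1e9  # ~46Gbps
```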

3. High-Level Architecture

+--------+                              +--------+
| Desktop|                              | Mobile |
| Client |                              | Client |
+---+----+                              +---+----+
    |                                       |
    +----------+------- API GW ----+--------+
               |                   |
        +------+------+    +------+------+
        | Upload/     |    | Sync        |
        | Download    |    | Service     |
        | Service     |    | (WebSocket/ |
        +------+------+    |  Long Poll) |
               |           +------+------+
               |                  |
        +------+------+    +------+------+
        | Block       |    | Notification|
        | Server      |    | Service     |
        +------+------+    +------+------+
               |                  |
    +----------+----------+       |
    |          |          |       |
+---+---+ +---+---+ +----+---+   |
| Block | | Meta  | | Version |   |
| Store | | DB    | | Store   |   |
| (S3)  | |(MySQL)| |(MySQL)  |   |
+-------+ +-------+ +---------+   |
                                   |
                           +-------+------+
                           | Message      |
                           | Queue(Kafka) |
                           +--------------+

4. File Chunking

Why Chunk Files?

Problem: Uploading a 1GB file fails at 900MB.
         Must restart entire upload.

Solution: Split into 4MB chunks.
          Upload chunks independently.
          Resume from last successful chunk.

Chunking Process

Original File (16MB)
  |
  v
Split into 4MB chunks:
  +--------+--------+--------+--------+
  | Chunk1 | Chunk2 | Chunk3 | Chunk4 |
  | 4MB    | 4MB    | 4MB    | 4MB    |
  | SHA256 | SHA256 | SHA256 | SHA256 |
  | =abc1  | =def2  | =ghi3  | =jkl4  |
  +--------+--------+--------+--------+

Each chunk:
  - Has a unique hash (SHA-256 of content)
  - Is uploaded independently
  - Can be retried on failure
  - Is deduplicated across all users
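
A minimal sketch of this chunking step in Python. The `CHUNK_SIZE` constant and the dict layout mirror the metadata example below; everything else is illustrative, not a production implementation:

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB, the standard chunk size

def chunk_file(path):
    """Split a file into fixed-size chunks, hashing each with SHA-256.

    Each chunk entry is independently uploadable/retryable; the hash
    doubles as the chunk's identity for dedup.
    """
    chunks = []
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            chunks.append({
                "index": index,
                "hash": hashlib.sha256(data).hexdigest(),
                "size": len(data),
            })
            index += 1
    return chunks
```

Note the last chunk may be smaller than 4MB; only its actual size is recorded.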

Chunk Metadata

```json
{
  "file_id": "f_12345",
  "file_name": "report.pdf",
  "file_size": 16777216,
  "chunks": [
    { "index": 0, "hash": "abc1...", "size": 4194304 },
    { "index": 1, "hash": "def2...", "size": 4194304 },
    { "index": 2, "hash": "ghi3...", "size": 4194304 },
    { "index": 3, "hash": "jkl4...", "size": 4194304 }
  ],
  "version": 3,
  "modified_at": "2024-01-15T14:30:00Z"
}
```

Interview Tip: The chunk size is a trade-off. Smaller chunks = better dedup and resumability but more metadata overhead. 4MB is the standard answer (Dropbox uses 4MB).


5. Deduplication

Content-Based Hashing

User A uploads file: "quarterly_report.pdf"
  SHA-256 of chunks: [abc1, def2, ghi3, jkl4]

User B uploads identical file: "Q4_report.pdf"
  SHA-256 of chunks: [abc1, def2, ghi3, jkl4]

Block store already has abc1, def2, ghi3, jkl4
  --> Skip upload for User B, just add metadata reference!

Dedup Flow

Client                  Block Server              Block Store (S3)
  |                          |                          |
  |-- "I have chunk abc1" -->|                          |
  |                          |-- Check: does abc1 exist?|
  |                          |<-- YES, already stored --|
  |<-- "Skip, already have"--|                          |
  |                          |                          |
  |-- "I have chunk xyz9" -->|                          |
  |                          |-- Check: does xyz9 exist?|
  |                          |<-- NO                    |
  |-- Upload chunk xyz9 ---->|                          |
  |                          |-- Store xyz9 ----------->|
  |<-- "Stored" -------------|                          |
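
The Block Server's side of this exchange reduces to a set-membership filter. A sketch (here `stored_hashes` stands in for a lookup against the `chunks` table; in production this would be a batched index query, not an in-memory set):

```python
def chunks_to_upload(client_hashes, stored_hashes):
    """Return only the chunk hashes the block store doesn't have yet.

    The client then uploads just these; every other chunk becomes a
    metadata reference to the existing copy (global dedup).
    """
    return [h for h in client_hashes if h not in stored_hashes]
```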

Dedup Savings

  • Dropbox reported 60%+ storage savings from dedup
  • Especially effective for:
    • Multiple users sharing same documents
    • Small edits to large files (most chunks unchanged)
    • Common libraries, installers, media files

6. Sync Conflict Resolution

Conflict Detection

User A (Laptop)              Server              User B (Desktop)
  |                            |                        |
  |-- Edit file.txt v3 ------>|                        |
  |   (base: v2)              |-- v3 stored            |
  |                            |                        |
  |                            |<-- Edit file.txt ------|
  |                            |    (base: v2)          |
  |                            |                        |
  |                            |  CONFLICT: two edits   |
  |                            |  based on same v2      |

Resolution Strategies

| Strategy | Description | Used By |
|---|---|---|
| Last writer wins | Latest timestamp overwrites | Simple but lossy |
| Keep both | Save conflicted copy with suffix | Dropbox |
| Merge | Attempt automatic merge (text files) | Google Docs |
| Lock | Prevent concurrent edits | Enterprise systems |

Dropbox Approach

1. Server accepts first write (User A's edit) as v3
2. Server detects conflict for User B's edit (also based on v2)
3. Server saves User B's version as "file (User B's conflicted copy).txt"
4. Both users are notified of the conflict
5. Manual resolution by users

Conflict-Free for Most Cases

Optimistic locking with version numbers:

  Client sends: { file_id: "f_123", base_version: 2, new_chunks: [...] }

  Server checks:
    current_version == 2? --> Accept, create v3
    current_version > 2?  --> Conflict! Reject, client must pull latest first
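
This optimistic-locking check can be sketched with a small in-memory version store (a stand-in for the metadata DB; the response shapes are illustrative):

```python
class FileVersionStore:
    """Minimal server-side version check: accept a write only if the
    client's base_version matches the server's current version."""

    def __init__(self):
        self.versions = {}  # file_id -> current version (0 = not yet created)

    def commit(self, file_id, base_version):
        current = self.versions.get(file_id, 0)
        if base_version != current:
            # Client edited a stale version; it must pull the latest first
            # (or the server saves its upload as a conflicted copy).
            return {"status": "conflict", "current_version": current}
        self.versions[file_id] = current + 1
        return {"status": "accepted", "new_version": current + 1}
```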

7. File Metadata Database

Schema

```sql
CREATE TABLE users (
  user_id     BIGINT PRIMARY KEY,
  email       VARCHAR(255) UNIQUE,
  quota_bytes BIGINT DEFAULT 2147483648,  -- 2GB free
  used_bytes  BIGINT DEFAULT 0
);

CREATE TABLE files (
  file_id     BIGINT PRIMARY KEY,
  user_id     BIGINT REFERENCES users,
  parent_id   BIGINT REFERENCES files(file_id),  -- folder hierarchy
  name        VARCHAR(255),
  is_folder   BOOLEAN,
  size_bytes  BIGINT,
  mime_type   VARCHAR(100),
  current_ver INT,
  created_at  TIMESTAMP,
  modified_at TIMESTAMP,
  deleted_at  TIMESTAMP NULL,  -- soft delete (trash)
  INDEX idx_parent (user_id, parent_id, name)
);

CREATE TABLE file_versions (
  file_id      BIGINT,
  version      INT,
  chunk_hashes TEXT[],  -- ordered list of chunk hashes
  size_bytes   BIGINT,
  created_by   BIGINT,
  created_at   TIMESTAMP,
  PRIMARY KEY (file_id, version)
);

CREATE TABLE chunks (
  chunk_hash CHAR(64) PRIMARY KEY,  -- SHA-256
  s3_path    VARCHAR(500),
  size_bytes INT,
  ref_count  INT  -- number of file_versions referencing this chunk
);
```

Why Not Store Files in Database?

  • Databases optimized for structured data, not BLOBs
  • Object stores (S3) provide 11 nines durability
  • CDN integration works naturally with object storage
  • Cost: S3 = $0.023/GB/month vs EBS = $0.10/GB/month

8. Block Storage vs Object Storage

| Feature | Block Storage (EBS) | Object Storage (S3) |
|---|---|---|
| Access pattern | Random read/write | Whole-object read/write |
| Latency | Low (< 1ms) | Medium (10-100ms) |
| Durability | 99.999% | 99.999999999% (11 nines) |
| Cost | $0.10/GB/month | $0.023/GB/month |
| Scalability | Limited to volume size | Unlimited |
| Use case | Databases, OS disks | File storage, backups |

Decision: Use S3 for chunk storage (durable, cheap, scalable).


9. Notification Service for Changes

Long Polling vs WebSocket

Long Polling:
  Client --> Server: "Any changes since version 42?"
  Server holds connection open until:
    a) Change detected --> respond with changes
    b) Timeout (60s) --> respond with "no changes", client reconnects

WebSocket:
  Client <--> Server: persistent bidirectional connection
  Server pushes changes instantly

| Method | Pros | Cons |
|---|---|---|
| Long polling | Simple, firewall-friendly | Higher latency, more connections |
| WebSocket | Real-time, efficient | Stateful, harder to scale |
| SSE | Simple push, HTTP-based | Unidirectional |

Notification Flow

User A edits file --> Upload Service --> Metadata DB updated
                                              |
                                         Kafka event:
                                         { file_id, user_id, action: "modified" }
                                              |
                                              v
                                    +-------- +--------+
                                    |                  |
                              Sync Service        Push Notification
                              (WebSocket to        (APNs/FCM for
                               online clients)      offline clients)
                                    |
                              +-----+-----+
                              |           |
                         User B       User C
                         (online:     (offline:
                          WS push)    push notif)

Sync Protocol

Client                          Sync Service
  |                                  |
  |-- "My state: {file_id: v3}" --->|
  |                                  |
  |                            Compare with server state:
  |                            file_id is at v5 on server
  |                                  |
  |<-- "Update: file_id v3->v5" ----|
  |    { changed_chunks: [2, 4] }   |
  |                                  |
  |-- Download chunks 2, 4 -------->|
  |<-- Chunk data ------------------|
  |                                  |
  |-- "Ack: file_id now at v5" ---->|
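
The "compare with server state" step reduces to a diff over per-file version maps. A simplified sketch (real sync engines track per-namespace journals and cursors rather than full maps):

```python
def diff_state(client_state, server_state):
    """Compare per-file version maps.

    Returns (pull, push):
      pull - files where the server is newer, client must download
      push - files where the client is newer, client should upload
    """
    pull = {f: v for f, v in server_state.items()
            if v > client_state.get(f, 0)}
    push = {f: v for f, v in client_state.items()
            if v > server_state.get(f, 0)}
    return pull, push
```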

10. Bandwidth Optimization: Delta Sync

Problem

User edits 1 line in a 100MB file. Re-uploading 100MB wastes bandwidth.

Solution: Only Upload Changed Chunks

Version 2: [chunk_A, chunk_B, chunk_C, chunk_D]
                                  ^
                                  | (user edited this region)
Version 3: [chunk_A, chunk_B, chunk_C', chunk_D]

Only chunk_C' is new. Upload only chunk_C' (4MB instead of 100MB).
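
Finding the changed chunks is a positional comparison of the two hash lists (a sketch; it treats appended chunks as changed too):

```python
def changed_chunks(old_hashes, new_hashes):
    """Return indices of chunks whose hash differs from the previous
    version, or that are entirely new. Only these are re-uploaded."""
    changed = []
    for i, h in enumerate(new_hashes):
        if i >= len(old_hashes) or old_hashes[i] != h:
            changed.append(i)
    return changed
```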

rsync-Style Delta Compression (Advanced)

For changes within a single chunk:

Old chunk (4MB): [......AAAA...BBB......CCC.......]
New chunk (4MB): [......AAAA...BBX......CCC.......]
                                  ^ (1 byte changed)

Delta: { offset: 1234, old: "B", new: "X" }
Upload delta (tiny) instead of full 4MB chunk

Dropbox uses this approach for text-heavy files. The client computes a rolling hash (similar to rsync) to find changed regions within chunks.
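
An illustrative sketch of an rsync-style weak rolling checksum (the constants and function names here are made up for this example, not Dropbox's actual code). The point is the O(1) `roll` update: sliding the window one byte avoids re-hashing the whole block:

```python
MOD = 1 << 16

def weak_hash(block):
    """rsync-style weak checksum: two 16-bit sums packed into 32 bits."""
    a = sum(block) % MOD
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % MOD
    return (b << 16) | a

def roll(h, old_byte, new_byte, block_len):
    """Slide the window one byte to the right in O(1):
    drop old_byte from the front, add new_byte at the back."""
    a = h & 0xFFFF
    b = h >> 16
    a = (a - old_byte + new_byte) % MOD
    b = (b - block_len * old_byte + a) % MOD
    return (b << 16) | a
```

Matching weak hashes are then confirmed with a strong hash (e.g. SHA-256) before a region is treated as unchanged, since weak checksums can collide.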


11. Versioning and History

Version Chain

file.txt:
  v1 (Jan 1)  --> chunks: [A, B, C]
  v2 (Jan 5)  --> chunks: [A, B', C]     (B changed)
  v3 (Jan 10) --> chunks: [A, B', C, D]  (D added)
  v4 (Jan 15) --> chunks: [A, B'', C, D] (B changed again)

Storage Efficiency

  • Chunks are immutable and shared across versions
  • v1 and v2 share chunks A and C (no duplication)
  • Only changed/new chunks consume additional storage
  • Old versions can be garbage collected after retention period

Restore Flow

User requests: "Restore file.txt to v2"
  1. Look up file_versions for v2: chunks = [A, B', C]
  2. All chunks already in block store
  3. Update files.current_ver = 5 (new version pointing to v2 chunks)
  4. Sync new version to all devices
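
Because chunks are immutable, restore is pure metadata work. A sketch over an in-memory version map (standing in for the `file_versions` table):

```python
def restore_version(file_versions, current_ver, target_ver):
    """Restore by creating a NEW version that points at the old chunk
    list. No chunk data is copied and history is preserved (the user
    can still 'undo the restore' by restoring a later version)."""
    chunks = file_versions[target_ver]
    new_ver = current_ver + 1
    file_versions[new_ver] = chunks
    return new_ver
```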

12. Sharing Model

ACL-Based Sharing

```sql
CREATE TABLE file_shares (
  file_id     BIGINT,
  shared_with BIGINT,  -- user_id or group_id
  permission  ENUM('viewer', 'commenter', 'editor', 'owner'),
  created_at  TIMESTAMP,
  expires_at  TIMESTAMP NULL,
  PRIMARY KEY (file_id, shared_with)
);
```

Link Sharing

```sql
CREATE TABLE share_links (
  link_id    CHAR(32) PRIMARY KEY,  -- random token
  file_id    BIGINT,
  permission ENUM('viewer', 'editor'),
  password   VARCHAR(255) NULL,  -- optional password protection
  expires_at TIMESTAMP NULL,
  created_by BIGINT,
  view_count INT DEFAULT 0
);
```

Permission Inheritance

Folder A (shared with Bob: editor)
  |-- File 1.txt    (Bob inherits editor access)
  |-- Subfolder B   (Bob inherits editor access)
       |-- File 2.txt (Bob inherits editor access)
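
Inherited permissions can be resolved by walking up the folder tree until a direct share entry is found (a sketch; `parents` and `direct_shares` stand in for the `files.parent_id` column and the `file_shares` table):

```python
def effective_permission(file_id, parents, direct_shares, user_id):
    """Walk from the file up through its ancestor folders and return
    the first share entry found for user_id, or None if there is none.

    parents:       file_id -> parent_id (None at the root)
    direct_shares: (file_id, user_id) -> permission string
    """
    node = file_id
    while node is not None:
        perm = direct_shares.get((node, user_id))
        if perm is not None:
            return perm  # nearest ancestor's share wins
        node = parents.get(node)
    return None  # no access
```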

13. Offline Support

Client-Side Architecture

+----------------------------------+
|         Desktop Client           |
+----------------------------------+
| File Watcher (inotify/FSEvents)  |
|    - Detect local file changes   |
+----------------------------------+
| Local Metadata DB (SQLite)       |
|    - Track file states & versions|
+----------------------------------+
| Chunk Engine                     |
|    - Split, hash, delta compute  |
+----------------------------------+
| Sync Queue                       |
|    - Queue changes while offline |
|    - Replay when back online     |
+----------------------------------+
| Network Layer                    |
|    - WebSocket for notifications |
|    - HTTPS for upload/download   |
+----------------------------------+

Offline Workflow

1. User edits file offline
2. File watcher detects change
3. Chunk engine computes new chunks
4. Change queued in local sync queue
5. When network returns:
   a. Pull remote changes first (server state)
   b. Detect conflicts if any
   c. Push local changes
   d. Resolve conflicts

14. Complete System Diagram

+----------+    +----------+    +----------+
| Desktop  |    | Mobile   |    | Web      |
| Client   |    | Client   |    | Client   |
+----+-----+    +----+-----+    +----+-----+
     |               |               |
     +-------+-------+-------+-------+
             |               |
      +------+------+  +----+--------+
      | API Gateway |  | CDN         |
      | (Auth, Rate |  | (static,    |
      |  Limit)     |  |  thumbnails)|
      +------+------+  +-------------+
             |
    +--------+--------+---------+
    |        |        |         |
+---+---+ +--+---+ +--+----+ +-+-------+
|Upload | |Down- | |Sync   | |Share    |
|Svc    | |load  | |Svc    | |Svc      |
|       | |Svc   | |(WS)   | |         |
+---+---+ +--+---+ +--+----+ +---------+
    |         |        |
    v         v        v
+---+---------+--------+---+
|       Block Server       |
| (chunk dedup, routing)   |
+---+----------------------+
    |
+---+---+---+
|       |   |
v       v   v
+---+ +---+ +----------+
|S3 | |S3 | |S3        |
|R1 | |R2 | |(archive) |
+---+ +---+ +----------+

+------------------+    +------------------+
| Metadata DB      |    | Kafka            |
| (MySQL, sharded  |    | (sync events,    |
|  by user_id)     |    |  notifications)  |
+------------------+    +--------+---------+
                                 |
                        +--------+---------+
                        | Notification     |
                        | Service          |
                        | (WS + Push)      |
                        +------------------+

15. Trade-Off Summary

| Decision | Option A | Option B | Recommendation |
|---|---|---|---|
| Chunk size | 1MB (more dedup) | 4MB (less overhead) | 4MB standard |
| Dedup scope | Per-user | Global (all users) | Global (60%+ savings) |
| Conflict resolution | Last write wins | Keep both copies | Keep both (Dropbox) |
| Sync notification | Long polling | WebSocket | WebSocket for desktop, long poll for web |
| Storage | Block store | Object store (S3) | S3 for chunks |
| Delta sync | Full chunk re-upload | rsync delta | Delta for text, full for binary |

16. Interview Tips

  1. Chunking is the foundation: Everything else builds on splitting files into chunks
  2. Content-based hashing enables dedup: SHA-256 hash as chunk identifier
  3. Delta sync saves bandwidth: Only upload changed chunks, not the whole file
  4. Conflict resolution is essential: Show you understand optimistic locking + conflict copies
  5. Separate metadata from data: MySQL for metadata, S3 for chunk storage
  6. Notification model matters: WebSocket for real-time sync, long polling as fallback
  7. Versioning is cheap: Chunks are immutable, versions just reference different chunk lists
  8. Offline-first design: The client is a full local system with SQLite and sync queue

17. Resources

  • Alex Xu - System Design Interview Vol. 1, Chapter 15: Design Google Drive
  • Dropbox Engineering Blog: "How We've Scaled Dropbox" (2012)
  • "Building Dropbox's Sync Engine" (Dropbox Tech Blog)
  • "Delta Sync at Dropbox" - rsync-based approach
  • Google Drive Architecture (GCP documentation)
  • Martin Kleppmann - Designing Data-Intensive Applications, Chapter 5 (Replication)
  • tus protocol (tus.io) - Resumable uploads standard
