41 - Design Ticket Booking System

Previous: 40 - Design Distributed Cache | Next: 42 - Design Payment System


1. Problem Statement

Design a ticket/seat booking system for events (concerts, flights, movies). The core challenge: prevent double bookings under extreme concurrency while maintaining a responsive user experience. Think Ticketmaster, BookMyShow, or airline reservation systems.


2. Requirements

Functional

| Requirement | Detail |
|---|---|
| Browse events | Search, filter, view event details and venue map |
| View seat availability | Real-time (or near-real-time) seat map with status |
| Select & hold seats | Temporary reservation while user completes payment |
| Payment & confirmation | Integrate with payment gateway, issue ticket/confirmation |
| Cancellation & refund | User can cancel within policy window |
| Waiting queue | Queue for sold-out popular events |

Non-Functional

| Requirement | Target |
|---|---|
| Availability | 99.99% |
| Consistency | Strong for booking (no double-sell), eventual for availability view |
| Latency | < 500ms for seat selection, < 2s for booking confirmation |
| Scale | Handle 10K+ concurrent users for a single popular event (flash sale) |

3. The Concurrency Challenge

The fundamental problem: two users click the same seat at the same instant.

Timeline:
  t=0   User A reads seat S1 -> status: AVAILABLE
  t=0   User B reads seat S1 -> status: AVAILABLE
  t=1   User A books seat S1 -> SUCCESS
  t=1   User B books seat S1 -> SUCCESS ???  <-- DOUBLE BOOKING!

This is a classic lost update / write-write conflict problem.
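
The timeline above can be replayed deterministically. This is a minimal sketch, assuming a hypothetical in-memory `seats` dict and a deliberately naive `naive_book` helper (neither comes from a real system); it only shows why acting on a stale read lets both users "succeed".

```python
# Hypothetical in-memory seat store with no locking, on purpose.
seats = {"S1": "AVAILABLE"}

def read_status(seat_id):
    return seats[seat_id]

def naive_book(seat_id, user, observed_status):
    # Acts on a status that was read earlier, without re-checking it atomically.
    if observed_status == "AVAILABLE":
        seats[seat_id] = f"BOOKED:{user}"
        return True
    return False

# t=0: both users read before either one writes.
seen_by_a = read_status("S1")
seen_by_b = read_status("S1")

# t=1: both writes report success; user_B silently overwrites user_A.
a_ok = naive_book("S1", "user_A", seen_by_a)
b_ok = naive_book("S1", "user_B", seen_by_b)
```

Both calls return `True`, yet only one user's name is left on the seat. Every strategy in the next section works by making the check and the write a single atomic step.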


4. Seat Selection Strategies

Strategy 1: Pessimistic Locking (Database Row Lock)

sql
BEGIN TRANSACTION;

SELECT * FROM seats
WHERE seat_id = 'S1' AND event_id = 'E1'
FOR UPDATE;
-- Row is now locked; other transactions that want it block here

UPDATE seats
SET status = 'HELD',
    held_by = 'user_A',
    held_until = NOW() + INTERVAL '10 min'
WHERE seat_id = 'S1' AND event_id = 'E1' AND status = 'AVAILABLE';
-- If 0 rows were updated, the seat was no longer AVAILABLE

COMMIT;

| Pros | Cons |
|---|---|
| Simple, strong guarantee | Blocks concurrent requests (latency spikes) |
| Database handles correctness | Deadlock risk with multi-seat selection |
| Battle-tested | Poor throughput under contention |
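
The deadlock risk with multi-seat selection comes from two requests locking the same seats in opposite orders. A common mitigation is to always lock in one consistent order. The sketch below illustrates that idea with Python `threading.Lock` objects standing in for database row locks; `seat_locks`, `seat_owner`, and `hold_seats` are hypothetical names, not part of any real schema.

```python
import threading

# Hypothetical per-seat locks standing in for database row locks.
seat_locks = {sid: threading.Lock() for sid in ("S1", "S2")}
seat_owner = {"S1": None, "S2": None}

def hold_seats(user, seat_ids):
    """Lock seats in sorted order so two requests can never wait on each other in a cycle."""
    ordered = sorted(seat_ids)
    for sid in ordered:
        seat_locks[sid].acquire()
    try:
        if any(seat_owner[sid] is not None for sid in ordered):
            return False  # all-or-nothing: at least one seat is already taken
        for sid in ordered:
            seat_owner[sid] = user
        return True
    finally:
        for sid in reversed(ordered):
            seat_locks[sid].release()

results = {}

def worker(user, seat_ids):
    results[user] = hold_seats(user, seat_ids)

# The two users ask for the same seats in opposite orders.
threads = [
    threading.Thread(target=worker, args=("user_A", ["S1", "S2"])),
    threading.Thread(target=worker, args=("user_B", ["S2", "S1"])),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one user ends up holding both seats, and neither thread hangs. In SQL the analogous approach is to lock the rows for a multi-seat request in a consistent order (for example by `seat_id`) within one transaction.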

Strategy 2: Optimistic Locking (Version-Based)

sql
-- Read
SELECT seat_id, status, version
FROM seats
WHERE seat_id = 'S1' AND event_id = 'E1';
-- version = 5, status = AVAILABLE

-- Write (only succeeds if version unchanged)
UPDATE seats
SET status = 'HELD',
    held_by = 'user_A',
    version = 6,
    held_until = NOW() + INTERVAL '10 min'
WHERE seat_id = 'S1' AND event_id = 'E1' AND version = 5 AND status = 'AVAILABLE';
-- If affected_rows = 0, someone else got it first -> retry or show "unavailable"

| Pros | Cons |
|---|---|
| No blocking (non-locking) | Retry storms under high contention |
| Higher throughput for low-contention | User experience: "someone took your seat" |
| Simple implementation | Needs retry logic in application |
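
The version check can be exercised end to end with an in-memory SQLite database. This is a sketch under stated assumptions: the table is a cut-down version of the seats table, SQLite stands in for the production database, and `read_version` and `try_hold` are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seats ("
    "seat_id TEXT PRIMARY KEY, status TEXT, held_by TEXT, version INTEGER)"
)
conn.execute("INSERT INTO seats VALUES ('S1', 'AVAILABLE', NULL, 5)")
conn.commit()

def read_version(seat_id):
    row = conn.execute(
        "SELECT version FROM seats WHERE seat_id = ?", (seat_id,)
    ).fetchone()
    return row[0]

def try_hold(seat_id, user, expected_version):
    """Conditional update: succeeds only if the row is unchanged since it was read."""
    cur = conn.execute(
        "UPDATE seats SET status = 'HELD', held_by = ?, version = version + 1 "
        "WHERE seat_id = ? AND version = ? AND status = 'AVAILABLE'",
        (user, seat_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1

# Both users read version 5 before either one writes.
v_a = read_version("S1")
v_b = read_version("S1")
a_ok = try_hold("S1", "user_A", v_a)  # first writer wins
b_ok = try_hold("S1", "user_B", v_b)  # stale version: zero rows affected
final_row = conn.execute(
    "SELECT status, held_by, version FROM seats WHERE seat_id = 'S1'"
).fetchone()
```

The second update matches no rows because the version has already moved on, so the application can tell the user the seat is gone instead of overwriting the first hold.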

Strategy 3: Redis-Based Distributed Lock

-- Atomic SET-if-not-exists with TTL
SET seat:E1:S1 user_A NX EX 600
-- Returns OK if set (seat claimed), nil if already taken

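-- Release early on payment failure or cancellation (the TTL already covers abandoned holds).
-- Delete only if the key still holds this user's ID; a bare DEL issued after the TTL
-- has expired could remove a hold that another user has since acquired.
DEL seat:E1:S1

| Pros | Cons |
|---|---|
| Sub-millisecond latency | Redis failure = lock system failure |
| TTL handles abandoned holds | Requires careful Redis HA setup |
| No DB contention | Extra infrastructure component |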
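
The acquire and release semantics are easy to get subtly wrong, so here is a small single-process simulation. `FakeRedis` is a hypothetical in-memory stand-in, not a Redis client: `set_nx_ex` mimics `SET key value NX EX ttl`, and `release_if_owner` mimics a compare-and-delete, which on a real Redis server would need to run atomically (commonly done with a small Lua script).

```python
import time

class FakeRedis:
    """In-memory stand-in for the two Redis behaviours used here (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expires_at)
        self._clock = clock

    def _get_live(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]  # lazily expire the key
            return None
        return entry

    def set_nx_ex(self, key, value, ttl_seconds):
        """Mimics SET key value NX EX ttl: set only if the key is absent."""
        if self._get_live(key) is not None:
            return False
        self._data[key] = (value, self._clock() + ttl_seconds)
        return True

    def release_if_owner(self, key, value):
        """Delete the key only if it still holds the caller's value."""
        entry = self._get_live(key)
        if entry is not None and entry[0] == value:
            del self._data[key]
            return True
        return False

now = [0.0]  # controllable clock so the example does not need to sleep
r = FakeRedis(clock=lambda: now[0])

got_a = r.set_nx_ex("seat:E1:S1", "user_A", 600)       # user_A claims the seat
got_b = r.set_nx_ex("seat:E1:S1", "user_B", 600)       # rejected: already held
now[0] = 601.0                                          # user_A's hold expires
got_b_late = r.set_nx_ex("seat:E1:S1", "user_B", 600)  # user_B claims it
stale_release = r.release_if_owner("seat:E1:S1", "user_A")  # must not remove B's hold
owner_release = r.release_if_owner("seat:E1:S1", "user_B")
```

The late release from user_A is refused because the key now belongs to user_B. An unconditional `DEL` at that point would have freed a seat that someone else is in the middle of paying for.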

Comparison

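| Criteria | Pessimistic | Optimistic | Redis Lock |
|---|---|---|---|
| Throughput | Low | Medium | High |
| Complexity | Low | Medium | Medium |
| Consistency | Strong | Strong (with retry) | Strong while Redis is healthy (NX); a failover can lose holds, so re-check in the DB when booking |
| Best for | Low concurrency | Medium concurrency | Flash sales, high concurrency |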

5. Reservation with TTL (Hold-Then-Pay)

State Machine for a Seat:

  AVAILABLE --[user selects]--> HELD (TTL: 10 min)
      ^                           |
      |                           +--[payment OK]----> BOOKED
      |                           |
      +------[payment fail]-------+
      |                           |
      +-------[TTL expires]-------+
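
The state machine above is small enough to encode as a lookup table, which makes illegal transitions (such as booking a seat that was never held) fail loudly. A minimal sketch; the event names `select`, `payment_ok`, `payment_fail`, and `ttl_expired` are labels chosen for this example.

```python
# (current_state, event) -> next_state, mirroring the diagram above.
ALLOWED = {
    ("AVAILABLE", "select"): "HELD",
    ("HELD", "payment_ok"): "BOOKED",
    ("HELD", "payment_fail"): "AVAILABLE",
    ("HELD", "ttl_expired"): "AVAILABLE",
}

def next_state(state, event):
    """Return the new seat state, or raise if the transition is not in the diagram."""
    try:
        return ALLOWED[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} from {state!r}") from None

held = next_state("AVAILABLE", "select")
booked = next_state(held, "payment_ok")
released = next_state("HELD", "ttl_expired")
```

Calling `next_state("AVAILABLE", "payment_ok")` raises `ValueError`, which is the behaviour you want if a late payment callback arrives for a seat whose hold has already lapsed.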

TTL Hold Mechanism

1. User selects seat -> SET seat status = HELD, held_until = now + 10min
2. User proceeds to payment page
3. If payment succeeds within 10min -> status = BOOKED
4. If payment fails or user abandons:
   - Background job sweeps expired holds every 30s
   - OR: Redis key expires automatically (if using Redis locks)
   - Seat returns to AVAILABLE
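
The sweep job and the payment confirmation from the steps above can both be written as conditional updates. The sketch below uses an in-memory SQLite table with plain numeric timestamps as an assumption to stay self-contained; `sweep_expired_holds` and `confirm_booking` are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seats ("
    "seat_id TEXT PRIMARY KEY, status TEXT, held_by TEXT, held_until REAL)"
)
conn.executemany(
    "INSERT INTO seats VALUES (?, ?, ?, ?)",
    [
        ("S1", "HELD", "user_A", 100.0),   # hold expired at t=100
        ("S2", "HELD", "user_B", 900.0),   # hold still valid at t=600
        ("S3", "BOOKED", "user_C", None),  # booked seats are never swept
    ],
)
conn.commit()

def sweep_expired_holds(conn, now):
    """Return expired holds to AVAILABLE; meant to run periodically (e.g. every 30s)."""
    cur = conn.execute(
        "UPDATE seats SET status = 'AVAILABLE', held_by = NULL, held_until = NULL "
        "WHERE status = 'HELD' AND held_until <= ?",
        (now,),
    )
    conn.commit()
    return cur.rowcount

def confirm_booking(conn, seat_id, user, now):
    """Mark a seat BOOKED only if this user's hold is still active."""
    cur = conn.execute(
        "UPDATE seats SET status = 'BOOKED', held_until = NULL "
        "WHERE seat_id = ? AND status = 'HELD' AND held_by = ? AND held_until > ?",
        (seat_id, user, now),
    )
    conn.commit()
    return cur.rowcount == 1

released = sweep_expired_holds(conn, now=600.0)
b_confirmed = confirm_booking(conn, "S2", "user_B", now=600.0)
a_confirmed = confirm_booking(conn, "S1", "user_A", now=600.0)
statuses = dict(conn.execute("SELECT seat_id, status FROM seats"))
```

Putting the expiry check inside `confirm_booking` matters even with a sweeper running: between two sweeps a hold can be expired but not yet released, and the confirmation should still be refused.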

Interview Tip

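The 10-minute hold window is a business decision, not a technical one. Commonly cited figures are roughly 8 minutes for Ticketmaster and roughly 20 minutes for airlines, but actual windows vary by site and product, so present them as approximate. A shorter hold means more inventory turnover; a longer hold gives users more time to complete payment.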


6. Full System Architecture

                         +--------------------+
                         |   CDN (static      |
                         |   assets, seat map) |
                         +--------+-----------+
                                  |
                         +--------v-----------+
                         |   Load Balancer    |
                         +--------+-----------+
                                  |
              +-------------------+-------------------+
              |                   |                   |
     +--------v------+  +--------v------+  +--------v------+
     | API Gateway   |  | API Gateway   |  | API Gateway   |
     | (rate limit,  |  |               |  |               |
     |  auth, queue) |  |               |  |               |
     +--------+------+  +-------+-------+  +-------+------+
              |                  |                  |
              +------------------+------------------+
                                 |
              +------------------+------------------+
              |                  |                  |
     +--------v------+  +-------v-------+  +-------v-------+
     | Event Service |  | Booking Svc   |  | Payment Svc   |
     | (catalog,     |  | (seat lock,   |  | (PSP integ,   |
     |  search)      |  |  reservation) |  |  webhook)     |
     +--------+------+  +-------+-------+  +-------+-------+
              |                  |                  |
     +--------v------+  +-------v-------+          |
     | Event DB      |  | Redis Cluster |          |
     | (PostgreSQL)  |  | (seat locks,  |  +-------v-------+
     +---------------+  |  hold TTLs)   |  | Payment PSP   |
                        +-------+-------+  | (Stripe)      |
                                |          +---------------+
                        +-------v-------+
                        | Booking DB    |
                        | (PostgreSQL)  |
                        | - reservations|
                        | - tickets     |
                        +---------------+

     +------------------+     +------------------+
     | Notification Svc |     | Queue Service    |
     | (email, SMS,     |     | (waiting list    |
     |  push)           |     |  for sold-out)   |
     +------------------+     +------------------+

7. Payment Integration Flow

  User clicks "Pay"
       |
       v
  +----+----+
  | Booking |  1. Validate hold is still active
  | Service |  2. Create payment intent
  +----+----+
       |
       v
  +----+----+
  | Payment |  3. Call PSP (Stripe) to authorize
  | Service |  4. On success: mark seat BOOKED, generate ticket
  +----+----+  5. On failure: release hold, notify user
       |
       v
  +----+----+
  | PSP     |  Stripe/PayPal processes card
  | (Stripe)|  Returns: success/failure/pending
  +----+----+
       |
       v
  Webhook callback to Payment Service
  (confirms final status)

Handling Failures Mid-Booking

| Failure Point | Recovery |
|---|---|
| Payment times out | Poll PSP for status, retry once, then release hold |
| PSP returns "pending" | Hold seat, wait for webhook confirmation |
| Network failure after PSP charge | Idempotency key ensures no double charge; poll PSP |
| Booking DB write fails after payment | Retry DB write; worst case: refund and release seat |
| User closes browser mid-payment | Hold TTL expires, seat released; if PSP charged, auto-refund |
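
The idempotency-key recovery in the table relies on the provider returning the original result when the same key is replayed. The sketch below shows that contract with a hypothetical `FakePSP` class; it is not any real provider's API, and the field names are illustrative.

```python
import uuid

class FakePSP:
    """Hypothetical payment provider that deduplicates on an idempotency key."""

    def __init__(self):
        self._results_by_key = {}
        self.charges_made = 0

    def charge(self, idempotency_key, amount_cents):
        if idempotency_key in self._results_by_key:
            # Replay of a request already processed: return the original result.
            return self._results_by_key[idempotency_key]
        self.charges_made += 1
        result = {
            "charge_id": f"ch_{self.charges_made}",
            "amount_cents": amount_cents,
            "status": "succeeded",
        }
        self._results_by_key[idempotency_key] = result
        return result

psp = FakePSP()

# Generate the key once per booking attempt and store it with the booking,
# so every retry of that attempt reuses the same key.
key = str(uuid.uuid4())

first = psp.charge(key, 12500)
retry = psp.charge(key, 12500)  # e.g. sent after a timeout hid the first response
```

The retry returns the same charge and the card is charged once. The important design point is that the key is tied to the booking attempt, not to the HTTP request, otherwise each retry would look like a new payment.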

8. Waiting Queue for Popular Events

For sold-out events or flash sales, a virtual queue prevents system overload.

User arrives
     |
     v
+----+--------+
| Queue Gate  |  Assign queue position, estimated wait time
| (Redis      |  Token-bucket: admit N users/min to booking flow
|  sorted set)|
+----+--------+
     |
     | when position reached
     v
+----+--------+
| Booking     |  User has limited time window (e.g., 5 min)
| Flow        |  to select seats and complete booking
+-------------+

Queue implementation:
  ZADD queue:event_123 <timestamp> <user_id>
  ZRANK queue:event_123 <user_id>          -- position in queue
  ZPOPMIN queue:event_123                   -- admit next user
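
To make the queue semantics concrete, here is an in-memory sketch of the same three operations. `WaitingQueue` is a hypothetical stand-in for a Redis sorted set scored by arrival time; it assumes a rejoining user keeps their original place (the behaviour of `ZADD` with the `NX` option) and that several users are admitted per tick (as `ZPOPMIN` with a count would do).

```python
import bisect

class WaitingQueue:
    """In-memory stand-in for a Redis sorted set scored by arrival time (hypothetical)."""

    def __init__(self):
        self._entries = []  # sorted list of (arrival_ts, user_id)
        self._score = {}    # user_id -> arrival_ts

    def join(self, user_id, arrival_ts):
        if user_id in self._score:
            return  # already queued: keep the original position
        self._score[user_id] = arrival_ts
        bisect.insort(self._entries, (arrival_ts, user_id))

    def position(self, user_id):
        """0-based rank, like ZRANK; None if the user is not queued."""
        if user_id not in self._score:
            return None
        return bisect.bisect_left(self._entries, (self._score[user_id], user_id))

    def admit(self, n):
        """Remove and return the n earliest arrivals."""
        admitted, self._entries = self._entries[:n], self._entries[n:]
        for _, user_id in admitted:
            del self._score[user_id]
        return [user_id for _, user_id in admitted]

q = WaitingQueue()
q.join("u1", 10.0)
q.join("u2", 11.0)
q.join("u3", 12.0)
q.join("u1", 99.0)  # refresh or reconnect: must not lose the place in line

pos_before = q.position("u3")
admitted = q.admit(2)
pos_after = q.position("u3")
```

Note that the rank is 0-based, so a user-facing "you are number N in line" message should add one.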

Interview Tip

Mention Ticketmaster's "Smart Queue" or Cloudflare's Waiting Room as real-world precedents. The queue protects the booking system from thundering herd during on-sale moments.


9. Seat Map Rendering

Seat Map Data Model:

Venue:
  sections: [Section]

Section:
  section_id, name, price_tier
  rows: [Row]

Row:
  row_id, label ("A", "B", ...)
  seats: [Seat]

Seat:
  seat_id, number, status (AVAILABLE | HELD | BOOKED | BLOCKED)
  price, accessibility_flag

Availability Broadcast

Option 1: Polling
  Client polls GET /events/{id}/seats every 5-10 seconds
  Simple but wasteful

Option 2: Server-Sent Events (SSE)
  Server pushes seat status changes to connected clients
  Efficient, one-directional

Option 3: WebSocket
  Full-duplex, real-time seat map updates
  Best UX but highest server resource cost

Recommendation: SSE for seat map (server -> client only)
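
With SSE, the server only needs to send the seats whose status changed. The sketch below computes such a delta and serializes it in the SSE wire format (an `event:` line, a `data:` line, then a blank line). The event name `seat_update` and the JSON payload shape are assumptions for illustration.

```python
import json

def seat_delta(previous, current):
    """Return only the seats whose status changed between two snapshots."""
    return {
        seat_id: status
        for seat_id, status in current.items()
        if previous.get(seat_id) != status
    }

def format_sse(event_name, payload):
    """Serialize one Server-Sent Events message; a blank line terminates it."""
    data = json.dumps(payload, sort_keys=True)
    return f"event: {event_name}\ndata: {data}\n\n"

before = {"S1": "AVAILABLE", "S2": "AVAILABLE", "S3": "HELD"}
after = {"S1": "HELD", "S2": "AVAILABLE", "S3": "BOOKED"}

delta = seat_delta(before, after)
message = format_sse("seat_update", delta)
```

A client that connects late still needs a full snapshot first; deltas only make sense on top of a known baseline, which is where the cached seat map comes in.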

10. Scaling for Flash Sales

When Taylor Swift tickets go on sale, 10K+ users target the same event simultaneously.

| Technique | How |
|---|---|
| Virtual queue | Gate admission to booking flow (see section 8) |
| Redis cluster | Seat locks in Redis, horizontally scaled |
| Read replicas | Serve seat availability from replicas (eventual consistency OK for display) |
| Pre-compute seat map | Cache full seat map in CDN, update via SSE delta |
| Shard by section | Different booking servers handle different venue sections |
| Rate limiting | Per-user rate limits to prevent bots |
| Bot detection | CAPTCHA, browser fingerprinting, behavioral analysis |
| Overprovisioning | Auto-scale booking service before announced on-sale time |
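
Per-user rate limiting is often implemented as a token bucket. The sketch below is one minimal version, assuming a burst capacity of 5 requests and a refill of 1 token per second (both numbers are illustrative) and an explicit `now` argument so the behaviour is deterministic; `TokenBucket` and `allow_request` are hypothetical names.

```python
class TokenBucket:
    """Allows bursts up to `capacity`, then a steady `refill_per_s` requests per second."""

    def __init__(self, capacity, refill_per_s, now):
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self.tokens = float(capacity)
        self.updated_at = now

    def allow(self, now):
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets = {}  # user_id -> TokenBucket (in production this state would be shared)

def allow_request(user_id, now, capacity=5, refill_per_s=1.0):
    if user_id not in buckets:
        buckets[user_id] = TokenBucket(capacity, refill_per_s, now)
    return buckets[user_id].allow(now)

burst = [allow_request("u1", now=0.0) for _ in range(7)]  # 7 requests at once
after_wait = allow_request("u1", now=2.0)                 # 2 seconds later
other_user = allow_request("u2", now=0.0)                 # separate bucket
```

The first five requests in the burst pass and the last two are rejected, while a different user is unaffected. With several API gateway instances the bucket state has to live somewhere shared, or each instance will grant its own allowance.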

11. Database Schema (Simplified)

sql
-- Events
CREATE TABLE events (
    event_id   UUID PRIMARY KEY,
    name       TEXT NOT NULL,
    venue_id   UUID REFERENCES venues(venue_id),
    event_date TIMESTAMPTZ,
    status     TEXT DEFAULT 'UPCOMING'  -- UPCOMING, ON_SALE, SOLD_OUT, COMPLETED
);

-- Seats (per event, denormalized for performance)
CREATE TABLE event_seats (
    event_id    UUID REFERENCES events(event_id),
    seat_id     UUID,
    section     TEXT,
    row_label   TEXT,
    seat_number INT,
    price       DECIMAL(10,2),
    status      TEXT DEFAULT 'AVAILABLE',  -- AVAILABLE, HELD, BOOKED
    held_by     UUID,
    held_until  TIMESTAMPTZ,
    version     INT DEFAULT 0,             -- for optimistic locking
    PRIMARY KEY (event_id, seat_id)
);

-- Bookings
CREATE TABLE bookings (
    booking_id   UUID PRIMARY KEY,
    user_id      UUID NOT NULL,
    event_id     UUID NOT NULL,
    total_amount DECIMAL(10,2),
    status       TEXT DEFAULT 'PENDING',  -- PENDING, CONFIRMED, CANCELLED, REFUNDED
    payment_id   TEXT,
    created_at   TIMESTAMPTZ DEFAULT NOW()
);

-- Booking items (seats in a booking)
CREATE TABLE booking_items (
    booking_id UUID REFERENCES bookings(booking_id),
    seat_id    UUID,
    event_id   UUID,
    price      DECIMAL(10,2),
    PRIMARY KEY (booking_id, seat_id)
);

12. Eventual Consistency for Seat Availability View

                 Strong Consistency Path
User selects seat --> Redis lock (NX) --> DB conditional update (HELD) --> payment --> BOOKED
                      |
                      | if the lock is acquired, the seat is yours for the hold TTL;
                      | the conditional DB update is the final guard against a double-sell

                 Eventual Consistency Path
Seat map display <-- Read replica <-- async replication <-- Primary DB
                      |
                      | may show a seat as AVAILABLE for a few seconds
                      | after it was actually held or booked (acceptable for display);
                      | the user gets a "seat taken" error on the selection attempt

Interview Tip

Clearly separate the consistency requirements: writes (booking) need strong consistency, reads (seat map display) can be eventually consistent. This duality is key to scaling the system.


13. Key Trade-offs Discussion

| Decision | Option A | Option B |
|---|---|---|
| Locking | Pessimistic DB lock (simple, low throughput) | Redis NX lock (fast, extra infra) |
| Seat map updates | Polling (simple) | WebSocket/SSE (real-time, complex) |
| Hold TTL | Short 5min (more turnover) | Long 15min (better UX) |
| Queue | Virtual queue (fair, predictable) | No queue (faster for low traffic) |
| Availability | Strong consistency (slower) | Eventual for display (scalable) |
| Payment | Sync (simple flow) | Async with webhooks (resilient) |

14. Capacity Estimation

Assumptions:
- Popular event: 50,000 seats
- On-sale moment: 100K users in queue, 10K concurrent in booking flow
- Average booking: 2.5 seats per transaction

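Seat lock operations:
- 10K users * 2.5 seats = 25K lock attempts in the first wave
- A single Redis node typically handles 100K+ simple ops/sec -> one small cluster is sufficient

Database writes:
- 50K seats / 2.5 seats per booking = ~20K bookings over the first 30 minutes
- ~50K seat status updates (one per seat; roughly double if both the HELD and BOOKED transitions are written to the DB)
- (20K + 50K) writes / 1,800 s = ~40 writes/sec on average, with a much burstier first minute
- PostgreSQL on good hardware: on the order of 10K simple TPS -> sufficient

API requests:
- Seat map polling: 100K users * 1 req/5sec = 20K QPS (serve from cache/CDN)
- Booking flow: 10K users * ~1 req/2sec = ~5K QPS (serve from API)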
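
The arithmetic above is easy to re-run with different assumptions. The sketch below uses only the figures stated in this section; the one added assumption is that every seat generates exactly one status write, which is a lower bound.

```python
# Inputs taken from the assumptions in this section.
seats_total = 50_000
users_in_queue = 100_000
users_in_flow = 10_000
seats_per_booking = 2.5
sale_window_s = 30 * 60   # first 30 minutes
poll_interval_s = 5

# Derived figures.
lock_attempts = users_in_flow * seats_per_booking   # first wave of seat locks
bookings = seats_total / seats_per_booking          # bookings needed to sell out
seat_updates_min = seats_total                      # assumed: one write per seat
avg_db_writes_per_s = (bookings + seat_updates_min) / sale_window_s
polling_qps = users_in_queue / poll_interval_s

print(f"lock attempts:       {lock_attempts:,.0f}")
print(f"bookings:            {bookings:,.0f}")
print(f"avg DB writes/sec:   {avg_db_writes_per_s:.1f}")
print(f"seat map poll QPS:   {polling_qps:,.0f}")
```

The average database write rate comes out in the tens per second, so the database is not the bottleneck on average. The real pressure is the burst in the first seconds and the read traffic, which is why the seat map is served from cache.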

15. Interview Checklist

  • Identified the core challenge: double-booking prevention under concurrency
  • Compared locking strategies (pessimistic vs optimistic vs Redis)
  • Designed hold-then-pay flow with TTL and state machine
  • Payment integration with failure handling and idempotency
  • Virtual queue for flash sale scenarios
  • Separated strong consistency (booking) from eventual consistency (display)
  • Seat map rendering and real-time update strategy
  • Scaling techniques for high-concurrency events
  • Bot prevention and rate limiting

