22 - Time-Series & Analytics Databases

Previous: 21 - Search Systems | Next: 23 - Distributed Task Scheduling


Time-Series Data Characteristics

Time-series data is fundamentally different from transactional data. Understanding these differences drives every architectural decision.

Example time-series record:
  { metric: "cpu_usage", host: "web-01", value: 73.2, timestamp: 1704067200 }

Characteristics:
  - Append-heavy (write-once, read-many)
  - Naturally ordered by time
  - High cardinality (many unique tag combinations)
  - Recent data queried far more than old data
  - Aggregation queries dominate (avg, sum, percentile over time ranges)
  - Individual records rarely updated or deleted

| Property       | OLTP (Transactional)   | Time-Series / OLAP                 |
|----------------|------------------------|------------------------------------|
| Write pattern  | Random inserts/updates | Sequential appends                 |
| Read pattern   | Point lookups by PK    | Range scans over time              |
| Query type     | Single-row CRUD        | Aggregations over millions of rows |
| Data lifecycle | Retained indefinitely  | TTL / downsampled over time        |
| Schema         | Normalized (3NF)       | Denormalized, wide tables          |
| Index strategy | B-tree on PK/FK        | Time-partitioned, columnar         |

OLTP vs OLAP

                OLTP                              OLAP
  +------------------------------+  +------------------------------+
  | User places order            |  | "What was total revenue      |
  | UPDATE inventory SET qty=..  |  |  by region last quarter?"    |
  | INSERT INTO orders ...       |  |                              |
  |                              |  | SELECT region, SUM(revenue)  |
  | Latency: < 10ms              |  | FROM sales                   |
  | Rows affected: 1-10          |  | WHERE date BETWEEN ...       |
  | Concurrency: thousands       |  | GROUP BY region              |
  +------------------------------+  |                              |
                                    | Latency: seconds to minutes  |
                                    | Rows scanned: millions       |
                                    | Concurrency: tens            |
                                    +------------------------------+

| Dimension     | OLTP                | OLAP                            |
|---------------|---------------------|---------------------------------|
| Users         | Application users   | Analysts, dashboards            |
| Data          | Current state       | Historical                      |
| Schema        | Normalized          | Star/snowflake                  |
| Optimized for | Writes, point reads | Reads, aggregations             |
| Storage       | Row-oriented        | Column-oriented                 |
| Examples      | PostgreSQL, MySQL   | ClickHouse, BigQuery, Redshift  |

Columnar Storage: Why It Is Fast for Analytics

Row vs Column Storage Layout

Row-oriented (PostgreSQL, MySQL):
  Row 1: | user_id=1 | name="Alice" | age=30 | city="NYC"    |
  Row 2: | user_id=2 | name="Bob"   | age=25 | city="London" |
  Row 3: | user_id=3 | name="Carol" | age=35 | city="NYC"    |

  Disk: [1,Alice,30,NYC] [2,Bob,25,London] [3,Carol,35,NYC]

Column-oriented (ClickHouse, Parquet):
  user_id: [1, 2, 3]
  name:    ["Alice", "Bob", "Carol"]
  age:     [30, 25, 35]
  city:    ["NYC", "London", "NYC"]

  Disk: [1,2,3] [Alice,Bob,Carol] [30,25,35] [NYC,London,NYC]

Why Columnar Wins for Analytics

Query: SELECT AVG(age) FROM users WHERE city = 'NYC'

Row store: Must read ALL columns for EVERY row
  Read: [1,Alice,30,NYC] [2,Bob,25,London] [3,Carol,35,NYC]
  IO: 3 full rows * all columns = lots of wasted reads

Column store: Only read the 2 columns needed
  Read: city = [NYC, London, NYC]   --> matches rows 0, 2
  Read: age  = [30, 25, 35]         --> keep rows 0, 2 --> AVG(30, 35) = 32.5
  IO: only 2 columns read; name and user_id are never touched

Three key advantages:

  1. Read amplification reduction: Only read columns referenced in the query
  2. Compression: Same-type values compress extremely well (run-length encoding, dictionary encoding, delta encoding)
  3. Vectorized execution: Process entire column vectors using SIMD instructions
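As a toy illustration (plain Python lists standing in for disk blocks, not how a real engine is implemented), the `AVG(age)` query above only ever touches two of the four columns in the columnar layout:

```python
# Toy illustration of row vs column layout for the query
# SELECT AVG(age) FROM users WHERE city = 'NYC'.

rows = [  # row store: each record stored together
    {"user_id": 1, "name": "Alice", "age": 30, "city": "NYC"},
    {"user_id": 2, "name": "Bob",   "age": 25, "city": "London"},
    {"user_id": 3, "name": "Carol", "age": 35, "city": "NYC"},
]

columns = {  # column store: each column stored together
    "user_id": [1, 2, 3],
    "name":    ["Alice", "Bob", "Carol"],
    "age":     [30, 25, 35],
    "city":    ["NYC", "London", "NYC"],
}

# Row store: every row (all 4 columns) is touched to answer the query.
matching = [r["age"] for r in rows if r["city"] == "NYC"]

# Column store: only 'city' and 'age' are read; 'name' and 'user_id'
# are never touched.
keep = [i for i, c in enumerate(columns["city"]) if c == "NYC"]
ages = [columns["age"][i] for i in keep]

avg_age = sum(ages) / len(ages)
print(avg_age)  # 32.5
```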

Compression Techniques

| Technique           | Best For                | Example                                            |
|---------------------|-------------------------|----------------------------------------------------|
| Dictionary encoding | Low-cardinality strings | city: {0: "NYC", 1: "London"} -> [0, 1, 0]         |
| Run-length encoding | Sorted repeated values  | [NYC, NYC, NYC, London] -> [(NYC, 3), (London, 1)] |
| Delta encoding      | Timestamps, sequences   | [1000, 1001, 1002] -> [1000, +1, +1]               |
| Bit packing         | Small integers          | Store 3-bit values packed into bytes               |
| LZ4 / ZSTD          | General blocks          | Block-level compression after encoding             |

Real-world compression ratios: 5-10x is typical for columnar stores. ClickHouse often achieves 10-20x on real workloads.
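Minimal Python sketches of three of these encodings (function names are illustrative, not from any particular library):

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer code."""
    mapping = {}
    codes = []
    for v in values:
        codes.append(mapping.setdefault(v, len(mapping)))
    return mapping, codes

def run_length_encode(values):
    """Collapse runs of repeated values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs

def delta_encode(values):
    """Store the first value, then successive differences."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

print(dictionary_encode(["NYC", "London", "NYC"]))
# ({'NYC': 0, 'London': 1}, [0, 1, 0])
print(run_length_encode(["NYC", "NYC", "NYC", "London"]))
# [('NYC', 3), ('London', 1)]
print(delta_encode([1000, 1001, 1002]))
# [1000, 1, 1]
```

Each output is smaller and more regular than its input, which is why columnar engines chain these encodings before a general-purpose compressor like LZ4 or ZSTD.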


Time-Series Databases

InfluxDB

Purpose-built for metrics and events.

Write (line protocol):
  cpu,host=web01,region=us-east value=73.2 1704067200000000000

Query (Flux language):
  from(bucket: "metrics")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu" and r.host == "web01")
    |> aggregateWindow(every: 5m, fn: mean)

| Feature            | Detail                                  |
|--------------------|-----------------------------------------|
| Storage engine     | TSM (Time-Structured Merge Tree)        |
| Data model         | Measurement + tags + fields + timestamp |
| Retention policies | Auto-delete data after TTL              |
| Downsampling       | Continuous queries / tasks              |
| Best for           | DevOps metrics, IoT sensor data         |

TimescaleDB

PostgreSQL extension for time-series. Full SQL compatibility.

```sql
-- Create hypertable (auto-partitioned by time)
SELECT create_hypertable('metrics', 'time');

-- Insert like normal SQL
INSERT INTO metrics (time, host, cpu) VALUES (NOW(), 'web01', 73.2);

-- Query with full SQL power
SELECT time_bucket('5 minutes', time) AS bucket,
       host,
       AVG(cpu) AS avg_cpu
FROM metrics
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY bucket, host
ORDER BY bucket DESC;
```

| Feature     | Detail                                          |
|-------------|-------------------------------------------------|
| Engine      | PostgreSQL + chunked hypertables                |
| Data model  | Regular SQL tables, auto-partitioned by time    |
| Advantage   | Full SQL, JOIN with relational data, extensions |
| Compression | Native columnar compression (90%+ reduction)    |
| Best for    | When you need SQL + time-series in one system   |

Prometheus

Pull-based monitoring system with its own TSDB.

Architecture:
  +-------------+      +------------+      +-----------+
  | Application | <--- | Prometheus | ---> | Grafana   |
  | /metrics    | pull | TSDB       | query| Dashboard |
  +-------------+      +------------+      +-----------+
                             |
                        [Alert Manager]
```promql
# Average idle-CPU fraction over 5 minutes, by instance
avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

# 99th percentile request latency
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
```

| Feature    | Detail                                             |
|------------|----------------------------------------------------|
| Collection | Pull-based (scrapes HTTP endpoints)                |
| Storage    | Local TSDB (WAL + blocks, 2-hour chunks)           |
| Query      | PromQL (purpose-built query language)              |
| Retention  | Default 15 days (local), longer via Thanos/Cortex  |
| Best for   | Infrastructure monitoring, alerting                |

Analytics Databases

ClickHouse

Open-source columnar OLAP database. Extremely fast for analytical queries.

```sql
-- Create a MergeTree table (ClickHouse's core engine)
CREATE TABLE events (
    event_date Date,
    event_time DateTime,
    user_id    UInt64,
    event_type LowCardinality(String),
    properties String
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, user_id, event_time);

-- Query: count events by type in the last hour
SELECT event_type, count()
FROM events
WHERE event_time > now() - INTERVAL 1 HOUR
GROUP BY event_type
ORDER BY count() DESC;
```

| Feature      | Detail                                                            |
|--------------|-------------------------------------------------------------------|
| Storage      | Columnar, compressed, sorted by primary key                       |
| Engine types | MergeTree family (ReplacingMergeTree, AggregatingMergeTree, etc.) |
| Performance  | Scans billions of rows/sec on commodity hardware                  |
| Replication  | ZooKeeper-based (or ClickHouse Keeper)                            |
| Best for     | Real-time analytics, event tracking, log analysis                 |

Why ClickHouse is fast:

  • Columnar storage with aggressive compression
  • Vectorized query execution (processes data in batches, SIMD)
  • Sparse primary index (granule-based, not per-row)
  • Parallel query execution across cores and shards
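The sparse-index idea can be sketched in a few lines, using Python's `bisect` and a granule size of 4 to keep the example small (ClickHouse's default is 8,192 rows per granule; the helper name is illustrative):

```python
import bisect

# Sparse primary index sketch: instead of one index entry per row,
# store only the first sort-key value of each granule. The index
# stays tiny and fits in memory even for billions of rows.
GRANULE_SIZE = 4

sorted_keys = [3, 7, 9, 12, 15, 21, 22, 30, 31, 40, 44, 59]  # sorted column
index = sorted_keys[::GRANULE_SIZE]  # first key of each granule: [3, 15, 31]

def granule_to_scan(key):
    """Return the index of the single granule that may contain `key`."""
    # bisect_right finds the first index mark greater than the key;
    # the candidate granule is the one just before it.
    return max(bisect.bisect_right(index, key) - 1, 0)

g = granule_to_scan(22)
print(g)  # 1 -- only granule 1 (keys 15..30) needs to be read
print(sorted_keys[g * GRANULE_SIZE:(g + 1) * GRANULE_SIZE])  # [15, 21, 22, 30]
```

A point lookup thus reads one granule rather than the whole column, and a range scan reads only the granules whose marks overlap the range.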

BigQuery (Google)

Serverless analytics warehouse. No infrastructure to manage.

```sql
SELECT DATE(created_at) AS day,
       COUNT(DISTINCT user_id) AS dau
FROM `project.dataset.events`
WHERE created_at >= '2024-01-01'
GROUP BY day
ORDER BY day;
```

| Feature      | Detail                                                 |
|--------------|--------------------------------------------------------|
| Architecture | Separation of storage (Colossus) and compute (Dremel)  |
| Pricing      | Per-query (bytes scanned) + storage                    |
| Performance  | Massively parallel, auto-scales                        |
| Best for     | Ad-hoc analytics, data warehousing, ML features        |

Redshift (AWS)

Managed columnar data warehouse, originally built on ParAccel technology (itself a PostgreSQL fork).

| Feature             | Detail                                     |
|---------------------|--------------------------------------------|
| Architecture        | Cluster of nodes (leader + compute nodes)  |
| Storage             | Columnar, zone maps, sort keys             |
| Scaling             | Resize cluster or use Redshift Serverless  |
| Distribution styles | KEY, EVEN, ALL (affects join performance)  |
| Best for            | Enterprise data warehousing, BI dashboards |

Data Warehousing Concepts

Star Schema

The most common analytical schema. Simple, fast to query.

              +----------------+
              | dim_date       |
              | date_key (PK)  |
              | year, month,   |
              | quarter, ...   |
              +-------+--------+
                      |
+---------------+     |     +----------------+
| dim_product   |     |     | dim_customer   |
| product_key   +--+  |  +--+ customer_key   |
| name, category|  |  |  |  | name, region   |
+---------------+  |  |  |  +----------------+
                   |  |  |
              +----+--+--+----+
              |  fact_sales   |
              |  date_key (FK)|
              |  product_key  |
              |  customer_key |
              |  quantity     |
              |  revenue      |
              |  discount     |
              +---------------+

Properties:

  • Central fact table (measurable events: sales, clicks, orders)
  • Dimension tables radiate outward (who, what, when, where)
  • Denormalized dimensions (one JOIN to get any dimension attribute)
  • Optimized for aggregation queries
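The "one JOIN to get any dimension attribute" property can be seen in a toy in-memory version of the schema above (table contents are made up):

```python
# Toy star-schema aggregation: revenue by region needs exactly one
# lookup ("join") from each fact row into dim_customer.
dim_customer = {  # customer_key -> denormalized attributes
    1: {"name": "Acme",   "region": "EMEA"},
    2: {"name": "Globex", "region": "AMER"},
}

fact_sales = [  # each row: foreign keys + measures
    {"customer_key": 1, "quantity": 2, "revenue": 100.0},
    {"customer_key": 2, "quantity": 1, "revenue": 40.0},
    {"customer_key": 1, "quantity": 5, "revenue": 250.0},
]

# Equivalent of:
#   SELECT region, SUM(revenue)
#   FROM fact_sales JOIN dim_customer USING (customer_key)
#   GROUP BY region
revenue_by_region = {}
for row in fact_sales:
    region = dim_customer[row["customer_key"]]["region"]  # one hop per dimension
    revenue_by_region[region] = revenue_by_region.get(region, 0.0) + row["revenue"]

print(revenue_by_region)  # {'EMEA': 350.0, 'AMER': 40.0}
```

In a snowflake schema, `region` would live in a separate normalized table, so the same query would need a second hop per row.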

Snowflake Schema

Normalized dimensions. Reduces redundancy at the cost of more JOINs.

  dim_category          dim_product          fact_sales
  +------------+        +-------------+      +------------+
  | cat_key    | <----- | product_key | <--- | product_key|
  | cat_name   |        | name        |      | date_key   |
  +------------+        | cat_key (FK)|      | revenue    |
                        +-------------+      +------------+

Star vs Snowflake

| Property          | Star                  | Snowflake                    |
|-------------------|-----------------------|------------------------------|
| Query complexity  | Simpler (fewer JOINs) | More JOINs                   |
| Query performance | Faster                | Slower                       |
| Storage           | More redundancy       | Less redundancy              |
| ETL complexity    | Simpler               | More complex                 |
| Recommendation    | Default choice        | Only if storage is a concern |

Fact and Dimension Tables

| Aspect   | Fact Table                    | Dimension Table                 |
|----------|-------------------------------|---------------------------------|
| Content  | Measurements, metrics         | Descriptive attributes          |
| Size     | Very large (billions of rows) | Small to medium                 |
| Growth   | Rapid (new events constantly) | Slow (new products, users)      |
| Examples | Sales, clicks, impressions    | Products, users, dates, regions |
| Keys     | Foreign keys to dimensions    | Surrogate keys (auto-increment) |

Materialized Views and Pre-Aggregation

Materialized Views

Pre-computed query results stored as tables. Trade storage for query speed.

```sql
-- PostgreSQL materialized view
CREATE MATERIALIZED VIEW daily_revenue AS
SELECT date_trunc('day', order_time) AS day,
       product_category,
       SUM(revenue) AS total_revenue,
       COUNT(*)     AS order_count
FROM orders
GROUP BY 1, 2;

-- Refresh (full recomputation)
REFRESH MATERIALIZED VIEW daily_revenue;

-- ClickHouse: use AggregatingMergeTree for incremental materialized views
CREATE MATERIALIZED VIEW daily_revenue_mv
ENGINE = AggregatingMergeTree()
ORDER BY (day, product_category)
AS SELECT
    toDate(order_time) AS day,
    product_category,
    sumState(revenue) AS total_revenue,
    countState()      AS order_count
FROM orders
GROUP BY day, product_category;
-- (read back with sumMerge(total_revenue) / countMerge(order_count))
```

Rollup and Downsampling

Reduce storage by aggregating old high-resolution data into lower resolution.

Raw data (1-second resolution):
  12:00:01  cpu=72.3
  12:00:02  cpu=73.1
  12:00:03  cpu=71.8
  ...
  (86,400 points per day per metric)

After 7 days --> downsample to 1-minute:
  12:00  cpu_avg=72.4, cpu_max=73.1, cpu_min=71.8
  12:01  cpu_avg=74.2, cpu_max=76.0, cpu_min=73.5
  (1,440 points per day -- 60x reduction)

After 30 days --> downsample to 1-hour:
  12:00  cpu_avg=73.8, cpu_max=82.1, cpu_min=68.2
  (24 points per day -- 3,600x reduction)
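The 1-second to 1-minute rollup can be sketched in a few lines (sample values are illustrative):

```python
# Downsample 1-second samples into 1-minute avg/max/min buckets.
# Each sample is a (unix_seconds, value) pair; data is made up.
samples = [
    (43201, 72.3), (43202, 73.1), (43203, 71.8),   # 12:00 bucket
    (43261, 74.2), (43262, 76.0), (43263, 73.5),   # 12:01 bucket
]

buckets = {}
for ts, value in samples:
    minute = ts - ts % 60          # truncate timestamp to the minute
    buckets.setdefault(minute, []).append(value)

rollups = {
    minute: {
        "avg": round(sum(vs) / len(vs), 1),
        "max": max(vs),
        "min": min(vs),
    }
    for minute, vs in buckets.items()
}

for minute, agg in sorted(rollups.items()):
    print(minute, agg)
```

Storing avg, max, and min per bucket (rather than avg alone) preserves spikes that a plain average would hide.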

Retention policy example:

| Age         | Resolution | Storage per metric/day |
|-------------|------------|------------------------|
| 0-7 days    | 1 second   | 86,400 points          |
| 7-30 days   | 1 minute   | 1,440 points           |
| 30-365 days | 1 hour     | 24 points              |
| 1+ years    | 1 day      | 1 point                |
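The point counts in the table are just seconds per day divided by the resolution:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

resolutions = {"1 second": 1, "1 minute": 60, "1 hour": 3600, "1 day": 86400}
points_per_day = {name: SECONDS_PER_DAY // secs for name, secs in resolutions.items()}

print(points_per_day)
# {'1 second': 86400, '1 minute': 1440, '1 hour': 24, '1 day': 1}
```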

Lambda vs Kappa Architecture

Lambda Architecture

Separate batch and real-time processing paths. Merge results at query time.

                  +-------------------+
                  |   Data Source      |
                  +--------+----------+
                           |
              +------------+------------+
              |                         |
     +--------+---------+      +--------+---------+
     |   Batch Layer    |      |   Speed Layer    |
     |  (Hadoop/Spark)  |      |  (Flink/Storm)   |
     | Complete,        |      | Low-latency,     |
     | accurate results |      | approximate      |
     | (hours old)      |      | results          |
     +--------+---------+      +--------+---------+
              |                         |
     +--------+---------+      +--------+---------+
     |   Batch Views    |      |  Real-time Views |
     |  (pre-computed)  |      |  (incremental)   |
     +--------+---------+      +--------+---------+
              |                         |
              +------------+------------+
                           |
                  +--------+----------+
                  |   Serving Layer    |
                  | Merge batch +      |
                  | real-time results  |
                  +-------------------+

Pros: Batch layer is simple and recomputable; speed layer handles freshness.
Cons: Two codebases, complex merging logic, operational overhead.

Kappa Architecture

Single stream processing path handles everything. Reprocessing = replay the stream.

  Data Source --> Kafka (immutable log) --> Stream Processor (Flink)
                      |                           |
                      |                    +------+------+
                      |                    | Serving     |
                      |                    | Layer       |
                      |                    +-------------+
                      |
                 To reprocess:
                 Reset consumer offset to beginning
                 Replay all events through new processor version

Pros: One codebase, simpler operations, single source of truth (the log).
Cons: Reprocessing can be slow for large history; not all problems fit streaming.
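The replay idea can be illustrated with a toy in-memory log, with no Kafka API involved (the events and processing logic are made up):

```python
# Kappa-style reprocessing sketch: the log is immutable, and a new
# processor version rebuilds its serving view by replaying from offset 0.
log = [  # append-only event log (stand-in for a Kafka topic)
    {"user": "a", "amount": 10},
    {"user": "b", "amount": 5},
    {"user": "a", "amount": 7},
]

def process(events, amount_fn):
    """Fold events into a serving-layer view (running total per user)."""
    view = {}
    for e in events:
        view[e["user"]] = view.get(e["user"], 0) + amount_fn(e)
    return view

# v1 of the processor.
v1 = process(log, lambda e: e["amount"])
# v2 changes the logic (say, a unit conversion); instead of patching v1's
# state, we replay the entire log through the new code.
v2 = process(log, lambda e: e["amount"] * 2)

print(v1)  # {'a': 17, 'b': 5}
print(v2)  # {'a': 34, 'b': 10}
```

Because the log is the single source of truth, v2's view is built from scratch and v1's can simply be discarded once v2 has caught up.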

When to Use Which

| Scenario                                 | Recommendation                                 |
|------------------------------------------|------------------------------------------------|
| Legacy batch + new real-time need        | Lambda (add speed layer)                       |
| Greenfield analytics pipeline            | Kappa (simpler)                                |
| Complex ML training + real-time serving  | Lambda (batch for training)                    |
| Event-driven microservices               | Kappa (Kafka + Flink)                          |
| Both approaches are overkill             | Just use a good OLAP DB (ClickHouse, BigQuery) |

Interview Tips

  1. Know when to reach for a columnar store. If the interviewer mentions dashboards, analytics, or "aggregate over millions of rows," that is your signal.
  2. Star schema is the default. Don't overcomplicate with snowflake schema unless storage is explicitly constrained.
  3. Explain why columnar compression works. Same data types in a column compress far better than mixed types in a row. This is the key insight.
  4. Downsampling is essential for time-series at scale. Show you understand retention policies and the storage/resolution trade-off.
  5. Lambda vs Kappa is a common follow-up in streaming questions. Know the trade-offs, and lean toward Kappa for newer systems.
  6. Mention separation of storage and compute. Modern warehouses (BigQuery, Snowflake, Redshift Serverless) separate these, enabling independent scaling.

Resources

  • DDIA (Kleppmann) -- Chapter 3: Storage and Retrieval (column-oriented storage), Chapter 10: Batch Processing, Chapter 11: Stream Processing
  • System Design Interview Vol. 2 (Alex Xu) -- Metrics monitoring and alerting system
  • ClickHouse Documentation: MergeTree engine family, query optimization
  • "The Lambda Architecture" -- Nathan Marz (original blog post)
  • "Questioning the Lambda Architecture" -- Jay Kreps (Kappa proposal)
  • Google Dremel Paper: "Dremel: Interactive Analysis of Web-Scale Datasets"
  • Prometheus Documentation: TSDB storage, PromQL
  • "The Data Warehouse Toolkit" (Ralph Kimball) -- Star schema, dimensional modeling
  • TimescaleDB Documentation: Hypertables, continuous aggregates
  • InfluxDB Documentation: TSM engine, Flux query language

Previous: 21 - Search Systems | Next: 23 - Distributed Task Scheduling