22 - Time-Series & Analytics Databases
Previous: 21 - Search Systems | Next: 23 - Distributed Task Scheduling
Time-Series Data Characteristics
Time-series data is fundamentally different from transactional data. Understanding these differences drives every architectural decision.
Example time-series record:
{ metric: "cpu_usage", host: "web-01", value: 73.2, timestamp: 1704067200 }
Characteristics:
- Append-heavy (write-once, read-many)
- Naturally ordered by time
- High cardinality (many unique tag combinations)
- Recent data queried far more than old data
- Aggregation queries dominate (avg, sum, percentile over time ranges)
- Individual records rarely updated or deleted
| Property | OLTP (Transactional) | Time-Series / OLAP |
|---|---|---|
| Write pattern | Random inserts/updates | Sequential appends |
| Read pattern | Point lookups by PK | Range scans over time |
| Query type | Single-row CRUD | Aggregations over millions of rows |
| Data lifecycle | Retained indefinitely | TTL / downsampled over time |
| Schema | Normalized (3NF) | Denormalized, wide tables |
| Index strategy | B-tree on PK/FK | Time-partitioned, columnar |
OLTP vs OLAP
OLTP OLAP
+------------------------------+ +------------------------------+
| User places order | | "What was total revenue |
| UPDATE inventory SET qty=.. | | by region last quarter?" |
| INSERT INTO orders ... | | |
| | | SELECT region, SUM(revenue) |
| Latency: < 10ms | | FROM sales |
| Rows affected: 1-10 | | WHERE date BETWEEN ... |
| Concurrency: thousands | | GROUP BY region |
+------------------------------+ | |
| Latency: seconds to minutes |
| Rows scanned: millions |
| Concurrency: tens |
+------------------------------+
| Dimension | OLTP | OLAP |
|---|---|---|
| Users | Application users | Analysts, dashboards |
| Data | Current state | Historical |
| Schema | Normalized | Star/snowflake |
| Optimized for | Writes, point reads | Reads, aggregations |
| Storage | Row-oriented | Column-oriented |
| Examples | PostgreSQL, MySQL | ClickHouse, BigQuery, Redshift |
Columnar Storage: Why It Is Fast for Analytics
Row vs Column Storage Layout
Row-oriented (PostgreSQL, MySQL):
Row 1: | user_id=1 | name="Alice" | age=30 | city="NYC" |
Row 2: | user_id=2 | name="Bob" | age=25 | city="London" |
Row 3: | user_id=3 | name="Carol" | age=35 | city="NYC" |
Disk: [1,Alice,30,NYC] [2,Bob,25,London] [3,Carol,35,NYC]
Column-oriented (ClickHouse, Parquet):
user_id: [1, 2, 3]
name: ["Alice", "Bob", "Carol"]
age: [30, 25, 35]
city: ["NYC", "London", "NYC"]
Disk: [1,2,3] [Alice,Bob,Carol] [30,25,35] [NYC,London,NYC]
Why Columnar Wins for Analytics
Query: SELECT AVG(age) FROM users WHERE city = 'NYC'
Row store: Must read ALL columns for EVERY row
Read: [1,Alice,30,NYC] [2,Bob,25,London] [3,Carol,35,NYC]
IO: 3 full rows * all columns = lots of wasted reads
Column store: Only read the 2 columns needed
Read: city = [NYC, London, NYC] --> filter: rows 0, 2
Read: age = [30, 35] --> AVG = 32.5
IO: only 2 columns, skip the rest
Three key advantages:
- Read amplification reduction: Only read columns referenced in the query
- Compression: Same-type values compress extremely well (run-length encoding, dictionary encoding, delta encoding)
- Vectorized execution: Process entire column vectors using SIMD instructions
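The read-amplification advantage can be sketched in a few lines of plain Python: the same query touches every field in a row layout, but only two columns in a columnar layout (dataset and names invented for illustration).

```python
# Illustrative sketch: the query SELECT AVG(age) WHERE city = 'NYC'
# against row-oriented vs column-oriented layouts of a tiny dataset.

rows = [
    (1, "Alice", 30, "NYC"),
    (2, "Bob", 25, "London"),
    (3, "Carol", 35, "NYC"),
]

# Column-oriented layout: one list per column.
columns = {
    "user_id": [1, 2, 3],
    "name": ["Alice", "Bob", "Carol"],
    "age": [30, 25, 35],
    "city": ["NYC", "London", "NYC"],
}

def avg_age_row_store(rows):
    # Row store: every row (all four fields) is touched.
    ages = [age for (_, _, age, city) in rows if city == "NYC"]
    return sum(ages) / len(ages)

def avg_age_column_store(columns):
    # Column store: only the 'city' and 'age' columns are touched.
    keep = [i for i, c in enumerate(columns["city"]) if c == "NYC"]
    ages = [columns["age"][i] for i in keep]
    return sum(ages) / len(ages)

print(avg_age_row_store(rows))      # 32.5
print(avg_age_column_store(columns))  # 32.5
```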
Compression Techniques
| Technique | Best For | Example |
|---|---|---|
| Dictionary encoding | Low-cardinality strings | city: {0: "NYC", 1: "London"} -> [0, 1, 0] |
| Run-length encoding | Sorted repeated values | [NYC, NYC, NYC, London] -> [(NYC, 3), (London, 1)] |
| Delta encoding | Timestamps, sequences | [1000, 1001, 1002] -> [1000, +1, +1] |
| Bit packing | Small integers | Store 3-bit values packed into bytes |
| LZ4 / ZSTD | General blocks | Block-level compression after encoding |
Real-world compression ratios: 5-10x is typical for columnar stores. ClickHouse often achieves 10-20x on real workloads.
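Three of the encodings in the table are simple enough to sketch directly; these are minimal illustrations, not production encoders.

```python
# Minimal sketches of three encodings from the table above.

def run_length_encode(values):
    """[NYC, NYC, NYC, London] -> [(NYC, 3), (London, 1)]"""
    out = []
    for v in values:
        if out and out[-1][0] == v:
            out[-1] = (v, out[-1][1] + 1)
        else:
            out.append((v, 1))
    return out

def dictionary_encode(values):
    """Low-cardinality strings -> small integer codes plus a dictionary."""
    codes, mapping = [], {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
        codes.append(mapping[v])
    return codes, mapping

def delta_encode(values):
    """[1000, 1001, 1002] -> [1000, 1, 1] (first value, then diffs)."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

print(run_length_encode(["NYC", "NYC", "NYC", "London"]))
print(dictionary_encode(["NYC", "London", "NYC"]))
print(delta_encode([1000, 1001, 1002]))
```

Note how the encodings compose: dictionary-encoded codes over sorted data become long runs for RLE, and the result is then compressed again at the block level with LZ4/ZSTD.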
Time-Series Databases
InfluxDB
Purpose-built for metrics and events.
Write (line protocol):
```
cpu,host=web01,region=us-east value=73.2 1704067200000000000
```
Query (Flux language):
```flux
from(bucket: "metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu" and r.host == "web01")
  |> aggregateWindow(every: 5m, fn: mean)
```
| Feature | Detail |
|---|---|
| Storage engine | TSM (Time-Structured Merge Tree) |
| Data model | Measurement + tags + fields + timestamp |
| Retention policies | Auto-delete data after TTL |
| Downsampling | Continuous queries / tasks |
| Best for | DevOps metrics, IoT sensor data |
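A hedged sketch of how a line-protocol point is assembled (the helper name is invented; the real client libraries also handle type suffixes and escaping):

```python
# Hypothetical helper that formats an InfluxDB line-protocol point:
#   measurement,tag=... field=value timestamp-in-ns
# Illustration only -- real clients also escape special characters
# and mark integer fields with an "i" suffix.

def to_line_protocol(measurement, tags, fields, ts_ns):
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = to_line_protocol(
    "cpu",
    {"host": "web01", "region": "us-east"},
    {"value": 73.2},
    1704067200000000000,
)
print(line)
# cpu,host=web01,region=us-east value=73.2 1704067200000000000
```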
TimescaleDB
PostgreSQL extension for time-series. Full SQL compatibility.
```sql
-- Create hypertable (auto-partitioned by time)
SELECT create_hypertable('metrics', 'time');

-- Insert like normal SQL
INSERT INTO metrics (time, host, cpu) VALUES (NOW(), 'web01', 73.2);

-- Query with full SQL power
SELECT time_bucket('5 minutes', time) AS bucket,
       host,
       AVG(cpu) AS avg_cpu
FROM metrics
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY bucket, host
ORDER BY bucket DESC;
```
| Feature | Detail |
|---|---|
| Engine | PostgreSQL + chunked hypertables |
| Data model | Regular SQL tables, auto-partitioned by time |
| Advantage | Full SQL, JOIN with relational data, extensions |
| Compression | Native columnar compression (90%+ reduction) |
| Best for | When you need SQL + time-series in one system |
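The effect of `time_bucket` can be approximated in stock SQLite for illustration: truncate epoch seconds to 5-minute windows with integer division (table and column names invented for the sketch).

```python
# Approximating TimescaleDB's time_bucket('5 minutes', ...) in SQLite:
# (ts / 300) * 300 truncates an epoch-seconds timestamp to its bucket.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (ts INTEGER, host TEXT, cpu REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [(1704067200, "web01", 70.0),
     (1704067260, "web01", 74.0),   # same 5-minute bucket
     (1704067500, "web01", 80.0)],  # next bucket
)

rows = conn.execute(
    """
    SELECT (ts / 300) * 300 AS bucket, host, AVG(cpu) AS avg_cpu
    FROM metrics
    GROUP BY bucket, host
    ORDER BY bucket
    """
).fetchall()
print(rows)  # [(1704067200, 'web01', 72.0), (1704067500, 'web01', 80.0)]
```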
Prometheus
Pull-based monitoring system with its own TSDB.
Architecture:
+-------------+ +------------+ +-----------+
| Application | <--- | Prometheus | ---> | Grafana |
| /metrics | pull | TSDB | query| Dashboard |
+-------------+ +------------+ +-----------+
|
[Alert Manager]
```promql
# Average idle-CPU rate over 5 minutes, by instance
avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

# 99th percentile request latency
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
```
| Feature | Detail |
|---|---|
| Collection | Pull-based (scrapes HTTP endpoints) |
| Storage | Local TSDB (WAL + blocks, 2-hour chunks) |
| Query | PromQL (purpose-built query language) |
| Retention | Default 15 days (local), longer via Thanos/Cortex |
| Best for | Infrastructure monitoring, alerting |
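The core of PromQL's `rate()` is the per-second increase of a counter over a window; a simplified sketch over hand-made samples (real Prometheus additionally extrapolates to the window boundaries and handles counter resets):

```python
# Simplified sketch of what rate() computes for a counter: per-second
# increase over the window, from (timestamp, value) samples.

def simple_rate(samples):
    """Per-second rate between first and last sample in the window.
    (Real Prometheus also extrapolates and handles counter resets.)"""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

# A counter scraped every 15s over a 5-minute window, growing 2/sec:
samples = [(t, t * 2.0) for t in range(0, 301, 15)]
print(simple_rate(samples))  # 2.0
```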
Analytics Databases
ClickHouse
Open-source columnar OLAP database. Extremely fast for analytical queries.
```sql
-- Create a MergeTree table (ClickHouse's core engine)
CREATE TABLE events (
    event_date Date,
    event_time DateTime,
    user_id UInt64,
    event_type LowCardinality(String),
    properties String
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, user_id, event_time);

-- Query: count events by type in the last hour
SELECT event_type, count()
FROM events
WHERE event_time > now() - INTERVAL 1 HOUR
GROUP BY event_type
ORDER BY count() DESC;
```
| Feature | Detail |
|---|---|
| Storage | Columnar, compressed, sorted by primary key |
| Engine types | MergeTree family (ReplacingMergeTree, AggregatingMergeTree, etc.) |
| Performance | Scans billions of rows/sec on commodity hardware |
| Replication | ZooKeeper-based (or ClickHouse Keeper) |
| Best for | Real-time analytics, event tracking, log analysis |
Why ClickHouse is fast:
- Columnar storage with aggressive compression
- Vectorized query execution (processes data in batches, SIMD)
- Sparse primary index (granule-based, not per-row)
- Parallel query execution across cores and shards
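The sparse-index idea can be sketched in a few lines: store one "mark" per granule of a key-sorted column instead of one index entry per row, then skip whole granules whose key range cannot match the query (granule size and keys invented for illustration).

```python
# Sketch of a ClickHouse-style sparse primary index: one mark per
# granule (here, every 4 rows of a key-sorted column), so whole
# granules can be skipped during a range scan.

GRANULE = 4
sorted_keys = [10, 12, 15, 20, 31, 35, 40, 44, 52, 58, 60, 67]

# One mark per granule: the first key stored in that granule.
marks = [sorted_keys[i] for i in range(0, len(sorted_keys), GRANULE)]

def granules_to_scan(lo, hi):
    """Granule g covers keys [marks[g], next mark); return granules
    whose key range overlaps the query range [lo, hi]."""
    hits = []
    for g, start in enumerate(marks):
        end = marks[g + 1] if g + 1 < len(marks) else float("inf")
        if start <= hi and end > lo:
            hits.append(g)
    return hits

print(granules_to_scan(33, 45))  # [1] -- only the middle granule is read
```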
BigQuery (Google)
Serverless analytics warehouse. No infrastructure to manage.
```sql
SELECT DATE(created_at) AS day,
       COUNT(DISTINCT user_id) AS dau
FROM `project.dataset.events`
WHERE created_at >= '2024-01-01'
GROUP BY day
ORDER BY day;
```
| Feature | Detail |
|---|---|
| Architecture | Separation of storage (Colossus) and compute (Dremel) |
| Pricing | Per-query (bytes scanned) + storage |
| Performance | Massively parallel, auto-scales |
| Best for | Ad-hoc analytics, data warehousing, ML features |
Redshift (AWS)
Managed columnar data warehouse based on ParAccel (PostgreSQL fork).
| Feature | Detail |
|---|---|
| Architecture | Cluster of nodes (leader + compute nodes) |
| Storage | Columnar, zone maps, sort keys |
| Scaling | Resize cluster or use Redshift Serverless |
| Distribution styles | KEY, EVEN, ALL (affects join performance) |
| Best for | Enterprise data warehousing, BI dashboards |
Data Warehousing Concepts
Star Schema
The most common analytical schema. Simple, fast to query.
+----------------+
| dim_date |
| date_key (PK) |
| year, month, |
| quarter, ... |
+-------+--------+
|
+---------------+ | +----------------+
| dim_product | | | dim_customer |
| product_key +--+ | +--+ customer_key |
| name, category| | | | | name, region |
+---------------+ | | | +----------------+
| | |
+----+--+--+----+
| fact_sales |
| date_key (FK) |
| product_key |
| customer_key |
| quantity |
| revenue |
| discount |
+----------------+
Properties:
- Central fact table (measurable events: sales, clicks, orders)
- Dimension tables radiate outward (who, what, when, where)
- Denormalized dimensions (one JOIN to get any dimension attribute)
- Optimized for aggregation queries
Snowflake Schema
Normalized dimensions. Reduces redundancy at the cost of more JOINs.
dim_category dim_product fact_sales
+------------+ +-------------+ +------------+
| cat_key | <----- | product_key | <--- | product_key|
| cat_name | | name | | date_key |
+------------+ | cat_key (FK)| | revenue |
+-------------+ +------------+
Star vs Snowflake
| Property | Star | Snowflake |
|---|---|---|
| Query complexity | Simpler (fewer JOINs) | More JOINs |
| Query performance | Faster | Slower |
| Storage | More redundancy | Less redundancy |
| ETL complexity | Simpler | More complex |
| Recommendation | Default choice | Only if storage is a concern |
Fact and Dimension Tables
| Aspect | Fact Table | Dimension Table |
|---|---|---|
| Content | Measurements, metrics | Descriptive attributes |
| Size | Very large (billions of rows) | Small to medium |
| Growth | Rapid (new events constantly) | Slow (new products, users) |
| Examples | Sales, clicks, impressions | Products, users, dates, regions |
| Keys | Foreign keys to dimensions | Surrogate keys (auto-increment) |
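A star-schema query in miniature, using SQLite for illustration: the fact table is aggregated with one JOIN per dimension (names mirror the diagram above; only one dimension is shown).

```python
# The star schema above, sketched in SQLite: aggregate the fact table,
# one JOIN to a denormalized dimension.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY,
                               name TEXT, region TEXT);
    CREATE TABLE fact_sales (customer_key INTEGER, revenue REAL);
""")
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                 [(1, "Alice", "NA"), (2, "Bob", "EU")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                 [(1, 100.0), (1, 50.0), (2, 75.0)])

# Typical star-schema query: one hop from fact to dimension attribute.
rows = conn.execute("""
    SELECT c.region, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_customer c ON c.customer_key = f.customer_key
    GROUP BY c.region ORDER BY c.region
""").fetchall()
print(rows)  # [('EU', 75.0), ('NA', 150.0)]
```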
Materialized Views and Pre-Aggregation
Materialized Views
Pre-computed query results stored as tables. Trade storage for query speed.
```sql
-- PostgreSQL materialized view
CREATE MATERIALIZED VIEW daily_revenue AS
SELECT date_trunc('day', order_time) AS day,
       product_category,
       SUM(revenue) AS total_revenue,
       COUNT(*) AS order_count
FROM orders
GROUP BY 1, 2;

-- Refresh (full recomputation)
REFRESH MATERIALIZED VIEW daily_revenue;

-- ClickHouse: use AggregatingMergeTree for incremental materialized views
CREATE MATERIALIZED VIEW daily_revenue_mv
ENGINE = AggregatingMergeTree()
ORDER BY (day, product_category)
AS SELECT
    toDate(order_time) AS day,
    product_category,
    sumState(revenue) AS total_revenue,
    countState() AS order_count
FROM orders
GROUP BY day, product_category;
```
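The incremental variant boils down to keeping a partial aggregate state per group and folding each new row into it, rather than re-scanning the base table on every refresh; a minimal sketch:

```python
# Sketch of the idea behind incremental materialized views: a partial
# aggregate state per (day, category) group, updated row by row.
from collections import defaultdict

state = defaultdict(lambda: {"sum": 0.0, "count": 0})

def ingest(day, category, revenue):
    s = state[(day, category)]
    s["sum"] += revenue
    s["count"] += 1

ingest("2024-01-01", "books", 10.0)
ingest("2024-01-01", "books", 15.0)
ingest("2024-01-02", "books", 7.0)

s = state[("2024-01-01", "books")]
print(s["sum"], s["count"])  # 25.0 2
```

This is why sum and count states (rather than a stored average) appear in such views: partial states merge cleanly, while averages do not.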
Rollup and Downsampling
Reduce storage by aggregating old high-resolution data into lower resolution.
Raw data (1-second resolution):
12:00:01 cpu=72.3
12:00:02 cpu=73.1
12:00:03 cpu=71.8
...
(86,400 points per day per metric)
After 7 days --> downsample to 1-minute:
12:00 cpu_avg=72.4, cpu_max=73.1, cpu_min=71.8
12:01 cpu_avg=74.2, cpu_max=76.0, cpu_min=73.5
(1,440 points per day -- 60x reduction)
After 30 days --> downsample to 1-hour:
12:00 cpu_avg=73.8, cpu_max=82.1, cpu_min=68.2
(24 points per day -- 3,600x reduction)
Retention policy example:
| Age | Resolution | Storage per metric/day |
|---|---|---|
| 0-7 days | 1 second | 86,400 points |
| 7-30 days | 1 minute | 1,440 points |
| 30-365 days | 1 hour | 24 points |
| 1+ years | 1 day | 1 point |
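The rollup step itself is a small grouping operation; a sketch that buckets 1-second samples into 1-minute avg/max/min (timestamps invented):

```python
# Sketch of the downsampling step above: roll raw (epoch_seconds, value)
# samples up into fixed-width avg/max/min buckets.
from collections import defaultdict

def downsample(points, bucket_seconds=60):
    """points: list of (epoch_seconds, value) -> {bucket: (avg, max, min)}"""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return {b: (round(sum(v) / len(v), 2), max(v), min(v))
            for b, v in sorted(buckets.items())}

raw = [(43201, 72.3), (43202, 73.1), (43203, 71.8), (43262, 74.2)]
print(downsample(raw))
# {43200: (72.4, 73.1, 71.8), 43260: (74.2, 74.2, 74.2)}
```

Note that max and min must be kept alongside the average: once downsampled, a spike inside a bucket is unrecoverable from the average alone.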
Lambda vs Kappa Architecture
Lambda Architecture
Separate batch and real-time processing paths. Merge results at query time.
+-------------------+
| Data Source |
+--------+----------+
|
+------------+------------+
| |
+--------+--------+ +--------+--------+
| Batch Layer | | Speed Layer |
| (Hadoop/Spark) | | (Flink/Storm) |
| Complete, accurate| | Low-latency, |
| Hours-old results | | approximate |
+--------+---------+ +--------+---------+
| |
+--------+---------+ +--------+---------+
| Batch Views | | Real-time Views |
| (pre-computed) | | (incremental) |
+--------+---------+ +--------+---------+
| |
+------------+------------+
|
+--------+----------+
| Serving Layer |
| Merge batch + |
| real-time results |
+-------------------+
Pros: Batch layer is simple and recomputable; speed layer handles freshness.
Cons: Two codebases, complex merging logic, operational overhead.
Kappa Architecture
Single stream processing path handles everything. Reprocessing = replay the stream.
Data Source --> Kafka (immutable log) --> Stream Processor (Flink)
| |
| +------+------+
| | Serving |
| | Layer |
| +-------------+
|
To reprocess:
Reset consumer offset to beginning
Replay all events through new processor version
Pros: One codebase, simpler operations, single source of truth (the log).
Cons: Reprocessing can be slow for large history; not all problems fit streaming.
When to Use Which
| Scenario | Recommendation |
|---|---|
| Legacy batch + new real-time need | Lambda (add speed layer) |
| Greenfield analytics pipeline | Kappa (simpler) |
| Complex ML training + real-time serving | Lambda (batch for training) |
| Event-driven microservices | Kappa (Kafka + Flink) |
| Both approaches are overkill | Just use a good OLAP DB (ClickHouse, BigQuery) |
Interview Tips
- Know when to reach for a columnar store. If the interviewer mentions dashboards, analytics, or "aggregate over millions of rows," that is your signal.
- Star schema is the default. Don't overcomplicate with snowflake schema unless storage is explicitly constrained.
- Explain why columnar compression works. Same data types in a column compress far better than mixed types in a row. This is the key insight.
- Downsampling is essential for time-series at scale. Show you understand retention policies and the storage/resolution trade-off.
- Lambda vs Kappa is a common follow-up in streaming questions. Know the trade-offs, and lean toward Kappa for newer systems.
- Mention separation of storage and compute. Modern warehouses (BigQuery, Snowflake, Redshift Serverless) separate these, enabling independent scaling.
Resources
- DDIA (Kleppmann) -- Chapter 3: Storage and Retrieval (column-oriented storage), Chapter 10: Batch Processing, Chapter 11: Stream Processing
- System Design Interview Vol. 2 (Alex Xu) -- Metrics monitoring and alerting system
- ClickHouse Documentation: MergeTree engine family, query optimization
- "The Lambda Architecture" -- Nathan Marz (original blog post)
- "Questioning the Lambda Architecture" -- Jay Kreps (Kappa proposal)
- Google Dremel Paper: "Dremel: Interactive Analysis of Web-Scale Datasets"
- Prometheus Documentation: TSDB storage, PromQL
- "The Data Warehouse Toolkit" (Ralph Kimball) -- Star schema, dimensional modeling
- TimescaleDB Documentation: Hypertables, continuous aggregates
- InfluxDB Documentation: TSM engine, Flux query language