Microservices Architecture

Why This Matters

Every FAANG company runs microservices at scale. You'll be expected to decompose systems into services, handle inter-service communication, and understand the trade-offs.


Monolith vs Microservices

Monolith

[Single deployable unit]
├── User module
├── Order module
├── Payment module
├── Notification module
└── Shared database

Pros: Simple deployment, no network overhead, easy debugging, ACID transactions
Cons: Scales as one unit, long deployment cycles, team bottlenecks, technology lock-in

Microservices

[User Service]      [Order Service]      [Payment Service]
  ↕ own DB            ↕ own DB             ↕ own DB
       ↕                   ↕                    ↕
  [Message Bus / API Gateway / Service Mesh]

Pros: Independent deployment, team autonomy, technology diversity, targeted scaling
Cons: Network complexity, distributed transactions, operational overhead, harder debugging

When to Start with Microservices?

  • Don't. Start monolith, extract services when you have clear domain boundaries
  • Extract when: team size grows, deployment conflicts increase, different scaling needs emerge
  • "Monolith first" — Martin Fowler

Service Decomposition

Domain-Driven Design (DDD)

Decompose by bounded contexts — areas of the business with clear boundaries.

E-commerce:
  Bounded Context: Catalog (products, categories, search)
  Bounded Context: Orders (cart, checkout, order history)
  Bounded Context: Payments (charges, refunds, invoices)
  Bounded Context: Shipping (tracking, carriers, addresses)
  Bounded Context: Users (auth, profiles, preferences)

Each bounded context becomes a service (or group of services).

Single Responsibility

Each service should:

  • Own one business capability
  • Own its data (no shared databases!)
  • Be deployable independently
  • Be owned by one team

Size Guidelines

  • "Can be rewritten in 2 weeks" (Amazon guideline)
  • "Two-pizza team" can own it (6-10 people)
  • If it needs constant coordination with another service → maybe merge them

API Gateway

Client → API Gateway → User Service
                     → Order Service
                     → Payment Service

Responsibilities

  • Routing — route /users/* to User Service, /orders/* to Order Service
  • Authentication — verify JWT/OAuth tokens before forwarding
  • Rate limiting — protect backends from abuse
  • Load balancing — distribute requests across service instances
  • Response aggregation — combine multiple service responses into one
  • Protocol translation — REST ↔ gRPC, HTTP ↔ WebSocket
  • Caching — cache GET responses
  • Circuit breaking — stop forwarding requests to failing services

BFF (Backend for Frontend)

Mobile App → Mobile BFF → Services (optimized payloads)
Web App    → Web BFF    → Services (different data needs)

Each frontend gets a dedicated API gateway optimized for its needs.

Technologies

  • Kong, AWS API Gateway, Envoy, Traefik, Netflix Zuul, NGINX

Service Mesh

What Is It?

Infrastructure layer that handles service-to-service communication via sidecar proxies.

[Service A] ↔ [Sidecar Proxy A] ↔ [Sidecar Proxy B] ↔ [Service B]
                     ↕
              [Control Plane]

What It Handles (So Your Code Doesn't Have To)

  • Service discovery — find other services
  • Load balancing — distribute traffic
  • Encryption — mTLS between services
  • Observability — metrics, traces, logs
  • Circuit breaking — stop cascading failures
  • Retry & timeout — automatic retry logic
  • Traffic splitting — canary, A/B testing
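For intuition, here is roughly what a mesh's circuit breaking does, sketched as application code. The thresholds and class name are invented; the point of a mesh is that the sidecar does this so your code doesn't have to:

```python
import time


class CircuitBreaker:
    """Open the circuit after `max_failures` consecutive errors; while
    open, fail fast instead of calling the unhealthy downstream service."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one probe request through
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

Failing fast is what stops a cascading failure: callers stop queueing up behind a dead service.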

Technologies

  • Istio (most popular, complex)
  • Linkerd (simpler, lighter)
  • Envoy (the underlying proxy that Istio and others build on)
  • AWS App Mesh

When to Use Service Mesh

  • Large number of services (50+)
  • Need consistent observability/security across services
  • Multiple teams, want to standardize communication patterns
  • Don't use for small systems (overhead not worth it)

Service Discovery

Problem

Services are dynamic — IPs change, instances scale up/down. How does Service A find Service B?

Client-Side Discovery

Service A → Service Registry → "Service B is at 10.0.1.5:8080, 10.0.1.6:8080"
Service A → Load balance locally → 10.0.1.5:8080
  • Client queries registry, does its own load balancing
  • Used by: Netflix Eureka + Ribbon
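The client-side pattern can be sketched in a few lines — look up instances, then load-balance locally. The in-memory `REGISTRY` dict stands in for a real registry like Eureka or Consul queried over the network:

```python
import itertools

# Toy registry: service name → healthy instance addresses. In practice
# this comes from Eureka/Consul/etcd and is refreshed as instances churn.
REGISTRY = {
    "payment-service": ["10.0.1.5:8080", "10.0.1.6:8080"],
}

# One round-robin iterator per service: the client spreads load itself.
_round_robins = {}


def resolve(service: str) -> str:
    """Client-side discovery: query the registry, load-balance locally."""
    if service not in _round_robins:
        _round_robins[service] = itertools.cycle(REGISTRY[service])
    return next(_round_robins[service])


# resolve("payment-service") alternates 10.0.1.5:8080, 10.0.1.6:8080, ...
```

The trade-off versus server-side discovery: no extra hop, but every client needs this logic (and registry credentials) baked in.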

Server-Side Discovery

Service A → Load Balancer → Service B (LB knows instances)
                           (LB queries registry)
  • Client sends to LB, LB routes to correct instance
  • Used by: AWS ALB + ECS, Kubernetes Services

Service Registries

  • etcd — distributed KV store (Kubernetes uses it)
  • Consul — service discovery + health checks + KV
  • ZooKeeper — coordination service (older)
  • Kubernetes DNS — built-in service discovery via DNS

Inter-Service Communication Patterns

Synchronous

Order Service --HTTP/gRPC-→ Payment Service
               (waits for response)
  • Simple, intuitive
  • Creates coupling and latency chain
  • Use for: queries, simple request-response

Asynchronous (Event-Driven)

Order Service --event-→ Message Broker --event-→ Payment Service
               (fire and forget)               (processes when ready)
  • Decoupled, resilient, scalable
  • Harder to debug, eventual consistency
  • Use for: commands, notifications, data sync
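The decoupling shape can be shown with a minimal in-memory broker. Topic names and payloads are invented, and delivery here is synchronous for brevity — a real broker (Kafka, RabbitMQ) delivers asynchronously and durably:

```python
from collections import defaultdict


class Broker:
    """In-memory stand-in for a message broker (Kafka, RabbitMQ, ...)."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Fire and forget: the publisher knows nothing about consumers.
        for handler in self.subscribers[topic]:
            handler(event)


broker = Broker()
charged = []

# Payment service reacts to order events whenever they arrive.
broker.subscribe("order.created", lambda e: charged.append(e["order_id"]))

# Order service emits the event and moves on — no coupling, no waiting.
broker.publish("order.created", {"order_id": 42, "amount": 99.0})
```

Note the inversion: the order service never names the payment service, so either side can be deployed, scaled, or replaced independently.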

Request-Reply over Messages

Order Service → Request Queue → Payment Service
Payment Service → Reply Queue → Order Service
  • Async but with response
  • Correlation ID links request to reply
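The correlation-ID mechanics can be sketched with two queues. The message fields are made up, and the worker runs inline here purely so the example is self-contained — in reality it runs in a separate process consuming the request queue:

```python
import queue
import uuid

request_q = queue.Queue()   # order service → payment service
reply_q = queue.Queue()     # payment service → order service


def payment_worker():
    """Payment service: process one request, echo its correlation ID."""
    msg = request_q.get()
    reply_q.put({"correlation_id": msg["correlation_id"],
                 "status": "charged", "amount": msg["amount"]})


def charge(amount):
    """Order service: send a request, match the reply by correlation ID."""
    corr_id = str(uuid.uuid4())
    request_q.put({"correlation_id": corr_id, "amount": amount})
    payment_worker()                 # in reality: another process, later
    reply = reply_q.get()
    assert reply["correlation_id"] == corr_id  # links reply to request
    return reply["status"]
```

Because replies can arrive out of order on a shared reply queue, the correlation ID is what lets the caller pair each reply with its request.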

Data Management in Microservices

Database per Service (Critical!)

User Service → User DB (PostgreSQL)
Order Service → Order DB (MySQL)
Search Service → Search Index (Elasticsearch)
Analytics Service → Analytics DB (ClickHouse)

Never share databases between services. This creates coupling.

Data Consistency

  • Use Saga pattern for distributed transactions (see 13 - Distributed Transactions)
  • Accept eventual consistency where possible
  • Use events to propagate changes between services

CQRS for Complex Reads

Commands → Write Service → Write DB → Events → Read Service → Read DB (denormalized)
Queries  → Read Service  → Read DB (optimized for queries)
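The flow above can be sketched with an event log on the write side and a denormalized view on the read side. Event shapes and names are illustrative, and the projection is called directly here — in production the event would travel through a broker:

```python
# Write side stores events; read side folds them into a query-optimized
# view. All names here are illustrative.

events = []          # event log produced by the write service
order_totals = {}    # read model: customer → lifetime spend


def handle_command(customer, amount):
    """Write service: validate, persist the event, then publish it."""
    event = {"type": "OrderPlaced", "customer": customer, "amount": amount}
    events.append(event)
    project(event)   # in production, delivered asynchronously via a broker


def project(event):
    """Read service: update the denormalized view from each event."""
    if event["type"] == "OrderPlaced":
        c = event["customer"]
        order_totals[c] = order_totals.get(c, 0) + event["amount"]


handle_command("alice", 30)
handle_command("alice", 12)
# The query side now answers from order_totals without touching the
# write model — at the cost of a short replication lag in real systems.
```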

Common Anti-Patterns

  • Distributed monolith — services are tightly coupled and must deploy together. Fix: proper bounded contexts, async communication
  • Shared database — any service can read/write any table. Fix: database per service
  • Chatty services — too many inter-service calls per request. Fix: aggregate APIs, BFF, caching
  • Mega service — one service does too much. Fix: decompose by domain
  • Nano services — too many tiny services. Fix: merge related services
  • No API versioning — breaking changes cascade. Fix: version APIs, maintain backwards compatibility

Observability in Microservices

The Three Pillars

  1. Logs — structured logging with correlation IDs
  2. Metrics — request rate, error rate, latency (RED method)
  3. Traces — distributed tracing across services (Jaeger, Zipkin)

Correlation ID

Client → API Gateway (generates correlation-id: abc-123)
  → Service A (logs with correlation-id: abc-123)
    → Service B (logs with correlation-id: abc-123)
      → Service C (logs with correlation-id: abc-123)

One ID traces a request across all services.
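Propagation is the whole trick: each service reuses the incoming ID (or mints one at the edge) and tags every log line with it. A minimal sketch, assuming an `X-Correlation-ID` header name — a common convention, not a standard:

```python
import contextvars
import uuid

# Holds the current request's ID without threading it through every call.
correlation_id = contextvars.ContextVar("correlation_id", default=None)


def log(message):
    """Structured log line tagged with the current correlation ID."""
    return f"[{correlation_id.get()}] {message}"


def handle_request(headers):
    """Reuse an incoming X-Correlation-ID, or mint one at the edge."""
    cid = headers.get("X-Correlation-ID") or str(uuid.uuid4())
    correlation_id.set(cid)
    lines = [log("received request")]
    # Forward the SAME ID downstream so services B and C log it too.
    downstream = {"X-Correlation-ID": cid}
    lines.append(log("calling downstream with " + downstream["X-Correlation-ID"]))
    return lines
```

Searching logs for one ID then reconstructs the request's path across every service it touched.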


Resources


Previous: 15 - Scaling Strategies | Next: 17 - Rate Limiting & Throttling