10 - Kubernetes Architecture
What Is Kubernetes?
Kubernetes (K8s) is an open-source container orchestration platform. It automates:
- Deployment: rolling out containers across machines
- Scaling: adjusting replicas based on load
- Self-healing: restarting failed containers, rescheduling
- Service discovery: finding services by name
- Load balancing: distributing traffic across replicas
- Storage orchestration: attaching persistent storage
- Configuration management: managing secrets and config
Origin: Developed by Google (drawing on its internal Borg and Omega systems), open-sourced in 2014, and later donated to the CNCF.
High-Level Architecture
┌─── Kubernetes Cluster ─────────────────────────────────────────┐
│ │
│ ┌─── Control Plane (Master) ───────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────┐ ┌─────────────────┐ ┌──────────────┐ │ │
│ │ │ API │ │ Controller │ │ Scheduler │ │ │
│ │ │ Server │ │ Manager │ │ │ │ │
│ │ └────┬─────┘ └─────────────────┘ └──────────────┘ │ │
│ │ │ │ │
│ │ ┌────┴─────┐ ┌─────────────────┐ │ │
│ │ │ etcd │ │ Cloud │ │ │
│ │ │ (store) │ │ Controller Mgr │ │ │
│ │ └──────────┘ └─────────────────┘ │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ kubectl / API │
│ │ │
│ ┌─── Worker Node 1 ──────────┐ ┌─── Worker Node 2 ────────┐ │
│ │ │ │ │ │
│ │ ┌────────┐ ┌────────┐ │ │ ┌────────┐ ┌────────┐ │ │
│ │ │ Pod A │ │ Pod B │ │ │ │ Pod C │ │ Pod D │ │ │
│ │ └────────┘ └────────┘ │ │ └────────┘ └────────┘ │ │
│ │ │ │ │ │
│ │ ┌────────────────────┐ │ │ ┌────────────────────┐ │ │
│ │ │ kubelet │ │ │ │ kubelet │ │ │
│ │ ├────────────────────┤ │ │ ├────────────────────┤ │ │
│ │ │ kube-proxy │ │ │ │ kube-proxy │ │ │
│ │ ├────────────────────┤ │ │ ├────────────────────┤ │ │
│ │ │ Container Runtime │ │ │ │ Container Runtime │ │ │
│ │ └────────────────────┘ │ │ └────────────────────┘ │ │
│ └────────────────────────────┘ └──────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
Control Plane Components
1. API Server (kube-apiserver)
The front door to Kubernetes. Everything goes through it.
- RESTful API -- all communication passes here
- Authentication, authorization, admission control
- The only component that talks to etcd
- Stateless (can run multiple replicas)
```bash
# Every kubectl command talks to the API server
kubectl get pods
# → GET https://api-server:6443/api/v1/namespaces/default/pods

# You can also use curl directly
curl -k https://api-server:6443/api/v1/pods \
  -H "Authorization: Bearer $TOKEN"
```
2. etcd
The brain of the cluster. A distributed key-value store.
- Stores ALL cluster state (desired + actual)
- Strongly consistent (Raft consensus)
- Only the API server reads/writes to etcd
- Must be backed up regularly
```bash
# What's stored in etcd:
#   /registry/pods/default/my-pod
#   /registry/services/default/my-service
#   /registry/deployments/default/my-app
#   /registry/secrets/default/my-secret

# Back up and restore etcd
etcdctl snapshot save /backup/etcd-snapshot.db
etcdctl snapshot restore /backup/etcd-snapshot.db
```
3. Controller Manager (kube-controller-manager)
Runs control loops that watch cluster state and make changes:
| Controller | What It Does |
|---|---|
| ReplicaSet Controller | Ensures correct number of pod replicas |
| Deployment Controller | Manages rollouts and rollbacks |
| Node Controller | Detects and responds to node failures |
| Job Controller | Creates pods for one-time tasks |
| Service Account Controller | Creates default service accounts |
| Namespace Controller | Manages namespace lifecycle |
| Endpoint Controller | Populates Endpoints objects |
Each controller follows the reconciliation loop:
Observe (current state) → Compare (desired state) → Act (make changes)
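The reconciliation loop can be sketched in a few lines. This is an illustrative toy, not the real controller-manager code (which uses watches and work queues via client-go); the function name and the action tuples are made up for the example.

```python
# Toy reconciliation step for a ReplicaSet-style controller:
# compare desired vs. actual state and return the actions needed.
def reconcile(desired_replicas, running_pods):
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Scale up: create the missing pods
        return [("create_pod", None)] * diff
    if diff < 0:
        # Scale down: delete the surplus pods
        return [("delete_pod", pod) for pod in running_pods[:-diff]]
    # Already converged: nothing to do
    return []

# Observe → Compare → Act
actions = reconcile(desired_replicas=3, running_pods=["pod-a"])
# → two create actions, bringing the actual state to 3 replicas
```

Real controllers run this loop continuously, so drift (a crashed pod, a deleted node) is corrected on the next pass.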
4. Scheduler (kube-scheduler)
Decides which node a pod runs on:
New Pod (unscheduled)
↓
┌─── Filtering ───────┐
│ Has enough CPU? │
│ Has enough RAM? │
│ Node selectors? │
│ Taints/tolerations? │
│ Affinity rules? │
└────────┬────────────┘
↓
Feasible Nodes
↓
┌─── Scoring ──────┐
│ Least resources? │
│ Spread pods? │
│ Affinity score? │
└────────┬─────────┘
↓
Best Node Selected
↓
Pod bound to Node
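The two phases above can be sketched as a filter followed by a score. This is a simplified illustration: the node data and the scoring formula are invented, and the real kube-scheduler uses pluggable filter/score plugins with many more criteria.

```python
# Toy scheduler: filter out infeasible nodes, then score the rest.
def schedule(pod, nodes):
    # Filtering: keep only nodes with enough free CPU and memory
    feasible = [n for n in nodes
                if n["free_cpu"] >= pod["cpu"] and n["free_mem"] >= pod["mem"]]
    if not feasible:
        return None  # no feasible node → pod stays Pending
    # Scoring: prefer the node with the most resources left over
    best = max(feasible,
               key=lambda n: (n["free_cpu"] - pod["cpu"]) +
                             (n["free_mem"] - pod["mem"]))
    return best["name"]

nodes = [
    {"name": "node-1", "free_cpu": 2, "free_mem": 4},
    {"name": "node-2", "free_cpu": 8, "free_mem": 16},
]
best = schedule({"cpu": 1, "mem": 2}, nodes)  # → "node-2"
```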
5. Cloud Controller Manager
Integrates with cloud providers (AWS, GCP, Azure):
- Creates load balancers for LoadBalancer services
- Manages cloud-specific node lifecycle
- Manages cloud routes and volumes
Worker Node Components
1. kubelet
The agent running on every node:
- Registers the node with the API server
- Watches for pod assignments
- Starts/stops containers via container runtime
- Reports pod and node status back to API server
- Runs liveness and readiness probes
```bash
# kubelet is a systemd service on each node
systemctl status kubelet

# It talks to the container runtime via CRI (Container Runtime Interface):
# kubelet → CRI → containerd → runc → container
```
2. kube-proxy
Manages network rules on each node:
- Implements Services (ClusterIP, NodePort, LoadBalancer)
- Routes traffic to the correct pod
- Modes: iptables (default), IPVS (better for large clusters), nftables
```bash
# iptables mode: creates NAT rules
iptables -t nat -L | grep my-service

# IPVS mode: virtual servers for load balancing
ipvsadm -Ln
```
3. Container Runtime
The software that runs containers:
- containerd (most common, default in modern K8s)
- CRI-O (lightweight, K8s-native)
- Docker support (via dockershim) was removed in K8s 1.24, but Docker-built images still work, since they are standard OCI images
How It All Works Together
Example: You deploy an app
```bash
kubectl apply -f deployment.yaml
```
Step 1: kubectl → API Server
"Create a Deployment with 3 replicas"
Step 2: API Server → etcd
Stores the Deployment object
Step 3: Deployment Controller notices new Deployment
Creates a ReplicaSet
Step 4: ReplicaSet Controller notices ReplicaSet needs 3 pods
Creates 3 Pod objects (unscheduled)
Step 5: Scheduler notices 3 unscheduled pods
Assigns each to a node (based on resources, constraints)
Step 6: kubelet on each node notices assigned pods
Pulls images, creates containers via containerd
Step 7: kube-proxy on each node
Updates iptables rules for the Service
Step 8: Pods are running!
Controllers continuously reconcile desired vs actual state
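The cascade above can be traced with a toy in-memory model. Everything here is hypothetical (a dict standing in for etcd, hard-coded key prefixes, a single node); in reality every component talks only to the API server via watches, never to each other.

```python
# Toy model of the deploy cascade: Deployment → ReplicaSet → Pods → binding.
etcd = {}  # stand-in for the real etcd key-value store

def api_server_apply(kind, name, spec):
    # Steps 1-2: API server persists the object in etcd
    etcd[f"/registry/{kind}/{name}"] = {"spec": spec}

def deployment_controller():
    # Step 3: each Deployment gets a ReplicaSet
    for key, obj in list(etcd.items()):
        if key.startswith("/registry/deployments/"):
            api_server_apply("replicasets", key.split("/")[-1] + "-rs", obj["spec"])

def replicaset_controller():
    # Step 4: each ReplicaSet gets its pods, initially unscheduled
    for key, obj in list(etcd.items()):
        if key.startswith("/registry/replicasets/"):
            for i in range(obj["spec"]["replicas"]):
                pod = key.split("/")[-1] + f"-pod-{i}"
                etcd[f"/registry/pods/{pod}"] = {"spec": obj["spec"], "node": None}

def scheduler():
    # Step 5: bind every unscheduled pod to a node
    for obj in etcd.values():
        if "node" in obj and obj["node"] is None:
            obj["node"] = "node-1"

api_server_apply("deployments", "my-app", {"replicas": 3})
deployment_controller()
replicaset_controller()
scheduler()
pods = [k for k in etcd if k.startswith("/registry/pods/")]
# → 3 pod keys, each bound to node-1
```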
Kubernetes API and Objects
API Structure
/api/v1/ # Core API (pods, services, etc.)
/apis/apps/v1/ # Apps API (deployments, statefulsets)
/apis/batch/v1/ # Batch API (jobs, cronjobs)
/apis/networking.k8s.io/v1/ # Networking (ingress, network policies)
/apis/rbac.authorization.k8s.io/ # RBAC
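A small sketch of how these paths are assembled, assuming the simplified pattern shown above (the core group uses `/api`, named groups use `/apis`; cluster-scoped resources are ignored here):

```python
# Build a Kubernetes REST path from (group, version, resource, namespace).
def api_path(group, version, resource, namespace=None):
    # The core ("") group lives under /api; all named groups under /apis
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    if namespace:
        return f"{prefix}/namespaces/{namespace}/{resource}"
    return f"{prefix}/{resource}"

api_path("", "v1", "pods", "default")
# → /api/v1/namespaces/default/pods
api_path("apps", "v1", "deployments", "default")
# → /apis/apps/v1/namespaces/default/deployments
```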
Object Structure (Every K8s Resource)
```yaml
apiVersion: apps/v1            # API group/version
kind: Deployment               # Resource type
metadata:                      # Identity
  name: my-app
  namespace: default
  labels:
    app: my-app
  annotations:
    description: "My application"
spec:                          # Desired state (you define)
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: myapp:v1
status:                        # Actual state (K8s fills in)
  readyReplicas: 3
  availableReplicas: 3
```
Declarative vs Imperative
```bash
# Imperative (tell K8s what to do, step by step)
kubectl create deployment my-app --image=myapp:v1
kubectl scale deployment my-app --replicas=3
kubectl expose deployment my-app --port=80

# Declarative (tell K8s the desired end state)
kubectl apply -f deployment.yaml
# K8s figures out what changes are needed
```
Always prefer declarative -- it's idempotent, version-controllable, and reviewable.
Running Kubernetes Locally
| Tool | Description | Best For |
|---|---|---|
| minikube | Full K8s in a VM or container | Learning, testing |
| kind | K8s in Docker containers | CI/CD, fast iterations |
| k3d | k3s (lightweight K8s) in Docker | Resource-constrained envs |
| Docker Desktop | Built-in K8s option | macOS/Windows dev |
| microk8s | Snap-based lightweight K8s | Ubuntu, IoT |
```bash
# minikube
minikube start
minikube dashboard

# kind (Kubernetes in Docker)
kind create cluster --name dev
kind delete cluster --name dev

# k3d
k3d cluster create mycluster
k3d cluster delete mycluster
```
kubectl Essentials
```bash
# Cluster info
kubectl cluster-info
kubectl get nodes

# Context management (multiple clusters)
kubectl config get-contexts
kubectl config use-context production
kubectl config set-context --current --namespace=my-namespace

# Resource exploration
kubectl api-resources                 # All resource types
kubectl explain pod                   # Documentation for a resource
kubectl explain pod.spec.containers   # Drill into fields
```
FAANG Interview Angle
Common questions:
- "Explain the Kubernetes architecture and control plane components"
- "What happens when you run kubectl apply -f deployment.yaml?"
- "What is etcd and why is it critical?"
- "How does the scheduler decide where to place a pod?"
- "What's the difference between imperative and declarative management?"
Key answers:
- Control plane (API server, etcd, controller manager, scheduler) + worker nodes (kubelet, kube-proxy, runtime)
- API server → etcd → controllers create pods → scheduler assigns nodes → kubelet runs containers
- etcd is the single source of truth; losing it = losing cluster state
- Scheduler: filter (feasible nodes) → score (best node) → bind
- Declarative is idempotent, auditable, GitOps-friendly