10 - Kubernetes Architecture
What Is Kubernetes?
Kubernetes (K8s) is an open-source container orchestration platform. It automates:
- Deployment: rolling out containers across machines
- Scaling: adjusting replicas based on load
- Self-healing: restarting failed containers, rescheduling
- Service discovery: finding services by name
- Load balancing: distributing traffic across replicas
- Storage orchestration: attaching persistent storage
- Configuration management: managing secrets and config
Origin: Developed by Google (drawing on its internal Borg and Omega systems), open-sourced in 2014, and later donated to the CNCF.
High-Level Architecture
┌─── Kubernetes Cluster ─────────────────────────────────────────┐
│ │
│ ┌─── Control Plane (Master) ───────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────┐ ┌─────────────────┐ ┌──────────────┐ │ │
│ │ │ API │ │ Controller │ │ Scheduler │ │ │
│ │ │ Server │ │ Manager │ │ │ │ │
│ │ └────┬─────┘ └─────────────────┘ └──────────────┘ │ │
│ │ │ │ │
│ │ ┌────┴─────┐ ┌─────────────────┐ │ │
│ │ │ etcd │ │ Cloud │ │ │
│ │ │ (store) │ │ Controller Mgr │ │ │
│ │ └──────────┘ └─────────────────┘ │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ kubectl / API │
│ │ │
│ ┌─── Worker Node 1 ──────────┐ ┌─── Worker Node 2 ────────┐ │
│ │ │ │ │ │
│ │ ┌────────┐ ┌────────┐ │ │ ┌────────┐ ┌────────┐ │ │
│ │ │ Pod A │ │ Pod B │ │ │ │ Pod C │ │ Pod D │ │ │
│ │ └────────┘ └────────┘ │ │ └────────┘ └────────┘ │ │
│ │ │ │ │ │
│ │ ┌────────────────────┐ │ │ ┌────────────────────┐ │ │
│ │ │ kubelet │ │ │ │ kubelet │ │ │
│ │ ├────────────────────┤ │ │ ├────────────────────┤ │ │
│ │ │ kube-proxy │ │ │ │ kube-proxy │ │ │
│ │ ├────────────────────┤ │ │ ├────────────────────┤ │ │
│ │ │ Container Runtime │ │ │ │ Container Runtime │ │ │
│ │ └────────────────────┘ │ │ └────────────────────┘ │ │
│ └────────────────────────────┘ └──────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
Control Plane Components
1. API Server (kube-apiserver)
The front door to Kubernetes. Everything goes through it.
- RESTful API -- all communication passes here
- Authentication, authorization, admission control
- The only component that talks to etcd
- Stateless (can run multiple replicas)
```bash
# Every kubectl command talks to the API server
kubectl get pods
# → GET https://api-server:6443/api/v1/namespaces/default/pods

# You can also use curl directly
curl -k https://api-server:6443/api/v1/pods \
  -H "Authorization: Bearer $TOKEN"
```
2. etcd
The brain of the cluster. A distributed key-value store.
- Stores ALL cluster state (desired + actual)
- Strongly consistent (Raft consensus)
- Only the API server reads/writes to etcd
- Must be backed up regularly
```bash
# What's stored in etcd:
#   /registry/pods/default/my-pod
#   /registry/services/default/my-service
#   /registry/deployments/default/my-app
#   /registry/secrets/default/my-secret

# Back up and restore etcd
etcdctl snapshot save /backup/etcd-snapshot.db
etcdctl snapshot restore /backup/etcd-snapshot.db
```
3. Controller Manager (kube-controller-manager)
Runs control loops that watch cluster state and make changes:
| Controller | What It Does |
|---|---|
| ReplicaSet Controller | Ensures correct number of pod replicas |
| Deployment Controller | Manages rollouts and rollbacks |
| Node Controller | Detects and responds to node failures |
| Job Controller | Creates pods for one-time tasks |
| Service Account Controller | Creates default service accounts |
| Namespace Controller | Manages namespace lifecycle |
| Endpoint Controller | Populates Endpoints objects |
Each controller follows the reconciliation loop:
Observe (current state) → Compare (desired state) → Act (make changes)
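The reconciliation loop can be sketched in a few lines. This is an illustrative toy, not the real controller-manager code (which uses watches and work queues via client-go); the function name and the action tuples are made up for the example.

```python
# Toy reconciliation step for a ReplicaSet-style controller:
# compare desired vs. actual state and return the actions needed.
def reconcile(desired_replicas, running_pods):
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Scale up: create the missing pods
        return [("create_pod", None)] * diff
    if diff < 0:
        # Scale down: delete the surplus pods
        return [("delete_pod", pod) for pod in running_pods[:-diff]]
    # Already converged: nothing to do
    return []

# Observe → Compare → Act
actions = reconcile(desired_replicas=3, running_pods=["pod-a"])
# → two create actions, bringing the actual state to 3 replicas
```

Real controllers run this loop continuously, so drift (a crashed pod, a deleted node) is corrected on the next pass.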
4. Scheduler (kube-scheduler)
Decides which node a pod runs on:
New Pod (unscheduled)
↓
┌─── Filtering ───────┐
│ Has enough CPU? │
│ Has enough RAM? │
│ Node selectors? │
│ Taints/tolerations? │
│ Affinity rules? │
└────────┬────────────┘
↓
Feasible Nodes
↓
┌─── Scoring ──────┐
│ Least resources? │
│ Spread pods? │
│ Affinity score? │
└────────┬─────────┘
↓
Best Node Selected
↓
Pod bound to Node
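The two phases above can be sketched as a filter followed by a score. This is a simplified illustration: the node data and the scoring formula are invented, and the real kube-scheduler uses pluggable filter/score plugins with many more criteria.

```python
# Toy scheduler: filter out infeasible nodes, then score the rest.
def schedule(pod, nodes):
    # Filtering: keep only nodes with enough free CPU and memory
    feasible = [n for n in nodes
                if n["free_cpu"] >= pod["cpu"] and n["free_mem"] >= pod["mem"]]
    if not feasible:
        return None  # no feasible node → pod stays Pending
    # Scoring: prefer the node with the most resources left over
    best = max(feasible,
               key=lambda n: (n["free_cpu"] - pod["cpu"]) +
                             (n["free_mem"] - pod["mem"]))
    return best["name"]

nodes = [
    {"name": "node-1", "free_cpu": 2, "free_mem": 4},
    {"name": "node-2", "free_cpu": 8, "free_mem": 16},
]
best = schedule({"cpu": 1, "mem": 2}, nodes)  # → "node-2"
```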
5. Cloud Controller Manager
Integrates with cloud providers (AWS, GCP, Azure):
- Creates load balancers for LoadBalancer services
- Manages cloud-specific node lifecycle
- Manages cloud routes and volumes
Worker Node Components
1. kubelet
The agent running on every node:
- Registers the node with the API server
- Watches for pod assignments
- Starts/stops containers via container runtime
- Reports pod and node status back to API server
- Runs liveness and readiness probes
```bash
# kubelet is a systemd service on each node
systemctl status kubelet

# It talks to the container runtime via CRI (Container Runtime Interface):
# kubelet → CRI → containerd → runc → container
```
2. kube-proxy
Manages network rules on each node:
- Implements Services (ClusterIP, NodePort, LoadBalancer)
- Routes traffic to the correct pod
- Modes: iptables (default), IPVS (better for large clusters), nftables
```bash
# iptables mode: creates NAT rules
iptables -t nat -L | grep my-service

# IPVS mode: virtual servers for load balancing
ipvsadm -Ln
```
3. Container Runtime
The software that runs containers:
- containerd (most common, default in modern K8s)
- CRI-O (lightweight, K8s-native)
- Docker support (via dockershim) was removed in K8s 1.24, but Docker-built images still work, since they are standard OCI images
How It All Works Together
Example: You deploy an app
```bash
kubectl apply -f deployment.yaml
```
Step 1: kubectl → API Server
"Create a Deployment with 3 replicas"
Step 2: API Server → etcd
Stores the Deployment object
Step 3: Deployment Controller notices new Deployment
Creates a ReplicaSet
Step 4: ReplicaSet Controller notices ReplicaSet needs 3 pods
Creates 3 Pod objects (unscheduled)
Step 5: Scheduler notices 3 unscheduled pods
Assigns each to a node (based on resources, constraints)
Step 6: kubelet on each node notices assigned pods
Pulls images, creates containers via containerd
Step 7: kube-proxy on each node
Updates iptables rules for the Service
Step 8: Pods are running!
Controllers continuously reconcile desired vs actual state
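The cascade above can be traced with a toy in-memory model. Everything here is hypothetical (a dict standing in for etcd, hard-coded key prefixes, a single node); in reality every component talks only to the API server via watches, never to each other.

```python
# Toy model of the deploy cascade: Deployment → ReplicaSet → Pods → binding.
etcd = {}  # stand-in for the real etcd key-value store

def api_server_apply(kind, name, spec):
    # Steps 1-2: API server persists the object in etcd
    etcd[f"/registry/{kind}/{name}"] = {"spec": spec}

def deployment_controller():
    # Step 3: each Deployment gets a ReplicaSet
    for key, obj in list(etcd.items()):
        if key.startswith("/registry/deployments/"):
            api_server_apply("replicasets", key.split("/")[-1] + "-rs", obj["spec"])

def replicaset_controller():
    # Step 4: each ReplicaSet gets its pods, initially unscheduled
    for key, obj in list(etcd.items()):
        if key.startswith("/registry/replicasets/"):
            for i in range(obj["spec"]["replicas"]):
                pod = key.split("/")[-1] + f"-pod-{i}"
                etcd[f"/registry/pods/{pod}"] = {"spec": obj["spec"], "node": None}

def scheduler():
    # Step 5: bind every unscheduled pod to a node
    for obj in etcd.values():
        if "node" in obj and obj["node"] is None:
            obj["node"] = "node-1"

api_server_apply("deployments", "my-app", {"replicas": 3})
deployment_controller()
replicaset_controller()
scheduler()
pods = [k for k in etcd if k.startswith("/registry/pods/")]
# → 3 pod keys, each bound to node-1
```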
Kubernetes API and Objects
API Structure
/api/v1/ # Core API (pods, services, etc.)
/apis/apps/v1/ # Apps API (deployments, statefulsets)
/apis/batch/v1/ # Batch API (jobs, cronjobs)
/apis/networking.k8s.io/v1/ # Networking (ingress, network policies)
/apis/rbac.authorization.k8s.io/ # RBAC
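A small sketch of how these paths are assembled, assuming the simplified pattern shown above (the core group uses `/api`, named groups use `/apis`; cluster-scoped resources are ignored here):

```python
# Build a Kubernetes REST path from (group, version, resource, namespace).
def api_path(group, version, resource, namespace=None):
    # The core ("") group lives under /api; all named groups under /apis
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    if namespace:
        return f"{prefix}/namespaces/{namespace}/{resource}"
    return f"{prefix}/{resource}"

api_path("", "v1", "pods", "default")
# → /api/v1/namespaces/default/pods
api_path("apps", "v1", "deployments", "default")
# → /apis/apps/v1/namespaces/default/deployments
```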
Object Structure (Every K8s Resource)
```yaml
apiVersion: apps/v1            # API group/version
kind: Deployment               # Resource type
metadata:                      # Identity
  name: my-app
  namespace: default
  labels:
    app: my-app
  annotations:
    description: "My application"
spec:                          # Desired state (you define)
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: myapp:v1
status:                        # Actual state (K8s fills in)
  readyReplicas: 3
  availableReplicas: 3
```
Declarative vs Imperative
```bash
# Imperative (tell K8s what to do, step by step)
kubectl create deployment my-app --image=myapp:v1
kubectl scale deployment my-app --replicas=3
kubectl expose deployment my-app --port=80

# Declarative (tell K8s the desired end state)
kubectl apply -f deployment.yaml
# K8s figures out what changes are needed
```
Always prefer declarative -- it's idempotent, version-controllable, and reviewable.
Running Kubernetes Locally
| Tool | Description | Best For |
|---|---|---|
| minikube | Full K8s in a VM or container | Learning, testing |
| kind | K8s in Docker containers | CI/CD, fast iterations |
| k3d | k3s (lightweight K8s) in Docker | Resource-constrained envs |
| Docker Desktop | Built-in K8s option | macOS/Windows dev |
| microk8s | Snap-based lightweight K8s | Ubuntu, IoT |
```bash
# minikube
minikube start
minikube dashboard

# kind (Kubernetes in Docker)
kind create cluster --name dev
kind delete cluster --name dev

# k3d
k3d cluster create mycluster
k3d cluster delete mycluster
```
kubectl Essentials
```bash
# Cluster info
kubectl cluster-info
kubectl get nodes

# Context management (multiple clusters)
kubectl config get-contexts
kubectl config use-context production
kubectl config set-context --current --namespace=my-namespace

# Resource exploration
kubectl api-resources                 # All resource types
kubectl explain pod                   # Documentation for a resource
kubectl explain pod.spec.containers   # Drill into fields
```
FAANG Interview Angle
Common questions:
- "Explain the Kubernetes architecture and control plane components"
- "What happens when you run kubectl apply -f deployment.yaml?"
- "What is etcd and why is it critical?"
- "How does the scheduler decide where to place a pod?"
- "What's the difference between imperative and declarative management?"
Key answers:
- Control plane (API server, etcd, controller manager, scheduler) + worker nodes (kubelet, kube-proxy, runtime)
- API server → etcd → controllers create pods → scheduler assigns nodes → kubelet runs containers
- etcd is the single source of truth; losing it = losing cluster state
- Scheduler: filter (feasible nodes) → score (best node) → bind
- Declarative is idempotent, auditable, GitOps-friendly