11 - Pods & Workloads

What Is a Pod?

A Pod is the smallest deployable unit in Kubernetes. It's a wrapper around one or more containers that:

Share the same network namespace (same IP, same localhost)
Share the same IPC namespace
Can share volumes
Are scheduled together on the same node
Have a shared lifecycle

┌─── Pod (IP: 10.244.1.5) ───────────────────┐
│                                            │
│  ┌─────────────┐    ┌─────────────┐        │
│  │ Container 1 │    │ Container 2 │        │
│  │ (main app)  │◄──►│ (sidecar)   │        │
│  │ :8080       │    │ :9090       │        │
│  └──────┬──────┘    └──────┬──────┘        │
│         │                  │               │
│         └──── localhost ───┘               │
│                                            │
│  ┌─────────────────────────────────┐       │
│  │ Shared Volume                   │       │
│  └─────────────────────────────────┘       │
└────────────────────────────────────────────┘
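
The shared network namespace in the diagram can be seen with a minimal two-container Pod. This is a sketch, not a production manifest: the Pod name `shared-net-demo` and the `checker` container are hypothetical, and `myapp:v1` is assumed to listen on port 8080 and serve `/health`, as in the manifest in the next section.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-net-demo          # hypothetical name
spec:
  containers:
    - name: app
      image: myapp:v1            # assumed to listen on :8080
      ports:
        - containerPort: 8080

    - name: checker              # hypothetical helper container
      image: busybox:1.36
      # Both containers share one loopback interface, so the helper
      # reaches the app at localhost without a Service or the Pod IP.
      command: ['sh', '-c', 'while true; do wget -qO- http://localhost:8080/health; sleep 5; done']
```

Because the two containers share one port space, they cannot both bind the same port.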

Pod Manifest

yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: default
  labels:
    app: my-app
    version: v1
  annotations:
    description: "Main application pod"
spec:
  # --- Init Containers (run before main containers, sequentially) ---
  initContainers:
    - name: init-db
      image: busybox:1.36
      command: ['sh', '-c', 'until nc -z db-service 5432; do echo waiting for db; sleep 2; done']
    
    - name: init-migrations
      image: myapp:v1
      command: ['./migrate', 'up']

  # --- Main Containers ---
  containers:
    - name: app
      image: myapp:v1
      ports:
        - containerPort: 8080
          name: http
          protocol: TCP
      
      # --- Environment ---
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
        - name: NODE_ENV
          value: "production"
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name  # Downward API
        - name: CPU_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: app
              resource: limits.cpu
      
      envFrom:
        - configMapRef:
            name: app-config
        - secretRef:
            name: app-secrets
      
      # --- Resources ---
      resources:
        requests:           # Minimum guaranteed
          cpu: "250m"       # 0.25 CPU cores
          memory: "256Mi"   # 256 MiB
        limits:             # Maximum allowed
          cpu: "1"          # 1 CPU core
          memory: "512Mi"   # OOM killed if exceeded
      
      # --- Probes ---
      startupProbe:
        httpGet:
          path: /health
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
        # App has 300s to start before being killed
      
      livenessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 0
        periodSeconds: 10
        timeoutSeconds: 3
        failureThreshold: 3
        # If 3 consecutive failures: restart container
      
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
        timeoutSeconds: 2
        failureThreshold: 3
        # If failing: remove from Service endpoints (no traffic)
      
      # --- Volume Mounts ---
      volumeMounts:
        - name: data
          mountPath: /app/data
        - name: config
          mountPath: /etc/app/config
          readOnly: true
        - name: tmp
          mountPath: /tmp
      
      # --- Security ---
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
    
    # --- Sidecar Container ---
    - name: log-shipper
      image: fluent/fluent-bit:2.2
      volumeMounts:
        - name: data
          mountPath: /app/data
          readOnly: true

  # --- Volumes ---
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data-pvc
    - name: config
      configMap:
        name: app-config
    - name: tmp
      emptyDir: {}

  # --- Pod-level Settings ---
  restartPolicy: Always        # Always | OnFailure | Never
  terminationGracePeriodSeconds: 30
  serviceAccountName: my-app-sa
  
  # --- Scheduling ---
  nodeSelector:
    disktype: ssd
  
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "high-memory"
      effect: "NoSchedule"
  
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values: ["my-app"]
            topologyKey: kubernetes.io/hostname

Probes Deep Dive

Three Types of Probes

Probe	Purpose	On Failure
Startup	Has the app finished starting?	Kill and restart the container
Liveness	Is the app alive and healthy?	Kill and restart the container
Readiness	Can the app serve traffic?	Remove the Pod from Service endpoints (no restart)

Probe Methods

yaml
# HTTP GET (most common for web apps)
livenessProbe:
  httpGet:
    path: /health
    port: 8080
    httpHeaders:
      - name: Accept
        value: application/json

# TCP Socket (for non-HTTP services)
livenessProbe:
  tcpSocket:
    port: 5432

# Exec Command (custom check)
livenessProbe:
  exec:
    command:
      - cat
      - /tmp/healthy

# gRPC (for gRPC services)
livenessProbe:
  grpc:
    port: 50051

Probe Timeline

Container Start
     │
     ▼
┌─ Startup Probe ─────────────────┐
│  Runs until success or timeout  │
│  (other probes are disabled)    │
└──────────────┬──────────────────┘
               │ success
               ▼
┌─ Liveness Probe (periodic) ─────┐
│  Is the container healthy?      │
│  Failure → restart container    │
└─────────────────────────────────┘
               +
┌─ Readiness Probe (periodic) ────┐
│  Can it handle traffic?         │
│  Failure → remove from Service  │
└─────────────────────────────────┘
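
The timing behind this timeline follows from `periodSeconds` and `failureThreshold`. The fragment below reuses the probe values from the Pod manifest above and annotates the arithmetic. The figures are approximate, since they ignore `timeoutSeconds` and the exact moment each probe fires.

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # startup budget: 30 x 10s = 300s

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
  failureThreshold: 3    # restart after about 3 x 10s = 30s of consecutive failures

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
  failureThreshold: 3    # out of endpoints after about 3 x 5s = 15s of failures
  successThreshold: 1    # default: one success puts the Pod back in rotation
```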

Resource Requests and Limits

yaml
resources:
  requests:         # Scheduler uses this to find a node
    cpu: "250m"     # 250 millicores = 0.25 CPU
    memory: "256Mi" # 256 Mebibytes
  limits:           # Maximum the container can use
    cpu: "1"        # 1 full CPU core
    memory: "512Mi" # OOM killed if exceeded

Aspect	CPU	Memory
Notation	cores or millicores (m)	Mi, Gi (binary) or M, G (decimal)
Example	500m = 0.5 CPU	256Mi ≈ 268 MB
When the limit is exceeded	Throttled	OOM killed

QoS Classes (determined by requests/limits):

Class	Condition	Eviction Priority
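Guaranteed	Every container sets CPU and memory requests and limits, with requests == limits	Last to be evicted
Burstable	At least one container sets a CPU or memory request or limit, but the Pod is not Guaranteed	Middle
BestEffort	No container sets any requests or limits	First to be evicted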
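
As a sketch of how each class arises, here are three `resources` stanzas for a single-container Pod. The specific values are illustrative assumptions; only the relationship between requests and limits matters.

```yaml
# Guaranteed: CPU and memory are both set, and requests equal limits
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "256Mi"
---
# Burstable: something is set, but requests and limits differ
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    memory: "512Mi"
---
# BestEffort: nothing is set
resources: {}
```

In a multi-container Pod, the Guaranteed condition has to hold for every container. One container without limits makes the whole Pod Burstable.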

Workload Types

1. Deployment (Stateless Apps)

The most common workload type. A Deployment manages ReplicaSets and performs rolling updates.

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: app
          image: myapp:v2
          ports:
            - containerPort: 8080
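
Rolling updates are controlled by the Deployment's `strategy` field. The fragment below is a minimal sketch that would merge into the `web-app` spec above, assuming the goal is to keep full capacity during a rollout. The values for `maxSurge`, `maxUnavailable`, and `minReadySeconds` are illustrative assumptions, not recommendations from these notes.

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate        # the default strategy type
    rollingUpdate:
      maxSurge: 1              # at most 1 extra Pod above replicas during the rollout
      maxUnavailable: 0        # never drop below 3 available Pods
  minReadySeconds: 10          # a new Pod must stay ready for 10s to count as available
```

Changing the Pod template (for example the image tag) creates a new ReplicaSet and scales it up while the old one scales down. `kubectl rollout status deployment/web-app` follows the progress, and `kubectl rollout undo deployment/web-app` returns to the previous revision.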

2. ReplicaSet

Ensures that N identical Pods are running. Deployments create and manage ReplicaSets for you, so you rarely create one directly.
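
For reference, a standalone ReplicaSet looks almost identical to a Deployment. This is a sketch; the name `web-app-rs` is hypothetical, and the labels and image mirror the Deployment example above.

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-app-rs             # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app             # must match the template labels below
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: app
          image: myapp:v2
```

A ReplicaSet has no update strategy of its own: changing the template affects only Pods created afterwards, which is the gap a Deployment fills.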

3. Job (One-Time Tasks)

yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: data-migration
spec:
  backoffLimit: 3        # Retry up to 3 times
  activeDeadlineSeconds: 600  # Timeout after 10 minutes
  template:
    spec:
      containers:
        - name: migrate
          image: myapp:v1
          command: ["./migrate", "up"]
      restartPolicy: OnFailure

4. CronJob (Scheduled Tasks)

yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup
spec:
  schedule: "0 2 * * *"     # 02:00 daily (controller's time zone unless spec.timeZone is set)
  concurrencyPolicy: Forbid  # Don't overlap
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: backup-tool:v1
              command: ["./backup.sh"]
          restartPolicy: OnFailure

5. DaemonSet (One Pod Per Node)

See 16 - StatefulSets & DaemonSets

6. StatefulSet (Stateful Apps)

See 16 - StatefulSets & DaemonSets

Multi-Container Patterns

Sidecar Pattern

A helper container that extends the main container:

yaml
spec:
  containers:
    - name: app
      image: myapp:v1
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    
    - name: log-shipper     # Sidecar
      image: fluent/fluent-bit:2.2
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
  
  volumes:
    - name: logs
      emptyDir: {}
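
On clusters with native sidecar support (the SidecarContainers feature, enabled by default from Kubernetes 1.29), the same pattern can be written as an init container with `restartPolicy: Always`. This is a sketch under that assumption, reusing the names from the example above.

```yaml
spec:
  initContainers:
    - name: log-shipper          # native sidecar
      image: fluent/fluent-bit:2.2
      restartPolicy: Always      # marks this init container as a sidecar
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true

  containers:
    - name: app
      image: myapp:v1
      volumeMounts:
        - name: logs
          mountPath: /var/log/app

  volumes:
    - name: logs
      emptyDir: {}
```

Declared this way, the sidecar starts before the main container, keeps running alongside it, and is stopped after the main container exits. That ordering matters for Jobs, where an ordinary second container would keep the Pod from completing.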

Ambassador Pattern

A proxy that handles network communication:

yaml
spec:
  containers:
    - name: app
      image: myapp:v1
      # App connects to localhost:5432
    
    - name: cloud-sql-proxy  # Ambassador
      image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2
      args:
        - "--port=5432"
        - "project:region:instance"

Adapter Pattern

Transforms output to a standard format:

yaml
spec:
  containers:
    - name: app
      image: legacy-app:v1
      # Produces custom metrics format
    
    - name: prometheus-adapter  # Adapter
      image: metrics-adapter:v1
      # Converts to Prometheus format
      ports:
        - containerPort: 9090

Pod Lifecycle

Pending → Running → Succeeded/Failed
   │         │
   │         └─ Containers running
   │
   └─ Scheduling, image pulling, init containers

Phase	Description
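Pending	Accepted by the cluster, but one or more containers are not yet running (scheduling, pulling images, running init containers)
Running	Bound to a node with all containers created; at least one is running, starting, or restarting
Succeeded	All containers exited with code 0 and will not be restarted
Failed	All containers have terminated, and at least one exited with a non-zero code or was killed by the system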
Unknown	Pod status can't be determined (node communication issue)
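
The phase is only a coarse summary. The Pod's `status` also carries conditions and per-container state, which usually explain why a Pod is not serving. The excerpt below is a hand-written illustration of those fields, not captured output, and the timestamp is a placeholder. It shows a Pod that is Running but not yet Ready, for example while its readiness probe is still failing.

```yaml
status:
  phase: Running
  conditions:
    - type: PodScheduled       # a node has been assigned
      status: "True"
    - type: Initialized        # all init containers completed
      status: "True"
    - type: ContainersReady    # all containers pass their readiness checks
      status: "False"
    - type: Ready              # Pod may be added to Service endpoints
      status: "False"
  containerStatuses:
    - name: app
      ready: false
      restartCount: 0
      state:
        running:
          startedAt: "2024-01-01T00:00:00Z"   # placeholder timestamp
```

The same information appears in the Conditions and Containers sections of `kubectl describe pod`.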

kubectl Pod Commands

bash
# Create a pod
kubectl run nginx --image=nginx:alpine

# List pods
kubectl get pods
kubectl get pods -o wide          # More details
kubectl get pods -l app=myapp     # Filter by label
kubectl get pods --all-namespaces # All namespaces

# Describe (events, conditions, details)
kubectl describe pod my-pod

# Logs
kubectl logs my-pod
kubectl logs my-pod -c sidecar    # Specific container
kubectl logs my-pod --previous    # Previous crashed container
kubectl logs -f my-pod            # Stream logs
kubectl logs -l app=myapp         # All pods with label

# Exec (use sh if the image has no bash)
kubectl exec -it my-pod -- bash
kubectl exec -it my-pod -c app -- bash  # Specific container

# Port forward (local port 8080 → pod port 80)
kubectl port-forward my-pod 8080:80

# Copy files
kubectl cp my-pod:/app/logs ./logs
kubectl cp ./file.txt my-pod:/app/

# Delete
kubectl delete pod my-pod
kubectl delete pod my-pod --grace-period=0 --force  # Immediate

FAANG Interview Angle

Common questions:

"What's a Pod and why not just run containers directly?"
"Explain the three types of probes"
"What happens when a Pod exceeds its memory limit?"
"Describe the sidecar pattern and when you'd use it"
"What are QoS classes and how do they affect eviction?"

Key answers:

Pod groups tightly coupled containers sharing network/storage; K8s schedules and manages pods, not containers
Startup (has the app finished starting → other probes wait until it passes), Liveness (is it alive → restart), Readiness (can it serve → remove from Service endpoints)
Memory limit exceeded → OOM Kill → container restarted based on restartPolicy
Sidecar: log shipping, service mesh proxy, config sync
Guaranteed (requests = limits for CPU and memory in every container, last evicted), Burstable (some requests or limits set, but not Guaranteed), BestEffort (nothing set, first evicted)