16 - StatefulSets & DaemonSets

StatefulSets

StatefulSets manage stateful applications that need:

  • Stable, unique network identities (pod-0, pod-1, pod-2)
  • Stable persistent storage (each pod gets its own PVC)
  • Ordered deployment and scaling (pod-0 before pod-1 before pod-2)
  • Ordered rolling updates (pod-2 before pod-1 before pod-0)

Deployment vs StatefulSet

Feature         Deployment                 StatefulSet
Pod names       Random (web-abc123)        Ordinal (web-0, web-1)
Pod identity    Interchangeable            Unique, stable
Storage         Shared or no persistence   Dedicated PVC per pod
Scaling order   Parallel                   Sequential (0, 1, 2...)
Update order    Any order                  Reverse order (...2, 1, 0)
Use case        Stateless apps             Databases, queues

StatefulSet Manifest

yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless  # Required: headless service name
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  # Pod template
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: pg-secret
                  key: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  # Each pod gets its own PVC (not shared)
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 50Gi
---
# Required headless service
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None  # Headless!
  selector:
    app: postgres
  ports:
    - port: 5432

How StatefulSet Pods Are Named

StatefulSet: postgres, replicas: 3

Pod names:           DNS entries:
postgres-0           postgres-0.postgres-headless.default.svc.cluster.local
postgres-1           postgres-1.postgres-headless.default.svc.cluster.local
postgres-2           postgres-2.postgres-headless.default.svc.cluster.local

PVCs (auto-created):
data-postgres-0      # Bound to postgres-0
data-postgres-1      # Bound to postgres-1
data-postgres-2      # Bound to postgres-2
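
The naming scheme above is mechanical, so it can be reproduced in a few lines of shell. This is a minimal sketch rather than part of any Kubernetes tooling: `sts_identities` is a hypothetical helper, and it assumes the default cluster domain `cluster.local`.

```shell
# Hypothetical helper: print "<pod> <dns-name> <pvc>" for each replica.
# Usage: sts_identities <statefulset> <headless-service> <namespace> <claim-template> <replicas>
# Assumes the default cluster domain, cluster.local.
sts_identities() (
  sts="$1"; svc="$2"; ns="$3"; claim="$4"; replicas="$5"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Pod:  <statefulset>-<ordinal>
    # DNS:  <pod>.<service>.<namespace>.svc.cluster.local
    # PVC:  <claim-template>-<pod>
    printf '%s %s %s\n' \
      "${sts}-${i}" \
      "${sts}-${i}.${svc}.${ns}.svc.cluster.local" \
      "${claim}-${sts}-${i}"
    i=$((i + 1))
  done
)
```

Calling `sts_identities postgres postgres-headless default data 3` prints one line per replica, matching the pod, DNS, and PVC names listed above.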

Scaling Behavior

Scale up (3 → 5):
postgres-0 ✓ (exists)
postgres-1 ✓ (exists)
postgres-2 ✓ (exists)
postgres-3 ← created (waits for postgres-2 to be ready)
postgres-4 ← created (waits for postgres-3 to be ready)

Scale down (5 → 3):
postgres-4 ← deleted first
postgres-3 ← deleted second
(PVCs are NOT deleted -- data preserved)
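
Because the PVCs outlive the pods, a scale-down leaves claims behind that must be removed by hand if the data is no longer needed. The sketch below only prints the `kubectl delete pvc` commands for review and does not run them. `orphaned_pvc_cleanup` is a hypothetical helper, and it assumes the default PVC naming `<claim-template>-<statefulset>-<ordinal>`.

```shell
# Hypothetical helper: print (do not execute) the commands that would delete
# PVCs left behind after scaling a StatefulSet down.
# Usage: orphaned_pvc_cleanup <statefulset> <claim-template> <old-replicas> <new-replicas>
orphaned_pvc_cleanup() (
  sts="$1"; claim="$2"; old="$3"; new="$4"
  # Walk from the highest old ordinal down to the first ordinal that was removed.
  i=$((old - 1))
  while [ "$i" -ge "$new" ]; do
    echo "kubectl delete pvc ${claim}-${sts}-${i}"
    i=$((i - 1))
  done
)
```

For the 5 → 3 example above, `orphaned_pvc_cleanup postgres data 5 3` lists the claims for postgres-4 and postgres-3. Deleting a PVC may destroy its data, depending on the volume's reclaim policy, so review the output before running any of it.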

Update Strategies

yaml
spec:
  updateStrategy:
    type: RollingUpdate  # Default
    rollingUpdate:
      partition: 1  # Only update pods with ordinal >= 1 (canary)

  • RollingUpdate: Updates pods one at a time in reverse ordinal order (2, 1, 0)
  • OnDelete: Updates a pod only when you manually delete it
  • partition (a RollingUpdate setting, not a separate strategy): Only pods with ordinal >= the partition value are updated (canary updates)
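
To see which pods a given partition value touches, the rule "reverse ordinal order, stopping at the partition" can be sketched in shell. `rolling_update_order` is a hypothetical helper for illustration only. It models the ordering rule and does not talk to a cluster.

```shell
# Hypothetical helper: list the pods a RollingUpdate would replace, in order.
# Usage: rolling_update_order <statefulset> <replicas> [partition]
rolling_update_order() (
  sts="$1"; replicas="$2"; partition="${3:-0}"   # partition defaults to 0 (update all)
  # Start at the highest ordinal and stop once we drop below the partition.
  i=$((replicas - 1))
  while [ "$i" -ge "$partition" ]; do
    echo "${sts}-${i}"
    i=$((i - 1))
  done
)
```

With three replicas and `partition: 1`, `rolling_update_order postgres 3 1` lists postgres-2 and then postgres-1, and postgres-0 keeps the old revision. Lowering the partition to 0 afterwards rolls the change out to the remaining pod.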

Real-World Example: MongoDB Replica Set

yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo-headless  # Assumes a headless Service with this name exists
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:7
          command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
# Initialize the replica set once, from the first pod only, after all three
# pods are Running. An init container cannot do this: it runs before mongod
# starts, and mongo-1 and mongo-2 do not exist yet while mongo-0 is starting.
#
#   kubectl exec mongo-0 -- mongosh --eval '
#     rs.initiate({
#       _id: "rs0",
#       members: [
#         {_id: 0, host: "mongo-0.mongo-headless:27017"},
#         {_id: 1, host: "mongo-1.mongo-headless:27017"},
#         {_id: 2, host: "mongo-2.mongo-headless:27017"}
#       ]
#     })
#   '
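
Stable per-pod DNS names are what let clients address every replica set member directly. As a rough illustration, the connection string for the example above can be assembled from the StatefulSet and Service names. `mongo_uri` is a hypothetical helper. It assumes the client runs in the same namespace, so the short `<pod>.<service>` form resolves, and that every member listens on port 27017.

```shell
# Hypothetical helper: build a MongoDB replica-set connection string from
# the StatefulSet's per-pod DNS names.
# Usage: mongo_uri <statefulset> <headless-service> <replicas> <replica-set-name>
mongo_uri() (
  sts="$1"; svc="$2"; replicas="$3"; rs="$4"
  hosts=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # ${hosts:+,} adds a comma only when hosts is already non-empty.
    hosts="${hosts}${hosts:+,}${sts}-${i}.${svc}:27017"
    i=$((i + 1))
  done
  echo "mongodb://${hosts}/?replicaSet=${rs}"
)
```

`mongo_uri mongo mongo-headless 3 rs0` yields a URI listing mongo-0, mongo-1, and mongo-2 with `replicaSet=rs0`, which an application could read from an environment variable.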

DaemonSets

A DaemonSet runs exactly one pod on every node (or on a subset of nodes). Common uses:

  • Log collection (Fluentd, Filebeat)
  • Monitoring agents (Prometheus Node Exporter, Datadog)
  • Network plugins (Calico, Cilium)
  • Storage daemons (Ceph, GlusterFS)

DaemonSet Manifest

yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.7.0
          # Read host metrics from the mounted paths, not the container's own /proc and /sys
          args:
            - --path.procfs=/host/proc
            - --path.sysfs=/host/sys
          ports:
            - containerPort: 9100
              hostPort: 9100
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 250m
              memory: 256Mi
          volumeMounts:
            - name: proc
              mountPath: /host/proc
              readOnly: true
            - name: sys
              mountPath: /host/sys
              readOnly: true
      # Tolerate all taints (run on ALL nodes, including control plane)
      tolerations:
        - operator: Exists
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys

Running DaemonSet on Specific Nodes

yaml
spec:
  template:
    spec:
      # Only on nodes with this label
      nodeSelector:
        disktype: ssd
      # Or use affinity for more complex rules
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values: ["linux"]
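
A `nodeSelector` matches a node only when every listed key/value pair appears in the node's labels. The sketch below is a simplified model of that rule for reasoning about which nodes a DaemonSet will land on. `node_matches` and its comma-separated `key=value` input format are illustrative assumptions. The real scheduler also evaluates affinity, taints, and tolerations, which this ignores.

```shell
# Hypothetical helper: succeed if the node's labels satisfy the nodeSelector.
# Usage: node_matches "<node labels: k=v,k=v>" "<nodeSelector: k=v,k=v>"
node_matches() (
  set -f                  # no filename globbing while splitting on commas
  labels=",$1,"           # wrap in commas so each pair can be matched whole
  IFS=','
  for pair in $2; do
    case "$labels" in
      *",$pair,"*) ;;     # this selector pair is present on the node
      *) exit 1 ;;        # key missing or value differs: node does not match
    esac
  done
  exit 0                  # every pair matched (an empty selector matches all nodes)
)
```

For example, `node_matches "disktype=ssd,kubernetes.io/os=linux" "disktype=ssd"` succeeds, while a node labelled `disktype=hdd` fails the same selector and would get no pod from that DaemonSet.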

DaemonSet Update Strategy

yaml
spec:
  updateStrategy:
    type: RollingUpdate  # or OnDelete
    rollingUpdate:
      maxUnavailable: 1  # Update 1 node at a time
      maxSurge: 0        # Don't create extra pods
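
Since at most `maxUnavailable` pods may be down at once, the setting puts a floor on how many sequential replacement steps a rollout needs. The arithmetic below is a back-of-the-envelope estimate under simplifying assumptions: `min_rollout_waves` is a hypothetical helper, `maxUnavailable` is an integer rather than a percentage, and `maxSurge` is 0. The controller does not work in strict batches, so treat the result as a lower bound, not a prediction.

```shell
# Hypothetical helper: lower bound on sequential replacement "waves".
# Usage: min_rollout_waves <node-count> <maxUnavailable>
min_rollout_waves() (
  nodes="$1"; max_unavailable="$2"
  # Ceiling division: ceil(nodes / max_unavailable)
  echo $(( (nodes + max_unavailable - 1) / max_unavailable ))
)
```

With 10 nodes, `maxUnavailable: 1` needs at least 10 waves and `maxUnavailable: 3` needs at least 4. Multiplying by the typical time a pod takes to become Ready gives a rough minimum rollout duration.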

Real-World Example: Fluentd Log Collector

yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd
      tolerations:
        # Allow scheduling onto control-plane nodes
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1.16
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch.logging"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: containers
              mountPath: /var/lib/docker/containers  # Docker runtime log location
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/lib/docker/containers

kubectl Commands

bash
# StatefulSets
kubectl get statefulsets
kubectl get sts
kubectl describe sts postgres
kubectl scale sts postgres --replicas=5
kubectl rollout status sts postgres
kubectl rollout undo sts postgres

# DaemonSets
kubectl get daemonsets
kubectl get ds
# node-exporter lives in the monitoring namespace in the manifest above
kubectl describe ds node-exporter -n monitoring
kubectl rollout status ds node-exporter -n monitoring

# See which pods are on which nodes
kubectl get pods -o wide

FAANG Interview Angle

Common questions:

  1. "When would you use a StatefulSet vs Deployment?"
  2. "How does persistent storage work with StatefulSets?"
  3. "What guarantees does a StatefulSet provide?"
  4. "What is a DaemonSet and when would you use it?"
  5. "How do you update a StatefulSet safely?"

Key answers:

  • StatefulSet for databases, message queues -- anything needing stable identity and storage
  • Each pod gets its own PVC via volumeClaimTemplates; PVCs survive pod deletion
  • Ordered deploy/scale/update, stable DNS names, dedicated persistent storage
  • DaemonSet ensures one pod per node; used for logging, monitoring, networking
  • Partition-based canary: set partition=N to only update pods with ordinal >= N

Official Links