Securing etcd: Encryption at Rest, TLS, and Access Control#

etcd is the single most critical component in a Kubernetes cluster. It stores everything: pod specs, secrets, configmaps, RBAC rules, service account tokens, and all cluster state. By default, Kubernetes secrets are stored in etcd as base64-encoded plaintext. Anyone with read access to etcd has read access to every secret in the cluster. Securing etcd is not optional.

Why etcd Is the Crown Jewel#

Run a raw read against an unencrypted etcd and you will see why. The sketch below assumes a kubeadm-style certificate layout, and the secret name is a placeholder:
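
  # Read a Secret straight out of etcd. Without encryption at rest, the value
  # comes back as plain (base64-decodable) bytes rather than ciphertext.
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    get /registry/secrets/default/my-secret

If the value prints as readable key material rather than starting with a k8s:enc: prefix, encryption at rest is not enabled.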

Upgrading Kubernetes Clusters Safely#

Kubernetes releases a new minor version roughly every four months, and only the three most recent minor releases receive security patches, so staying current is not optional. Skipping minor versions during an upgrade is also unsupported: every upgrade must step through each minor version sequentially.

Version Skew Policy#

The version skew policy defines which component version combinations are supported:

  • kube-apiserver instances within an HA cluster can differ by at most 1 minor version.
  • kubelet can be up to 3 minor versions older than kube-apiserver (changed from 2 in Kubernetes 1.28+), but never newer.
  • kube-controller-manager and kube-scheduler must not be newer than kube-apiserver and can be up to 1 minor version older.
  • kube-proxy must not be newer than kube-apiserver and, like kubelet, can be up to 3 minor versions older.
  • kubectl is supported within 1 minor version (older or newer) of kube-apiserver.

The practical consequence: always upgrade the control plane first, then node pools. Never upgrade nodes past the control plane version.
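
Before planning an upgrade, audit where every component currently sits. A minimal sketch (output formats vary slightly across versions):

  kubectl version       # prints the client (kubectl) and server (kube-apiserver) versions
  kubectl get nodes     # the VERSION column reports each node's kubelet version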

Upgrading Self-Managed Kubernetes Clusters with kubeadm: Step-by-Step#

Upgrading a kubeadm-managed cluster is a multi-step procedure that must be executed in a precise order. The control plane upgrades first, then worker nodes one at a time. Skipping steps or upgrading in the wrong order causes version skew violations that can break cluster communication.

This article provides the complete operational sequence. Execute each step in order. Do not skip ahead.
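
For orientation, the sequence on the first control-plane node condenses to roughly the following. This is a sketch, not the full procedure: it assumes a Debian/Ubuntu host using the Kubernetes apt repository, and the version 1.29.3 and node name cp-1 are placeholders.

  # 1. Upgrade the kubeadm binary first.
  apt-get update && apt-get install -y kubeadm='1.29.3-*'
  # 2. Review what the upgrade will change, then apply it.
  kubeadm upgrade plan
  kubeadm upgrade apply v1.29.3
  # 3. Drain the node, upgrade kubelet and kubectl, restart, uncordon.
  kubectl drain cp-1 --ignore-daemonsets
  apt-get install -y kubelet='1.29.3-*' kubectl='1.29.3-*'
  systemctl daemon-reload && systemctl restart kubelet
  kubectl uncordon cp-1

Subsequent control-plane nodes and workers run kubeadm upgrade node rather than kubeadm upgrade apply.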

Version Skew Policy#

Kubernetes enforces strict version compatibility rules between components. Violating these rules results in undefined behavior – sometimes things work, sometimes the API server rejects requests, sometimes components silently fail.

Kubernetes Disaster Recovery: Runbooks for Common Incidents#

These runbooks cover the incidents you will encounter in production Kubernetes environments. Each follows the same structure: detection, diagnosis, recovery, and prevention. Print these out, bookmark them, put them in your on-call wiki. When the alert fires at 2 AM, you want a checklist, not a tutorial.

Incident Response Framework#

Every incident follows the same cycle:

  1. Detect – monitoring alert, user report, or kubectl showing unhealthy state
  2. Assess – determine scope and severity. Is it one pod, one node, or the entire cluster?
  3. Contain – stop the bleeding. Prevent the issue from spreading
  4. Recover – restore normal operation
  5. Post-mortem – document what happened, why, and how to prevent it

Runbook 1: Node Goes NotReady#

Detection: Node condition changes to Ready=False. Pods managed by a controller (Deployments, StatefulSets) are evicted and rescheduled after the eviction timeout, five minutes by default. Monitoring alerts on node status.
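
A first-pass diagnosis sketch; the node name worker-1 is a placeholder, and the last two commands run on the node itself over SSH:

  kubectl get nodes                              # confirm which node is NotReady
  kubectl describe node worker-1                 # check Conditions and recent Events
  kubectl get pods -A --field-selector spec.nodeName=worker-1   # see affected workloads
  # On the node: is the kubelet running, and what is it logging?
  systemctl status kubelet
  journalctl -u kubelet --since "15 minutes ago"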

Managed Kubernetes vs Self-Managed: EKS/AKS/GKE vs kubeadm vs k3s vs RKE#

The fundamental tradeoff is straightforward: managed Kubernetes trades control for reduced operational burden, while self-managed Kubernetes gives you full control at the cost of owning everything – etcd, certificates, upgrades, high availability, and recovery.

This decision has cascading effects on team structure, hiring, on-call burden, and long-term maintenance cost. Choose deliberately.

Managed Kubernetes (EKS, AKS, GKE)#

The cloud provider runs the control plane: API server, etcd, controller manager, scheduler. They handle patching, scaling, and high availability for these components. You manage worker nodes and workloads.