Helm Gotchas: --reuse-values, Revisions, Rollback, and Disaster Recovery

May 7, 2026

Helm-Upgrade-Debugging, Helm-Rollback, Values-Snapshot, Release-History-Inspection

Helm, Rollback, Revisions, Upgrade, Disaster-Recovery, Debugging

A Helm operator runs an upgrade with --reuse-values -f new-values.yaml. Helm reports success, increments the revision counter, and returns STATUS: deployed. The cluster behavior does not change. The new values file might as well not exist. This is a silent no-op upgrade — the load-bearing failure mode of --reuse-values — and it is one of several Day-2 Helm operations where the verbs look correct but the semantics are not what most operators assume. This article covers the flag combinations that bite, how to inspect any past revision, how rollback actually works, and the snapshot-before-upgrade discipline that turns Helm’s revision storage into a real disaster-recovery backstop.
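The snapshot-before-upgrade discipline mentioned above can be sketched as a small shell helper. This is a minimal sketch, not the article's own script: the function name snapshot_release, the output directory, and the file naming are hypothetical. The helm subcommands it calls (get values, get manifest, history) are standard Helm 3 commands.

```shell
# Sketch: capture a release's current state before running an upgrade.
# snapshot_release and the ./helm-snapshots layout are hypothetical names.
snapshot_release() {
  local release="$1" namespace="$2" outdir="${3:-./helm-snapshots}"
  local stamp
  stamp="$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$outdir" || return 1

  # User-supplied values of the currently deployed revision.
  helm get values "$release" -n "$namespace" -o yaml \
    > "$outdir/$release-$stamp-values.yaml" || return 1

  # Fully rendered manifest, useful for diffing after the upgrade.
  helm get manifest "$release" -n "$namespace" \
    > "$outdir/$release-$stamp-manifest.yaml" || return 1

  # Revision history, so the rollback target is on record.
  helm history "$release" -n "$namespace" \
    > "$outdir/$release-$stamp-history.txt" || return 1

  # Print the common file prefix so callers can locate the snapshot.
  echo "$outdir/$release-$stamp"
}
```

A typical call would be snapshot_release myapp prod before the upgrade. Afterwards, helm get values myapp -n prod --revision N shows what any past revision was deployed with, and helm rollback myapp N -n prod returns to it; myapp, prod, and N are placeholders.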

Minikube docker-env: Building Images Directly into the Cluster Runtime

May 7, 2026

Kubernetes

Intermediate

Minikube-Image-Workflow, Container-Runtime-Debugging, Arm64-Image-Builds

Minikube, Docker, Image-Builds, Local-Development, Imagepullpolicy

Minikube, Docker, Kubectl

eval $(minikube docker-env) repoints the shell’s Docker client at the daemon running inside the minikube VM. A docker build afterwards lands the image directly in the cluster’s container store, so pods can run it without a registry, provided their imagePullPolicy is not Always. The pattern is correct but unforgiving: every failure mode looks like a different problem (image pull error, runtime crash, stale pod), and only a handful of them actually point back to the env-var setup.
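One way to keep the env-var setup from leaking into the rest of a shell session is to scope it to a subshell. The following is a sketch under that assumption; build_into_minikube is a hypothetical helper name, and the image tag and build context are whatever the caller passes in.

```shell
# Sketch: build an image into minikube's Docker daemon without leaving
# DOCKER_HOST and friends set in the calling shell.
# build_into_minikube is a hypothetical helper name.
build_into_minikube() {
  local image="$1" context="${2:-.}"
  (
    # Capture the export statements first so a minikube failure is detected.
    env_script="$(minikube docker-env)" || exit 1
    # Apply them inside this subshell only.
    eval "$env_script"
    # The Docker client now talks to the daemon inside minikube.
    docker build -t "$image" "$context"
  )
}
```

Called as build_into_minikube myapp:dev ., the image ends up in the cluster's store while the host shell keeps talking to its usual Docker daemon. Using a tag other than latest matters here: Kubernetes defaults imagePullPolicy to Always for latest or untagged images, which forces a registry pull and defeats the pattern.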

Running Kubernetes on Apple Silicon: Setup, Gotchas, Recovery

May 7, 2026

Kubernetes

Intermediate

Minikube-on-Mac-Setup, Docker-Desktop-Memory-Tuning, K8s-Cluster-Recovery

Minikube, Apple-Silicon, Arm64, Docker-Desktop, Macos, Jetsam, Disaster-Recovery

Minikube, Docker-Desktop, Kubectl, Helm

A minikube cluster on Apple Silicon looks like a pure Kubernetes problem until the first Docker Desktop crash. The failure modes that bite hardest on M-series Macs live one layer below the cluster: in Docker Desktop’s memory allocator, in QEMU’s address-space layout, and in the destructive default of minikube delete. None of these are mentioned in the standard minikube setup guide, and all three will eat real workload state when they fire. This is the operational layer on top of minikube setup and drivers and ARM64 K8s images — the host-side discipline that keeps the cluster alive.

AKS Identity and Security: Entra ID, Workload Identity, and Policy

February 22, 2026

Kubernetes

Intermediate

Aks-Identity-Management, Workload-Identity-Configuration, Secret-Management, Policy-Enforcement

Aks, Azure, Security, Entra-Id, Workload-Identity, Key-Vault, Azure-Policy, Rbac

Az, Kubectl, Kubelogin

AKS Identity and Security

AKS identity operates at three levels: who can access the cluster API (authentication), what they can do inside it (authorization), and how pods authenticate to Azure services (workload identity). Each level has Azure-specific mechanisms that replace or extend vanilla Kubernetes patterns.

Entra ID Integration (Azure AD)

AKS supports two Entra ID integration modes.

AKS-managed Azure AD: Enable with --enable-aad at cluster creation. AKS handles the app registrations and token validation. This is the recommended approach.

AKS Networking and Ingress Deep Dive

February 22, 2026

Kubernetes

Intermediate

Aks-Networking, Ingress-Configuration, Load-Balancer-Management, Dns-Integration

Aks, Azure, Networking, Ingress, Load-Balancer, Agic, Private-Link, Dns

Az, Kubectl, Helm

AKS Networking and Ingress

AKS networking involves three layers: how pods communicate (CNI plugin), how traffic enters the cluster (load balancers and ingress controllers), and how the cluster connects to other Azure resources (VNet integration, private endpoints). Each layer has Azure-specific behavior that differs from generic Kubernetes.

Azure Load Balancer for Services

When you create a Service of type LoadBalancer in AKS, Azure provisions a Standard SKU Azure Load Balancer. AKS manages the load balancer rules and health probes automatically.
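As an illustration of the paragraph above, the sketch below exposes an existing Deployment through a LoadBalancer Service and reads back the address Azure assigns. The helper name expose_via_lb and the default ports are hypothetical; kubectl expose and the jsonpath query on status.loadBalancer.ingress are standard kubectl usage.

```shell
# Sketch: expose a Deployment via a Service of type LoadBalancer and
# print the external IP once Azure has provisioned it.
# expose_via_lb and the default ports are hypothetical.
expose_via_lb() {
  local deploy="$1" port="${2:-80}" target_port="${3:-8080}"

  # Creates a Service named after the Deployment.
  kubectl expose deployment "$deploy" --type=LoadBalancer \
    --port="$port" --target-port="$target_port" || return 1

  # Empty output means provisioning has not finished yet; re-run the
  # kubectl get command to poll.
  kubectl get service "$deploy" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

For example, expose_via_lb web 80 8080 assumes a Deployment named web listening on container port 8080.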

AKS Setup and Configuration: Clusters, Node Pools, and Networking

February 22, 2026

Kubernetes

Intermediate

Aks-Cluster-Creation, Node-Pool-Management, Azure-Networking

Aks, Azure, Terraform, Bicep, Node-Pools, Azure-Cni, Managed-Identity

Az, Terraform, Kubectl

AKS Setup and Configuration

Azure Kubernetes Service runs the control plane for you. On the Free tier you pay nothing for it; the Standard tier adds a per-cluster charge in exchange for an SLA. What you configure is node pools, networking, identity, and add-ons. Getting these right at cluster creation matters because several choices (networking model, managed identity) cannot be changed later without rebuilding the cluster.

Creating a Cluster with az CLI

The minimal command that produces a production-usable cluster:

az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --vnet-subnet-id "/subscriptions/<sub>/resourceGroups/myapp-rg/providers/Microsoft.Network/virtualNetworks/myapp-vnet/subnets/aks-subnet" \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids "<admin-group-id>" \
  --generate-ssh-keys \
  --tier standard

Key flags: --network-plugin azure --network-plugin-mode overlay gives you Azure CNI Overlay, which avoids the IP exhaustion problems of classic Azure CNI. --tier standard enables the financially-backed SLA and uptime guarantees (the free tier has no SLA). --enable-aad integrates Entra ID (formerly Azure AD) for authentication.
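Because these choices are hard to change later, it can be worth confirming what the cluster actually ended up with. The sketch below is one way to do that; show_cluster_choices is a hypothetical helper name, and the property paths in the query (networkProfile.networkPluginMode, sku.tier, aadProfile.managed) are assumptions about the JSON returned by az aks show that should be checked against real output.

```shell
# Sketch: print the creation-time choices of an existing AKS cluster.
# show_cluster_choices is a hypothetical helper; the JMESPath property
# paths are assumptions about the shape of `az aks show` output.
show_cluster_choices() {
  local rg="$1" name="$2"
  az aks show --resource-group "$rg" --name "$name" \
    --query '{plugin: networkProfile.networkPlugin, mode: networkProfile.networkPluginMode, tier: sku.tier, managedAad: aadProfile.managed}' \
    --output table
}
```

With the names used in the command above, the call would be show_cluster_choices myapp-rg myapp-aks.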

AKS Troubleshooting: Diagnosing Common Azure Kubernetes Problems

February 22, 2026

Kubernetes

Intermediate

Aks-Troubleshooting, Node-Debugging, Storage-Diagnosis, Auth-Debugging

Aks, Azure, Troubleshooting, Debugging, Node-Pools, Storage, Authentication

Az, Kubectl, Kubelogin

AKS Troubleshooting

AKS problems fall into categories: node pool operations stuck or failed, pods not scheduling, storage not provisioning, authentication broken, and ingress not working. Each has Azure-specific causes that generic Kubernetes debugging will not surface.

Node Pool Stuck in Updating or Failed

Node pool operations (scaling, upgrading, changing settings) can get stuck. The AKS API reports the pool as “Updating” indefinitely or transitions to “Failed.”

# Check node pool provisioning state
az aks nodepool show \
  --resource-group myapp-rg \
  --cluster-name myapp-aks \
  --name workload \
  --query provisioningState

# Check the activity log for errors
az monitor activity-log list \
  --resource-group myapp-rg \
  --query "[?contains(operationName.value, 'Microsoft.ContainerService')].{op:operationName.value, status:status.value, msg:properties.statusMessage}" \
  --output table

Common causes and fixes:

cert-manager and external-dns: Automatic TLS and DNS on Kubernetes

February 22, 2026

Kubernetes

Intermediate

Cert-Manager-Setup, External-Dns-Configuration, Tls-Automation, Dns-Automation

Cert-Manager, External-Dns, Tls, Dns, Lets-Encrypt, Ingress

Kubectl, Helm

cert-manager and external-dns

These two controllers solve the two most tedious parts of exposing services on Kubernetes: getting TLS certificates and creating DNS records. Together, they make it so that creating an Ingress resource automatically provisions a DNS record pointing to your cluster and a valid TLS certificate for the hostname.

cert-manager

cert-manager watches for Certificate resources and Ingress annotations, then obtains and renews TLS certificates automatically.

Installation

# Add the Jetstack chart repository and refresh the local chart index
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set crds.enabled=true

Setting crds.enabled=true installs the CRDs as part of the Helm release. Verify with kubectl get pods -n cert-manager: you should see cert-manager, cert-manager-cainjector, and cert-manager-webhook all Running.
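For scripted installs, the manual pod check can be replaced by waiting on the three Deployments. This is a sketch that assumes the release is named cert-manager as in the install command above, so the Deployments carry the same names as the pods listed; wait_for_cert_manager is a hypothetical helper name.

```shell
# Sketch: block until the cert-manager Deployments have rolled out.
# Assumes the Helm release is named "cert-manager"; the helper name
# wait_for_cert_manager is hypothetical.
wait_for_cert_manager() {
  local namespace="${1:-cert-manager}" timeout="${2:-120s}"
  local deploy
  for deploy in cert-manager cert-manager-cainjector cert-manager-webhook; do
    # rollout status exits non-zero if the timeout expires first.
    kubectl rollout status "deployment/$deploy" \
      -n "$namespace" --timeout="$timeout" || return 1
  done
}
```

Running wait_for_cert_manager right after helm install gives a single exit code to gate later steps on, such as creating an issuer.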

Choosing a CNI Plugin: Calico vs Cilium vs Flannel vs Cloud-Native CNI

February 22, 2026

Kubernetes

Intermediate

Network-Architecture, Cni-Selection, Network-Policy-Design

Cni, Calico, Cilium, Flannel, Networking, Network-Policy, Ebpf, Decision-Framework

Kubectl, Helm

Choosing a CNI Plugin

The Container Network Interface (CNI) plugin is one of the most consequential infrastructure decisions in a Kubernetes cluster. It determines how pods get IP addresses, how traffic flows between them, whether network policies are enforced, and what observability you get into network behavior. Changing CNI after deployment is painful – it typically requires draining and rebuilding nodes, or rebuilding the cluster entirely. Choose carefully up front.

Choosing an Autoscaling Strategy: HPA vs VPA vs KEDA vs Karpenter/Cluster Autoscaler

February 22, 2026

Kubernetes

Intermediate

Autoscaling-Selection, Capacity-Planning, Cost-Optimization

Autoscaling, Hpa, Vpa, Keda, Karpenter, Cluster-Autoscaler, Decision-Framework

Kubectl, Helm

Choosing an Autoscaling Strategy

Kubernetes autoscaling operates at two distinct layers: pod-level scaling changes how many pods run or how large they are, while node-level scaling changes how many nodes exist in the cluster to host those pods. Getting the right combination of tools at each layer is the key to a system that responds to demand without wasting resources.

The Two Scaling Layers

Understanding which layer a tool operates on prevents the most common misconfiguration – expecting pod-level scaling to solve node-level capacity problems, or vice versa.