A Helm operator runs an upgrade with --reuse-values -f new-values.yaml. Helm reports success, increments the revision counter, and returns STATUS: deployed. The cluster behavior does not change. The new values file might as well not exist. This is a silent no-op upgrade — the load-bearing failure mode of --reuse-values — and it is one of several Day-2 Helm operations where the verbs look correct but the semantics are not what most operators assume. This article covers the flag combinations that bite, how to inspect any past revision, how rollback actually works, and the snapshot-before-upgrade discipline that turns Helm’s revision storage into a real disaster-recovery backstop.
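The snapshot-before-upgrade discipline mentioned above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `snapshot_release`, the `./helm-snapshots` directory, and the file naming are hypothetical, not something the article prescribes. It relies only on `helm get values` and `helm get manifest`.

```shell
# Sketch: save a release's current values and manifests before upgrading.
# snapshot_release and the output layout are hypothetical names for illustration.
snapshot_release() {
  release="$1"
  namespace="$2"
  outdir="${3:-./helm-snapshots}"
  stamp="$(date +%Y%m%dT%H%M%S)"
  mkdir -p "$outdir" || return 1
  # Computed values (chart defaults merged with user-supplied overrides).
  helm get values "$release" --namespace "$namespace" --all --output yaml \
    > "$outdir/$release-$stamp-values.yaml" || return 1
  # Rendered manifests for the currently deployed revision.
  helm get manifest "$release" --namespace "$namespace" \
    > "$outdir/$release-$stamp-manifest.yaml" || return 1
  # Print the common path prefix of the two snapshot files.
  echo "$outdir/$release-$stamp"
}
```

A typical use would be to run `snapshot_release myapp default` immediately before `helm upgrade`, then compare the saved values against `helm get values myapp --revision N` for the new revision to confirm the upgrade changed what it was supposed to.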
Minikube docker-env: Building Images Directly into the Cluster Runtime
eval $(minikube docker-env) repoints the shell’s Docker client at the daemon running inside the minikube VM. A docker build afterwards lands the image directly in the cluster’s container store, so pods can pull it without a registry. The pattern is correct but unforgiving: every failure mode looks like a different problem (image pull error, runtime crash, stale pod) and only a handful of them actually point back to the env-var setup.
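Because so many of those failures trace back to a shell that was never repointed, a guard before each build can help. The sketch below assumes the variables that `minikube docker-env` exports (`DOCKER_HOST` and `MINIKUBE_ACTIVE_DOCKERD`); the function name `docker_env_active` and the image tag in the usage comment are hypothetical.

```shell
# Sketch: check whether this shell's Docker client targets minikube's daemon.
# docker_env_active is a hypothetical helper; it only inspects environment variables.
docker_env_active() {
  [ -n "${DOCKER_HOST:-}" ] && [ -n "${MINIKUBE_ACTIVE_DOCKERD:-}" ]
}

# Usage (run in the same shell session as the build):
#   eval $(minikube docker-env)
#   docker_env_active && docker build -t myapp:dev .
```

The check is per-shell: a new terminal tab starts without these variables, so a build there lands in the host daemon instead. Pods that use the image also need an `imagePullPolicy` of `IfNotPresent` or `Never`, since a `:latest` tag defaults to `Always` and triggers a registry pull.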
Running Kubernetes on Apple Silicon: Setup, Gotchas, Recovery
A minikube cluster on Apple Silicon looks like a pure Kubernetes problem until the first Docker Desktop crash. The failure modes that bite hardest on M-series Macs live one layer below the cluster: in Docker Desktop’s memory allocator, in QEMU’s address-space layout, and in the destructive default of minikube delete. None of these are mentioned in the standard minikube setup guide, and all three will eat real workload state when they fire. This is the operational layer on top of minikube setup and drivers and ARM64 K8s images — the host-side discipline that keeps the cluster alive.
AKS Identity and Security: Entra ID, Workload Identity, and Policy
AKS Identity and Security#
AKS identity operates at three levels: who can access the cluster API (authentication), what they can do inside it (authorization), and how pods authenticate to Azure services (workload identity). Each level has Azure-specific mechanisms that replace or extend vanilla Kubernetes patterns.
Entra ID Integration (Azure AD)#
AKS supports two Entra ID integration modes.
AKS-managed Azure AD: Enable with --enable-aad at cluster creation. AKS handles the app registrations and token validation. This is the recommended approach.
AKS Networking and Ingress Deep Dive
AKS Networking and Ingress#
AKS networking involves three layers: how pods communicate (CNI plugin), how traffic enters the cluster (load balancers and ingress controllers), and how the cluster connects to other Azure resources (VNet integration, private endpoints). Each layer has Azure-specific behavior that differs from generic Kubernetes.
Azure Load Balancer for Services#
When you create a Service of type LoadBalancer in AKS, Azure provisions a Standard SKU Azure Load Balancer. AKS manages the load balancer rules and health probes automatically.
AKS Setup and Configuration: Clusters, Node Pools, and Networking
AKS Setup and Configuration#
Azure Kubernetes Service runs the control plane for you – on the Free tier you pay nothing for it. What you configure is node pools, networking, identity, and add-ons. Getting these right at cluster creation matters because several choices (networking model, managed identity) cannot be changed later without rebuilding the cluster.
Creating a Cluster with az CLI#
The minimal command that produces a production-usable cluster:
az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --vnet-subnet-id /subscriptions/<sub>/resourceGroups/myapp-rg/providers/Microsoft.Network/virtualNetworks/myapp-vnet/subnets/aks-subnet \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id> \
  --generate-ssh-keys \
  --tier standard
Key flags: --network-plugin azure --network-plugin-mode overlay gives you Azure CNI Overlay, which avoids the IP exhaustion problems of classic Azure CNI. --tier standard enables the financially-backed SLA and uptime guarantees (the free tier has no SLA). --enable-aad integrates Entra ID (formerly Azure AD) for authentication.
AKS Troubleshooting: Diagnosing Common Azure Kubernetes Problems
AKS Troubleshooting#
AKS problems fall into categories: node pool operations stuck or failed, pods not scheduling, storage not provisioning, authentication broken, and ingress not working. Each has Azure-specific causes that generic Kubernetes debugging will not surface.
Node Pool Stuck in Updating or Failed#
Node pool operations (scaling, upgrading, changing settings) can get stuck. The AKS API reports the pool as “Updating” indefinitely or transitions to “Failed.”
# Check node pool provisioning state
az aks nodepool show \
  --resource-group myapp-rg \
  --cluster-name myapp-aks \
  --name workload \
  --query provisioningState

# Check the activity log for errors
az monitor activity-log list \
  --resource-group myapp-rg \
  --query "[?contains(operationName.value, 'Microsoft.ContainerService')].{op:operationName.value, status:status.value, msg:properties.statusMessage}" \
  --output table
Common causes and fixes:
cert-manager and external-dns: Automatic TLS and DNS on Kubernetes
cert-manager and external-dns#
These two controllers solve the two most tedious parts of exposing services on Kubernetes: getting TLS certificates and creating DNS records. Together, they make it so that creating an Ingress resource automatically provisions a DNS record pointing to your cluster and a valid TLS certificate for the hostname.
cert-manager#
cert-manager watches for Certificate resources and Ingress annotations, then obtains and renews TLS certificates automatically.
Installation#
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set crds.enabled=true
The crds.enabled=true flag installs the CRDs as part of the Helm release. Verify with kubectl get pods -n cert-manager – you should see cert-manager, cert-manager-cainjector, and cert-manager-webhook all Running.
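For scripted installs, the manual pod check can be replaced with a blocking wait. The following is a minimal sketch, assuming the chart was installed into the cert-manager namespace as above; the function name `wait_for_cert_manager` and the 120s default timeout are hypothetical choices.

```shell
# Sketch: block until every cert-manager Deployment reports Available.
# wait_for_cert_manager is a hypothetical helper; the timeout default is arbitrary.
wait_for_cert_manager() {
  timeout="${1:-120s}"
  kubectl wait deployment --all \
    --namespace cert-manager \
    --for=condition=Available \
    --timeout="$timeout"
}
```

Calling `wait_for_cert_manager 60s` returns non-zero if any of the three Deployments is still unavailable when the timeout expires, which makes it usable as a gate before creating Issuer or Certificate resources.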
Choosing a CNI Plugin: Calico vs Cilium vs Flannel vs Cloud-Native CNI
Choosing a CNI Plugin#
The Container Network Interface (CNI) plugin is one of the most consequential infrastructure decisions in a Kubernetes cluster. It determines how pods get IP addresses, how traffic flows between them, whether network policies are enforced, and what observability you get into network behavior. Changing CNI after deployment is painful – it typically requires draining and rebuilding nodes, or rebuilding the cluster entirely. Choose carefully up front.
Choosing an Autoscaling Strategy: HPA vs VPA vs KEDA vs Karpenter/Cluster Autoscaler
Choosing an Autoscaling Strategy#
Kubernetes autoscaling operates at two distinct layers: pod-level scaling changes how many pods run or how large they are, while node-level scaling changes how many nodes exist in the cluster to host those pods. Getting the right combination of tools at each layer is the key to a system that responds to demand without wasting resources.
The Two Scaling Layers#
Understanding which layer a tool operates on prevents the most common misconfiguration – expecting pod-level scaling to solve node-level capacity problems, or vice versa.