Minikube Production Profile: Configuring a Local Cluster That Behaves Like Production

Why Production Parity Matters Locally#

The most expensive bugs are the ones you find after deploying to production. A minikube cluster with default settings lacks ingress, metrics, and resource enforcement – so your app works locally and breaks in staging. The goal is to configure minikube so that anything that works on it has a high probability of working on a real cluster.
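
A sketch of what closing those gaps looks like at startup; the resource sizes and Kubernetes version here are assumptions, so match them to your machine and your target cluster:

# Start minikube with production-like capacity and the add-ons
# a default cluster lacks. Sizes are illustrative.
minikube start \
  --cpus 4 \
  --memory 8192 \
  --kubernetes-version v1.30.0 \
  --addons ingress,metrics-server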

Choosing Your Local Kubernetes Tool#

Before configuring minikube, decide whether it is the right tool.

Prometheus and Grafana on Minikube: Production-Like Monitoring Without the Cost

Why Monitor a POC Cluster#

Monitoring on minikube serves two purposes. First, it catches resource problems early – your app might work in tests but get OOM-killed under load, and you will not know without metrics. Second, it validates that your monitoring configuration works before you deploy it to production. If your ServiceMonitors, dashboards, and alert rules work on minikube, they will work on EKS or GKE.

The Right Chart: kube-prometheus-stack#

There are multiple Prometheus-related Helm charts. Use the right one:
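
For a full monitoring stack, that chart is kube-prometheus-stack, which bundles the Prometheus Operator, Alertmanager, Grafana, node-exporter, and kube-state-metrics. A minimal install sketch; the release name and namespace are assumptions:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace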

ArgoCD on Minikube: GitOps Deployments from Day One

Why GitOps on a POC Cluster#

Setting up ArgoCD on minikube is not about automating deployments for a local cluster – you could just run kubectl apply. The point is to prove the deployment workflow before production. If your Git repo structure, Helm values, and sync policies work on minikube, they will work on EKS or GKE. If you skip this and bolt on GitOps later, you will spend days restructuring your repo and debugging sync failures under production pressure.
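
The workflow you are proving boils down to a plain Application manifest. A minimal sketch, where the repo URL, path, and namespaces are placeholders for your own:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp                # placeholder name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/myapp-config   # placeholder repo
    targetRevision: main
    path: deploy/overlays/dev                          # placeholder path
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated:
      prune: true            # delete resources removed from Git
      selfHeal: true         # revert manual drift

The same manifest works unchanged against a production cluster; only the destination changes.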

Minikube Application Deployment Patterns: Production-Ready Manifests for Four Common Workloads

Choosing the Right Workload Type#

Every application fits one of four deployment patterns. Choosing the wrong one creates problems that are hard to fix later: a database deployed as a Deployment loses its data when the pod is rescheduled, and a batch job run as a Deployment wastes resources running 24/7.

Pattern            Kubernetes Resource                   Use When
Stateless web app  Deployment + Service + Ingress        HTTP APIs, frontends, microservices
Stateful app       StatefulSet + Headless Service + PVC  Databases, caches with persistence, message brokers
Background worker  Deployment (no Service)               Queue consumers, event processors, stream readers
Batch processing   CronJob                               Scheduled reports, data cleanup, periodic syncs

Pattern 1: Stateless Web App#

A web API that can be scaled horizontally with no persistent state. Any pod can handle any request.
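
A minimal sketch of the pattern; the name, image, and ports are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api                        # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
    spec:
      containers:
      - name: web-api
        image: example/web-api:1.0.0   # placeholder image
        ports:
        - containerPort: 8080
        resources:
          requests:              # set requests so scheduling behaves
            cpu: 100m            # like a resource-enforced cluster
            memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
  name: web-api
spec:
  selector:
    app: web-api
  ports:
  - port: 80
    targetPort: 8080

An Ingress then routes external traffic to the Service by hostname or path.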

Secrets Management Decision Framework: From POC to Production

The Secret Zero Problem#

Every secrets management system has the same fundamental challenge: you need a secret to access your secrets. Your Vault token is itself a secret. Your AWS credentials for SSM Parameter Store are themselves secrets. This is the “secret zero” problem – there is always one secret that must be bootstrapped outside the system.

Understanding this helps you make pragmatic choices. No tool eliminates all risk. The goal is to reduce the blast radius and make rotation possible.

AKS Identity and Security: Entra ID, Workload Identity, and Policy

AKS Identity and Security#

AKS identity operates at three levels: who can access the cluster API (authentication), what they can do inside it (authorization), and how pods authenticate to Azure services (workload identity). Each level has Azure-specific mechanisms that replace or extend vanilla Kubernetes patterns.

Entra ID Integration (Azure AD)#

AKS supports two Entra ID integration modes.

AKS-managed Azure AD: Enable with --enable-aad at cluster creation. AKS handles the app registrations and token validation. This is the recommended approach.
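
It can also be switched on for an existing cluster. A sketch using az aks update, with the admin group object ID as a placeholder:

# Enable AKS-managed Entra ID on an existing cluster
az aks update \
  --resource-group myapp-rg \
  --name myapp-aks \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id>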

AKS Networking and Ingress Deep Dive

AKS Networking and Ingress#

AKS networking involves three layers: how pods communicate (CNI plugin), how traffic enters the cluster (load balancers and ingress controllers), and how the cluster connects to other Azure resources (VNet integration, private endpoints). Each layer has Azure-specific behavior that differs from generic Kubernetes.

Azure Load Balancer for Services#

When you create a Service of type LoadBalancer in AKS, Azure provisions a Standard SKU Azure Load Balancer. AKS manages the load balancer rules and health probes automatically.
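
The default is a public load balancer. To get an internal one on your VNet instead, annotate the Service; a sketch where the name, selector, and ports are placeholders:

apiVersion: v1
kind: Service
metadata:
  name: myapp-internal
  annotations:
    # Ask AKS for an internal Azure Load Balancer instead of a public one
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080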

AKS Setup and Configuration: Clusters, Node Pools, and Networking

AKS Setup and Configuration#

Azure Kubernetes Service handles the control plane for you: it is free on the Free tier, and the Standard tier charges a small per-cluster fee in exchange for an uptime SLA. What you configure is node pools, networking, identity, and add-ons. Getting these right at cluster creation matters because several choices (networking model, managed identity) cannot be changed later without rebuilding the cluster.

Creating a Cluster with az CLI#

The minimal command that produces a production-usable cluster:

az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --vnet-subnet-id /subscriptions/<sub>/resourceGroups/myapp-rg/providers/Microsoft.Network/virtualNetworks/myapp-vnet/subnets/aks-subnet \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id> \
  --generate-ssh-keys \
  --tier standard

Key flags: --network-plugin azure --network-plugin-mode overlay gives you Azure CNI Overlay, which avoids the IP exhaustion problems of classic Azure CNI. --tier standard enables the financially backed SLA and uptime guarantees (the free tier has no SLA). --enable-aad integrates Entra ID (formerly Azure AD) for authentication.
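
Once the cluster exists, pull credentials and verify access. Because --enable-aad is on, kubectl needs kubelogin to complete the Entra ID flow; reusing the Azure CLI's login is the usual shortcut:

# Merge cluster credentials into ~/.kube/config
az aks get-credentials --resource-group myapp-rg --name myapp-aks

# Reuse the Azure CLI login for kubectl authentication
kubelogin convert-kubeconfig -l azurecli

kubectl get nodes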

AKS Troubleshooting: Diagnosing Common Azure Kubernetes Problems

AKS Troubleshooting#

AKS problems fall into categories: node pool operations stuck or failed, pods not scheduling, storage not provisioning, authentication broken, and ingress not working. Each has Azure-specific causes that generic Kubernetes debugging will not surface.

Node Pool Stuck in Updating or Failed#

Node pool operations (scaling, upgrading, changing settings) can get stuck. The AKS API reports the pool as “Updating” indefinitely or transitions to “Failed.”

# Check node pool provisioning state
az aks nodepool show \
  --resource-group myapp-rg \
  --cluster-name myapp-aks \
  --name workload \
  --query provisioningState

# Check the activity log for errors
az monitor activity-log list \
  --resource-group myapp-rg \
  --query "[?contains(operationName.value, 'Microsoft.ContainerService')].{op:operationName.value, status:status.value, msg:properties.statusMessage}" \
  --output table

Common causes and fixes:
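
Whatever the specific cause turns out to be, a first step that often clears a stuck or Failed state is a no-op update, which asks Azure to reconcile the cluster back to its goal state:

# A no-change update triggers reconciliation
az aks update \
  --resource-group myapp-rg \
  --name myapp-aks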

cert-manager and external-dns: Automatic TLS and DNS on Kubernetes

cert-manager and external-dns#

These two controllers solve the two most tedious parts of exposing services on Kubernetes: getting TLS certificates and creating DNS records. Together, they make creating an Ingress resource enough to automatically provision both a DNS record pointing at your cluster and a valid TLS certificate for the hostname.

cert-manager#

cert-manager watches for Certificate resources and Ingress annotations, then obtains and renews TLS certificates automatically.

Installation#

helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set crds.enabled=true

The crds.enabled=true flag installs the CRDs as part of the Helm release. Verify with kubectl get pods -n cert-manager – you should see cert-manager, cert-manager-cainjector, and cert-manager-webhook all Running.
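
With the pods running, the next piece is an issuer that tells cert-manager where certificates come from. A minimal ClusterIssuer sketch for Let's Encrypt with HTTP-01 validation; the issuer name, email, and ingress class are assumptions:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com                  # placeholder contact email
    privateKeySecretRef:
      name: letsencrypt-prod-account-key    # Secret holding the ACME account key
    solvers:
    - http01:
        ingress:
          ingressClassName: nginx           # assumes ingress-nginx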