AKS Identity and Security: Entra ID, Workload Identity, and Policy

AKS Identity and Security#

AKS identity operates at three levels: who can access the cluster API (authentication), what they can do inside it (authorization), and how pods authenticate to Azure services (workload identity). Each level has Azure-specific mechanisms that replace or extend vanilla Kubernetes patterns.

Entra ID Integration (Azure AD)#

AKS supports two Entra ID integration modes.

AKS-managed Azure AD: Enable with --enable-aad at cluster creation. AKS handles the app registrations and token validation. This is the recommended approach.

AKS Networking and Ingress Deep Dive

AKS Networking and Ingress#

AKS networking involves three layers: how pods communicate (CNI plugin), how traffic enters the cluster (load balancers and ingress controllers), and how the cluster connects to other Azure resources (VNet integration, private endpoints). Each layer has Azure-specific behavior that differs from generic Kubernetes.

Azure Load Balancer for Services#

When you create a Service of type LoadBalancer in AKS, Azure provisions a Standard SKU Azure Load Balancer. AKS manages the load balancer rules and health probes automatically.

AKS Setup and Configuration: Clusters, Node Pools, and Networking

AKS Setup and Configuration#

Azure Kubernetes Service handles the control plane for you – you pay nothing for it. What you configure is node pools, networking, identity, and add-ons. Getting these right at cluster creation matters because several choices (networking model, managed identity) cannot be changed later without rebuilding the cluster.

Creating a Cluster with az CLI#

The minimal command that produces a production-usable cluster:

az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --vnet-subnet-id /subscriptions/<sub>/resourceGroups/myapp-rg/providers/Microsoft.Network/virtualNetworks/myapp-vnet/subnets/aks-subnet \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id> \
  --generate-ssh-keys \
  --tier standard

Key flags: --network-plugin azure --network-plugin-mode overlay gives you Azure CNI Overlay, which avoids the IP exhaustion problems of classic Azure CNI. --tier standard enables the financially-backed SLA and uptime guarantees (the free tier has no SLA). --enable-aad integrates Entra ID (formerly Azure AD) for authentication.

AKS Troubleshooting: Diagnosing Common Azure Kubernetes Problems

AKS Troubleshooting#

AKS problems fall into categories: node pool operations stuck or failed, pods not scheduling, storage not provisioning, authentication broken, and ingress not working. Each has Azure-specific causes that generic Kubernetes debugging will not surface.

Node Pool Stuck in Updating or Failed#

Node pool operations (scaling, upgrading, changing settings) can get stuck. The AKS API reports the pool as “Updating” indefinitely or transitions to “Failed.”

# Check node pool provisioning state
az aks nodepool show \
  --resource-group myapp-rg \
  --cluster-name myapp-aks \
  --name workload \
  --query provisioningState

# Check the activity log for errors
az monitor activity-log list \
  --resource-group myapp-rg \
  --query "[?contains(operationName.value, 'Microsoft.ContainerService')].{op:operationName.value, status:status.value, msg:properties.statusMessage}" \
  --output table

Common causes and fixes:

Building Machine Images with Packer: Templates, Builders, Provisioners, and CI/CD

Building Machine Images with Packer#

Machine images (AMIs, Azure Managed Images, GCP Images) are the foundation of immutable infrastructure. Instead of provisioning a base OS and configuring it at boot, you build a pre-configured image and launch instances from it. Packer automates this process: it launches a temporary instance, runs provisioners to configure it, creates an image from the result, and destroys the temporary instance.

This operational sequence walks through building, testing, and managing machine images with Packer from template creation through CI/CD integration.

Comparing Serverless Platforms: Cloud Run, Azure Functions, Lambda, and Cloudflare Workers

Comparing Serverless Platforms#

Choosing a serverless platform is not about which one is “best.” Each platform makes different tradeoffs around cold start latency, execution limits, pricing granularity, and ecosystem integration. The right choice depends on what you are building, what cloud you already use, and which constraints matter most.

This framework compares the four major serverless compute platforms as of early 2026: AWS Lambda, Google Cloud Run, Azure Functions, and Cloudflare Workers.

Ephemeral Cloud Clusters: Create, Validate, Destroy Sequences for EKS, GKE, and AKS

Ephemeral Cloud Clusters#

Ephemeral clusters exist for one purpose: validate something, then disappear. They are not staging environments, not shared dev clusters, not long-lived resources that someone forgets to turn off. The operational model is strict – create, validate, destroy – and the entire sequence must be automated so that destruction cannot be forgotten.

The cost of getting this wrong is real. A three-node EKS cluster left running over a weekend costs roughly $15. Left running for a month, $200. Multiply by the number of developers or CI pipelines that create clusters, and forgotten ephemeral infrastructure becomes a significant budget line item. Every template in this article includes auto-destroy mechanisms to prevent this.

Multi-Cloud Networking Patterns

Multi-Cloud Networking Patterns#

Multi-cloud networking connects workloads across two or more cloud providers into a coherent network. The motivations vary – vendor redundancy, best-of-breed service selection, regulatory requirements – but the challenges are the same: private connectivity between isolated networks, consistent service discovery, and traffic routing that handles failures.

VPN Tunnels Between Clouds#

IPsec VPN tunnels are the simplest way to connect two cloud networks. Each provider offers managed VPN gateways that terminate IPsec tunnels, encrypting traffic between VPCs without exposing it to the public internet.

Running Windows Workloads on Kubernetes: Node Pools, Scheduling, and Gotchas

Running Windows Workloads on Kubernetes#

Kubernetes supports Windows worker nodes alongside Linux worker nodes in the same cluster. This enables running Windows-native applications – .NET Framework services, IIS-hosted applications, Windows-specific middleware – on Kubernetes without rewriting them for Linux. However, Windows nodes are not interchangeable with Linux nodes. There are fundamental differences in networking, storage, container runtime behavior, and resource management that you must account for.

Core Constraints#

Before adding Windows nodes, understand what is and is not supported:

Service Account Security: Tokens, RBAC Binding, and Workload Identity

Service Account Security#

Every pod in Kubernetes runs as a service account. By default, that is the default service account in the pod’s namespace, with an auto-mounted API token that never expires. This default configuration is overly permissive for most workloads. Hardening service accounts is one of the highest-impact security improvements you can make in a Kubernetes cluster.

The Default Problem#

When a pod starts without specifying a service account, Kubernetes does three things: