AKS Identity and Security: Entra ID, Workload Identity, and Policy

AKS Identity and Security#

AKS identity operates at three levels: who can access the cluster API (authentication), what they can do inside it (authorization), and how pods authenticate to Azure services (workload identity). Each level has Azure-specific mechanisms that replace or extend vanilla Kubernetes patterns.
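
The third level, Azure Workload Identity, works through federated ServiceAccount tokens rather than injected credentials. As a minimal sketch of the Kubernetes side (the ServiceAccount name is illustrative; the client ID comes from your user-assigned managed identity):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp-sa          # illustrative name
  namespace: default
  annotations:
    azure.workload.identity/client-id: <managed-identity-client-id>

Pods that should exchange tokens for Azure credentials also carry the label azure.workload.identity/use: "true".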

Entra ID Integration (Azure AD)#

AKS supports two Entra ID integration modes.

AKS-managed Azure AD: Enable with --enable-aad at cluster creation. AKS handles the app registrations and token validation. This is the recommended approach.
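
On an existing cluster, the same mode can be enabled with az aks update – a sketch reusing the cluster names from the setup section, with the admin group object ID left as a placeholder:

# Enable AKS-managed Entra ID on an existing cluster
az aks update \
  --resource-group myapp-rg \
  --name myapp-aks \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id>

Members of the admin group get cluster-admin; everyone else needs explicit Kubernetes RBAC bindings.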

AKS Networking and Ingress Deep Dive

AKS Networking and Ingress#

AKS networking involves three layers: how pods communicate (CNI plugin), how traffic enters the cluster (load balancers and ingress controllers), and how the cluster connects to other Azure resources (VNet integration, private endpoints). Each layer has Azure-specific behavior that differs from generic Kubernetes.

Azure Load Balancer for Services#

When you create a Service of type LoadBalancer in AKS, Azure provisions a Standard SKU Azure Load Balancer. AKS manages the load balancer rules and health probes automatically.
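
A minimal manifest – names and ports are illustrative – looks like this. Omitting the annotation yields a public frontend IP; setting it to "true" provisions an internal, VNet-only frontend instead:

apiVersion: v1
kind: Service
metadata:
  name: myapp
  annotations:
    # AKS-specific: use an internal (VNet-only) load balancer frontend
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080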

AKS Setup and Configuration: Clusters, Node Pools, and Networking

AKS Setup and Configuration#

Azure Kubernetes Service handles the control plane for you – the Free tier control plane costs nothing, while the Standard tier adds a small per-cluster hourly charge in exchange for an SLA. What you configure is node pools, networking, identity, and add-ons. Getting these right at cluster creation matters because several choices (networking model, managed identity) cannot be changed later without rebuilding the cluster.

Creating a Cluster with az CLI#

The minimal command that produces a production-usable cluster:

az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --vnet-subnet-id /subscriptions/<sub>/resourceGroups/myapp-rg/providers/Microsoft.Network/virtualNetworks/myapp-vnet/subnets/aks-subnet \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id> \
  --generate-ssh-keys \
  --tier standard

Key flags: --network-plugin azure with --network-plugin-mode overlay gives you Azure CNI Overlay, which avoids the IP-exhaustion problems of classic Azure CNI. --tier standard enables the financially backed SLA and uptime guarantees (the Free tier has no SLA). --enable-aad integrates Entra ID (formerly Azure AD) for authentication.
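
Once the cluster exists, a quick sanity check is to pull credentials and confirm the nodes registered (same resource group and cluster name as above):

# Merge cluster credentials into ~/.kube/config, then list nodes
az aks get-credentials --resource-group myapp-rg --name myapp-aks
kubectl get nodes -o wide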

AKS Troubleshooting: Diagnosing Common Azure Kubernetes Problems

AKS Troubleshooting#

AKS problems fall into five broad categories: node pool operations stuck or failed, pods not scheduling, storage not provisioning, authentication broken, and ingress not working. Each has Azure-specific causes that generic Kubernetes debugging will not surface.

Node Pool Stuck in Updating or Failed#

Node pool operations (scaling, upgrading, changing settings) can get stuck. The AKS API reports the pool as “Updating” indefinitely or transitions to “Failed.”

# Check node pool provisioning state
az aks nodepool show \
  --resource-group myapp-rg \
  --cluster-name myapp-aks \
  --name workload \
  --query provisioningState

# Check the activity log for errors
az monitor activity-log list \
  --resource-group myapp-rg \
  --query "[?contains(operationName.value, 'Microsoft.ContainerService')].{op:operationName.value, status:status.value, msg:properties.statusMessage}" \
  --output table

Common causes and fixes:
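
One cause worth ruling out first is regional vCPU quota exhaustion, which routinely fails scale-up operations. A quick check of usage against limits, using the eastus2 region from the earlier examples:

# Show compute quota usage vs. limits for the region
az vm list-usage --location eastus2 --output table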

API Gateway Patterns: Selection, Configuration, and Routing

API Gateway Patterns#

An API gateway sits between clients and your backend services. It handles cross-cutting concerns – authentication, rate limiting, request transformation, routing – so your services do not have to. Choosing the right gateway and configuring it correctly is one of the first decisions in any microservices architecture.

Gateway Responsibilities#

Before selecting a gateway, clarify which responsibilities it should own:

  • Routing – directing requests to the correct backend service based on path, headers, or method.
  • Authentication and authorization – validating tokens, API keys, or certificates before requests reach backends.
  • Rate limiting – protecting backends from traffic spikes and enforcing usage quotas.
  • Request/response transformation – modifying headers, rewriting paths, converting between formats.
  • Load balancing – distributing traffic across service instances.
  • Observability – emitting metrics, logs, and traces for every request that passes through.
  • TLS termination – handling HTTPS so backends can speak plain HTTP internally.

No gateway does everything equally well. The right choice depends on which of these responsibilities matter most in your environment.
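
To make the routing and transformation responsibilities concrete, here is a sketch using the Kubernetes Gateway API as one possible implementation – the gateway and service names are hypothetical. It routes /orders traffic to a backend and strips the prefix on the way through:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: orders-route
spec:
  parentRefs:
  - name: main-gateway            # hypothetical Gateway resource
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /orders
    filters:
    - type: URLRewrite            # request transformation: strip the prefix
      urlRewrite:
        path:
          type: ReplacePrefixMatch
          replacePrefixMatch: /
    backendRefs:
    - name: orders-svc            # hypothetical backend Service
      port: 8080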

ArgoCD Image Updater: Automatic Image Tag Updates Without Git Commits

ArgoCD Image Updater#

ArgoCD Image Updater watches container registries for new image tags and automatically updates ArgoCD Applications to use them. In a standard GitOps workflow, updating an image tag requires a Git commit that changes the tag in a values file or manifest. Image Updater automates that step.

The Problem It Solves#

Standard GitOps image update flow:

CI builds image → pushes myapp:v1.2.3 to registry
    → Developer (or CI) commits "update image tag to v1.2.3" to Git
    → ArgoCD detects Git change
    → ArgoCD syncs new tag to cluster

That middle step – committing the tag update – is friction. CI pipelines need Git write access, commit messages are noise (“bump image to v1.2.4”, “bump image to v1.2.5”), and the delay between image push and deployment depends on how fast the commit pipeline runs.
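
Image Updater replaces that commit step with annotations on the Application resource. A minimal sketch – the registry path is a placeholder, the annotation keys are Image Updater's own:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
  annotations:
    argocd-image-updater.argoproj.io/image-list: myapp=registry.example.com/myapp
    argocd-image-updater.argoproj.io/myapp.update-strategy: semver
    argocd-image-updater.argoproj.io/write-back-method: git

With write-back-method: git, Image Updater writes the tag change back to the repository itself, so Git stays the source of truth even though no human authored the commit.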

ArgoCD Multi-Cluster Management: Hub-Spoke Patterns, Cluster Registration, and Fleet Operations

ArgoCD Multi-Cluster Management#

A single ArgoCD instance can manage deployments across dozens of Kubernetes clusters. This is one of ArgoCD’s strongest features and the standard approach for organizations with multiple environments, regions, or cloud providers.

Hub-Spoke Architecture#

The standard multi-cluster pattern runs ArgoCD on one “hub” cluster that deploys to multiple “spoke” clusters:

Hub Cluster (management)
├── ArgoCD control plane
├── Application/ApplicationSet definitions
├── RBAC policies
└── Cluster credentials (Secrets)
    │
    ├──→ Spoke Cluster: dev (us-east-1)
    ├──→ Spoke Cluster: staging (us-west-2)
    ├──→ Spoke Cluster: prod-us (us-east-1)
    ├──→ Spoke Cluster: prod-eu (eu-west-1)
    └──→ Spoke Cluster: prod-apac (ap-southeast-1)

ArgoCD on the hub cluster connects to each spoke cluster’s API server to apply manifests and check health. The spoke clusters do not need ArgoCD installed.
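
Registering a spoke is a single CLI command run against a kubeconfig context – here assuming a context named prod-eu:

# Creates a ServiceAccount on the spoke and stores its credentials
# as a cluster Secret in the hub's argocd namespace
argocd cluster add prod-eu

That stored Secret is what ArgoCD uses for every subsequent deploy to the cluster.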

ArgoCD Notifications: Slack, Teams, Webhooks, and Custom Triggers

ArgoCD Notifications#

ArgoCD Notifications is a built-in component (bundled with ArgoCD since v2.3) that monitors applications and sends alerts when specific events occur – sync succeeded, sync failed, health degraded, new version deployed. Before notifications existed, teams polled the ArgoCD UI or built custom watchers. The Notifications component eliminates that.

Architecture#

ArgoCD Notifications runs as a controller alongside the ArgoCD application controller. It watches Application resources for state changes and matches them against triggers. When a trigger fires, it renders a template and sends it through a configured service (Slack, Teams, webhook, email, etc.).
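
Triggers, templates, and services all live in the argocd-notifications-cm ConfigMap. A minimal sketch wiring a sync-failure trigger to Slack – the Slack token itself belongs in argocd-notifications-secret, referenced here as $slack-token:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  service.slack: |
    token: $slack-token
  template.app-sync-failed: |
    message: "Sync failed for {{.app.metadata.name}}"
  trigger.on-sync-failed: |
    - when: app.status.operationState.phase in ['Error', 'Failed']
      send: [app-sync-failed]

Applications then opt in with an annotation such as notifications.argoproj.io/subscribe.on-sync-failed.slack: devops-alerts (channel name illustrative).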

ArgoCD Patterns: App of Apps, ApplicationSets, Multi-Environment Management, and Source Strategies

ArgoCD Patterns#

Once ArgoCD is running and you have a few applications deployed, you hit a scaling problem: managing dozens or hundreds of Application resources by hand is unsustainable. These patterns solve that.

App of Apps#

The App of Apps pattern uses one ArgoCD Application to manage other Application resources. You create a “root” application that points to a directory containing Application YAML files. When ArgoCD syncs the root app, it creates all the child applications.
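
A minimal root application looks like this – the repository URL and path are placeholders for wherever your child Application manifests live:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/gitops-repo   # placeholder repo
    targetRevision: main
    path: apps/                                       # directory of Application YAMLs
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Adding or removing a child application is then just adding or removing a file under apps/.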

ArgoCD Secrets Management: Sealed Secrets, External Secrets Operator, and SOPS

ArgoCD Secrets Management#

GitOps says everything should be in Git. Kubernetes Secrets are base64-encoded, not encrypted. Committing base64 secrets to Git is equivalent to committing plaintext – anyone with repo access can decode them. This is the fundamental tension of GitOps secrets management.

Three approaches solve this, each with different tradeoffs.

Approach 1: Sealed Secrets#

Sealed Secrets encrypts secrets client-side so the encrypted form can be safely committed to Git. Only the Sealed Secrets controller running in-cluster can decrypt them.
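
The workflow, sketched with the kubeseal CLI (file names are illustrative):

# Encrypt a plain Secret manifest with the controller's public key
kubeseal --format yaml < db-secret.yaml > db-sealedsecret.yaml

The resulting SealedSecret manifest is safe to commit to Git; only the controller holds the private key, and it decrypts the resource back into a regular Secret when it reconciles it inside the cluster.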