---
title: "AKS Setup and Configuration: Clusters, Node Pools, and Networking"
description: "How to create and configure Azure Kubernetes Service clusters using az CLI, Terraform, and Bicep, covering node pools, networking models, identity, and add-ons."
url: https://agent-zone.ai/knowledge/kubernetes/aks-setup-and-configuration/
section: knowledge
date: 2026-02-22
categories: ["kubernetes"]
tags: ["aks","azure","terraform","bicep","node-pools","azure-cni","managed-identity"]
skills: ["aks-cluster-creation","node-pool-management","azure-networking"]
tools: ["az","terraform","kubectl"]
levels: ["intermediate"]
word_count: 849
formats:
  json: https://agent-zone.ai/knowledge/kubernetes/aks-setup-and-configuration/index.json
  html: https://agent-zone.ai/knowledge/kubernetes/aks-setup-and-configuration/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=AKS+Setup+and+Configuration%3A+Clusters%2C+Node+Pools%2C+and+Networking
---


# AKS Setup and Configuration

Azure Kubernetes Service manages the control plane for you: the Free tier has no control plane charge, while the Standard tier adds a per-cluster fee in exchange for an uptime SLA. What you configure is node pools, networking, identity, and add-ons. Getting these right at cluster creation matters because several choices, most notably the networking model, cannot be changed later without significant disruption or a cluster rebuild.

## Creating a Cluster with az CLI

The minimal command that produces a production-usable cluster:

```bash
az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --vnet-subnet-id /subscriptions/<sub>/resourceGroups/myapp-rg/providers/Microsoft.Network/virtualNetworks/myapp-vnet/subnets/aks-subnet \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id> \
  --generate-ssh-keys \
  --tier standard
```

Key flags: `--network-plugin azure --network-plugin-mode overlay` gives you Azure CNI Overlay, which avoids the IP exhaustion problems of classic Azure CNI. `--tier standard` enables the financially-backed SLA and uptime guarantees (the free tier has no SLA). `--enable-aad` integrates Entra ID (formerly Azure AD) for authentication.
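
The command above assumes the VNet and subnet already exist. A minimal sketch of creating them first, using the same resource names as the example (the address ranges are illustrative; size them for your own address plan):

```bash
# Resource group and VNet for the cluster; CIDRs are placeholders
az group create --name myapp-rg --location eastus2

az network vnet create \
  --resource-group myapp-rg \
  --name myapp-vnet \
  --address-prefixes 10.10.0.0/16 \
  --subnet-name aks-subnet \
  --subnet-prefixes 10.10.1.0/24
```

With CNI Overlay, only nodes (not pods) consume addresses from this subnet, so a /24 is enough for roughly 250 nodes.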

## Terraform Approach

For repeatable infrastructure, Terraform with the `azurerm` provider is the standard:

```hcl
resource "azurerm_kubernetes_cluster" "aks" {
  name                = "myapp-aks"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  dns_prefix          = "myapp"
  sku_tier            = "Standard"

  default_node_pool {
    name                         = "system"
    node_count                   = 3
    vm_size                      = "Standard_D4s_v5"
    vnet_subnet_id               = azurerm_subnet.aks.id
    only_critical_addons_enabled = true
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin      = "azure"
    network_plugin_mode = "overlay"
    pod_cidr            = "10.244.0.0/16"
  }

  azure_active_directory_role_based_access_control {
    azure_rbac_enabled     = true
    admin_group_object_ids = [var.admin_group_id]
  }
}
```

Setting `only_critical_addons_enabled = true` on the default (system) node pool taints it with `CriticalAddonsOnly=true:NoSchedule`, preventing application workloads from landing on system nodes. You then add a separate user node pool for your workloads.
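
Once the cluster is up, a quick way to confirm the taint landed (assuming kubectl access is already configured):

```bash
# System nodes should show the CriticalAddonsOnly=true:NoSchedule taint
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints'
```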

## Node Pools: System vs User

AKS requires at least one system node pool, which hosts cluster-critical kube-system components such as CoreDNS and metrics-server. Separate user pools run your applications. This separation prevents your workloads from starving system components.

```bash
# Add a user pool for application workloads
az aks nodepool add \
  --resource-group myapp-rg \
  --cluster-name myapp-aks \
  --name workload \
  --node-count 5 \
  --node-vm-size Standard_D8s_v5 \
  --mode User \
  --labels environment=production tier=app

# Add a spot instance pool for batch/dev workloads (up to 90% cheaper)
az aks nodepool add \
  --resource-group myapp-rg \
  --cluster-name myapp-aks \
  --name spotnodes \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --mode User \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1
```

Spot nodes can be evicted at any time. Use them for fault-tolerant workloads (batch jobs, CI runners, dev environments). Set `--spot-max-price -1` to accept any price up to the on-demand rate. Spot pools automatically get a `kubernetes.azure.com/scalesetpriority=spot:NoSchedule` taint, so only pods with a matching toleration will schedule there.
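
As a sketch of what a spot-friendly workload looks like (the deployment name and image are placeholders), the toleration must match the automatic taint and a node selector keeps the pods on the spot pool:

```bash
# Hypothetical batch workload pinned to spot nodes via the AKS-applied label and taint
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      nodeSelector:
        kubernetes.azure.com/scalesetpriority: spot
      tolerations:
        - key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
          effect: NoSchedule
      containers:
        - name: worker
          image: myappcr.azurecr.io/batch-worker:latest  # placeholder image
EOF
```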

## Networking Models

AKS supports four main networking configurations. The choice is effectively permanent for the life of the cluster, so plan it up front.

**Kubenet:** Pods get IPs from an address space separate from the VNet, and traffic leaving the node is NAT'd to the node's IP. Simpler and uses fewer VNet IPs, but pods are not directly addressable from outside the cluster and Azure Network Policy is not supported (Calico is).

**Azure CNI (traditional):** Every pod gets an IP from the Azure VNet subnet, which enables direct communication with other Azure resources. The catch is IP consumption: addresses are pre-allocated per node for the maximum pod count, so 3 nodes with the default of 30 pods each reserve 90+ IPs, and a /24 subnet offers only 251 usable addresses in Azure. Subnets must be sized for the maximum possible pod count.

**Azure CNI Overlay:** Pods get IPs from a private CIDR (default 10.244.0.0/16) overlaid on top of the VNet. Nodes still get VNet IPs, but pods do not consume subnet addresses. This is the best default for most new clusters -- you get Azure CNI features without IP exhaustion.

**Azure CNI with Cilium:** Uses Cilium as the dataplane instead of the default Azure networking. Gives you Cilium network policies, Hubble observability, and eBPF-based networking. Enable it with `--network-dataplane cilium`.
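
For reference, a sketch of the creation flags for the Cilium dataplane (the subnet, Entra ID, and tier flags from the first example apply here as well):

```bash
# Azure CNI Overlay with Cilium as the dataplane; other flags as in the first example
az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-dataplane cilium \
  --enable-managed-identity \
  --generate-ssh-keys
```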

## Managed Identity and Azure AD Integration

AKS uses a managed identity to interact with Azure APIs (pull images from ACR, manage load balancers, attach disks). Prefer managed identity over a service principal: the credentials are rotated automatically by Azure and there is no secret for you to manage.

Attach ACR to your cluster so nodes can pull images without explicit credentials:

```bash
az aks update \
  --resource-group myapp-rg \
  --name myapp-aks \
  --attach-acr myappcr
```

This assigns the AcrPull role to the cluster's kubelet identity on your Azure Container Registry.
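
To verify, you can look up the kubelet identity and list its role assignments (a verification sketch; `myappcr` is the placeholder registry from above):

```bash
# The kubelet (node) identity is what actually pulls images
KUBELET_ID=$(az aks show --resource-group myapp-rg --name myapp-aks \
  --query identityProfile.kubeletidentity.objectId -o tsv)

# Expect AcrPull among the roles, scoped to the registry
az role assignment list --assignee "$KUBELET_ID" --query "[].roleDefinitionName" -o tsv
```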

## Essential Add-Ons

Enable these at creation time or immediately after:

```bash
# Container Insights (monitoring)
az aks enable-addons --resource-group myapp-rg --name myapp-aks \
  --addons monitoring --workspace-resource-id /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.OperationalInsights/workspaces/<workspace>

# Azure Policy (enforce guardrails)
az aks enable-addons --resource-group myapp-rg --name myapp-aks \
  --addons azure-policy

# Azure Key Vault Secrets Provider
az aks enable-addons --resource-group myapp-rg --name myapp-aks \
  --addons azure-keyvault-secrets-provider
```

The monitoring add-on deploys the Azure Monitor agent (formerly the OMS agent) as a DaemonSet that ships logs and metrics to a Log Analytics workspace. Azure Policy installs Gatekeeper and syncs Azure Policy definitions as constraint templates. The Key Vault provider installs the Secrets Store CSI driver configured for Azure Key Vault.
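
To confirm the add-ons are active, inspect the cluster's addon profiles and the pods they deploy:

```bash
# Show which add-on profiles are enabled
az aks show --resource-group myapp-rg --name myapp-aks --query addonProfiles -o json

# Monitoring agent and CSI driver pods land in kube-system; Gatekeeper gets its own namespace
kubectl get pods -n kube-system
kubectl get pods -n gatekeeper-system
```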

## Getting Credentials

After cluster creation, get kubectl credentials:

```bash
# Admin credentials (bypasses Azure AD)
az aks get-credentials --resource-group myapp-rg --name myapp-aks --admin

# User credentials (requires Azure AD login)
az aks get-credentials --resource-group myapp-rg --name myapp-aks

# With Azure AD, you also need kubelogin
az aks install-cli  # installs kubectl and kubelogin
kubelogin convert-kubeconfig -l azurecli
```

Use `--admin` only for initial setup or break-glass scenarios. For day-to-day use, Azure AD credentials ensure audit logging and RBAC enforcement.
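
With Azure RBAC enabled (as in the Terraform example), day-to-day access is typically granted through built-in Azure roles rather than hand-written ClusterRoleBindings. A sketch, where the assignee is a placeholder:

```bash
# Grant a user read/write access to Kubernetes objects in this cluster via Azure RBAC
AKS_ID=$(az aks show --resource-group myapp-rg --name myapp-aks --query id -o tsv)

az role assignment create \
  --role "Azure Kubernetes Service RBAC Writer" \
  --assignee user@example.com \
  --scope "$AKS_ID"
```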

