Azure DevOps Pipelines: YAML Pipelines, Templates, Service Connections, and AKS Integration

Azure DevOps Pipelines#

Azure DevOps Pipelines uses YAML files stored in your repository to define build and deployment workflows. The pipeline model has three levels: stages contain jobs, jobs contain steps. This hierarchy maps directly to how you think about CI/CD – build stage, test stage, deploy-to-staging stage, deploy-to-production stage – with each stage containing one or more parallel jobs.

Pipeline Structure#

A complete pipeline in azure-pipelines.yml:

trigger:
  branches:
    include:
      - main
      - release/*
  paths:
    exclude:
      - docs/**
      - README.md

pool:
  vmImage: 'ubuntu-latest'

variables:
  - group: common-vars
  - name: buildConfiguration
    value: 'Release'

stages:
  - stage: Build
    jobs:
      - job: BuildApp
        steps:
          - task: GoTool@0
            inputs:
              version: '1.22'
          - script: |
              go build -o $(Build.ArtifactStagingDirectory)/myapp ./cmd/myapp
            displayName: 'Build binary'
          - publish: $(Build.ArtifactStagingDirectory)
            artifact: drop

  - stage: Test
    dependsOn: Build
    jobs:
      - job: UnitTests
        steps:
          - task: GoTool@0
            inputs:
              version: '1.22'
          - script: go test ./... -v -coverprofile=coverage.out
            displayName: 'Run tests'
          - task: PublishCodeCoverageResults@2
            inputs:
              summaryFileLocation: coverage.out
              codecoverageTool: 'Cobertura'

  - stage: DeployStaging
    dependsOn: Test
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployToStaging
        environment: staging
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: drop
                - script: echo "Deploying to staging"

trigger controls which branches and paths trigger the pipeline. dependsOn creates stage ordering. condition adds logic – succeeded() checks the previous stage passed, and you can combine it with variable checks to restrict certain stages to specific branches.

AKS Identity and Security: Entra ID, Workload Identity, and Policy

AKS Identity and Security#

AKS identity operates at three levels: who can access the cluster API (authentication), what they can do inside it (authorization), and how pods authenticate to Azure services (workload identity). Each level has Azure-specific mechanisms that replace or extend vanilla Kubernetes patterns.

Entra ID Integration (Azure AD)#

AKS supports two Entra ID integration modes.

AKS-managed Azure AD: Enable with --enable-aad at cluster creation. AKS handles the app registrations and token validation. This is the recommended approach.

AKS Networking and Ingress Deep Dive

AKS Networking and Ingress#

AKS networking involves three layers: how pods communicate (CNI plugin), how traffic enters the cluster (load balancers and ingress controllers), and how the cluster connects to other Azure resources (VNet integration, private endpoints). Each layer has Azure-specific behavior that differs from generic Kubernetes.

Azure Load Balancer for Services#

When you create a Service of type LoadBalancer in AKS, Azure provisions a Standard SKU Azure Load Balancer. AKS manages the load balancer rules and health probes automatically.

AKS Setup and Configuration: Clusters, Node Pools, and Networking

AKS Setup and Configuration#

Azure Kubernetes Service handles the control plane for you – you pay nothing for it. What you configure is node pools, networking, identity, and add-ons. Getting these right at cluster creation matters because several choices (networking model, managed identity) cannot be changed later without rebuilding the cluster.

Creating a Cluster with az CLI#

The minimal command that produces a production-usable cluster:

az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --vnet-subnet-id /subscriptions/<sub>/resourceGroups/myapp-rg/providers/Microsoft.Network/virtualNetworks/myapp-vnet/subnets/aks-subnet \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id> \
  --generate-ssh-keys \
  --tier standard

Key flags: --network-plugin azure --network-plugin-mode overlay gives you Azure CNI Overlay, which avoids the IP exhaustion problems of classic Azure CNI. --tier standard enables the financially-backed SLA and uptime guarantees (the free tier has no SLA). --enable-aad integrates Entra ID (formerly Azure AD) for authentication.

AKS Troubleshooting: Diagnosing Common Azure Kubernetes Problems

AKS Troubleshooting#

AKS problems fall into categories: node pool operations stuck or failed, pods not scheduling, storage not provisioning, authentication broken, and ingress not working. Each has Azure-specific causes that generic Kubernetes debugging will not surface.

Node Pool Stuck in Updating or Failed#

Node pool operations (scaling, upgrading, changing settings) can get stuck. The AKS API reports the pool as “Updating” indefinitely or transitions to “Failed.”

# Check node pool provisioning state
az aks nodepool show \
  --resource-group myapp-rg \
  --cluster-name myapp-aks \
  --name workload \
  --query provisioningState

# Check the activity log for errors
az monitor activity-log list \
  --resource-group myapp-rg \
  --query "[?contains(operationName.value, 'Microsoft.ContainerService')].{op:operationName.value, status:status.value, msg:properties.statusMessage}" \
  --output table

Common causes and fixes:

Azure Fundamentals for Agents

Resource Groups and Subscriptions#

Azure organizes resources in a hierarchy: Management Groups > Subscriptions > Resource Groups > Resources. An agent interacts most frequently with resource groups and the resources inside them.

A resource group is a logical container. Every Azure resource belongs to exactly one resource group. Resource groups are regional, but can contain resources from any region. They serve as the deployment boundary, the access control boundary, and the lifecycle boundary – deleting a resource group deletes everything inside it.

Cloud-Native vs Portable Infrastructure: A Decision Framework

Cloud-Native vs Portable Infrastructure#

Every infrastructure decision sits on a spectrum between portability and fidelity. On one end, you have generic Kubernetes running on minikube or kind – it works everywhere, costs nothing, and captures the behavior of the Kubernetes API itself. On the other end, you have cloud-native managed services – EKS with IRSA and ALB Ingress Controller, GKE with Workload Identity and Cloud Load Balancing, AKS with Azure AD Pod Identity and Azure Load Balancer. These capture the behavior of the actual platform your workloads will run on.

Ephemeral Cloud Clusters: Create, Validate, Destroy Sequences for EKS, GKE, and AKS

Ephemeral Cloud Clusters#

Ephemeral clusters exist for one purpose: validate something, then disappear. They are not staging environments, not shared dev clusters, not long-lived resources that someone forgets to turn off. The operational model is strict – create, validate, destroy – and the entire sequence must be automated so that destruction cannot be forgotten.

The cost of getting this wrong is real. A three-node EKS cluster left running over a weekend costs roughly $15. Left running for a month, $200. Multiply by the number of developers or CI pipelines that create clusters, and forgotten ephemeral infrastructure becomes a significant budget line item. Every template in this article includes auto-destroy mechanisms to prevent this.

Minikube to Cloud Migration: 10 Things That Change on EKS, GKE, and AKS

Minikube to Cloud Migration Guide#

Minikube is excellent for learning and local development. But almost everything that “just works” on minikube requires explicit configuration on a cloud cluster. Here are the 10 things that change.

1. Ingress Controller Becomes a Cloud Load Balancer#

On minikube: You enable the NGINX ingress addon with minikube addons enable ingress. Traffic reaches your services through minikube tunnel or minikube service.

On cloud: The ingress controller must be deployed explicitly, and it provisions a real cloud load balancer. On AWS, the AWS Load Balancer Controller creates ALBs or NLBs from Ingress resources. On GKE, the built-in GCE ingress controller creates Google Cloud Load Balancers. You pay per load balancer.

Running Windows Workloads on Kubernetes: Node Pools, Scheduling, and Gotchas

Running Windows Workloads on Kubernetes#

Kubernetes supports Windows worker nodes alongside Linux worker nodes in the same cluster. This enables running Windows-native applications – .NET Framework services, IIS-hosted applications, Windows-specific middleware – on Kubernetes without rewriting them for Linux. However, Windows nodes are not interchangeable with Linux nodes. There are fundamental differences in networking, storage, container runtime behavior, and resource management that you must account for.

Core Constraints#

Before adding Windows nodes, understand what is and is not supported: