Secrets Management in CI/CD Pipelines: OIDC, Vault Integration, and Credential Hygiene

Secrets Management in CI/CD Pipelines#

Every CI/CD pipeline needs credentials: registry tokens, cloud provider keys, database passwords, API keys for third-party services. How you store, deliver, and scope those credentials determines whether a single compromised pipeline job can escalate into a full infrastructure breach. The difference between a mature and an immature pipeline is rarely in the build steps – it is in the secrets management.

The Problem with Static Secrets#

The default approach on every CI platform is storing secrets as encrypted variables: GitHub Actions secrets, GitLab CI variables, Jenkins credentials store. These work but create compounding risks:

AKS Identity and Security: Entra ID, Workload Identity, and Policy

AKS Identity and Security#

AKS identity operates at three levels: who can access the cluster API (authentication), what they can do inside it (authorization), and how pods authenticate to Azure services (workload identity). Each level has Azure-specific mechanisms that replace or extend vanilla Kubernetes patterns.

Entra ID Integration (Azure AD)#

AKS supports two Entra ID integration modes.

AKS-managed Azure AD: Enable with --enable-aad at cluster creation. AKS handles the app registrations and token validation. This is the recommended approach.

Azure Terraform Patterns: Resource Groups, AKS, Managed Identity, and Common Gotchas

Azure Terraform Patterns#

Azure’s Terraform provider (azurerm) has its own idioms, naming conventions, and gotchas that differ significantly from AWS. The biggest differences: everything lives in a Resource Group, identity management uses Managed Identity (not IAM roles), and many services require explicit Private DNS Zone configuration for private networking.

Resource Groups: Azure’s Organizational Unit#

Every Azure resource belongs to a Resource Group. This is the first thing you create and the last thing you delete.

Cloud Behavioral Divergence Guide: Where AWS, Azure, and GCP Actually Differ

Cloud Behavioral Divergence Guide#

Running the “same” workload on AWS, Azure, and GCP does not produce the same behavior. The Kubernetes API is portable, application containers are portable, and SQL queries are portable. Everything else – identity, networking, storage, load balancing, DNS, and managed service behavior – diverges in ways that matter for production reliability.

This guide documents the specific divergence points with practical examples. Use it when translating infrastructure from one cloud to another, when debugging behavior that differs between environments, or when assessing migration risk.

GCP Terraform Patterns: Projects, GKE, Workload Identity, Cloud SQL, and Common Gotchas

GCP Terraform Patterns#

GCP’s Terraform provider (google and google-beta) has patterns distinct from both AWS and Azure. The biggest differences: APIs must be explicitly enabled per project, IAM uses a binding model (not inline policies), and GKE requires secondary IP ranges for VPC-native networking. GCP resources also tend to have longer creation times with more eventual consistency.

Projects and API Enablement#

Before creating any resource in GCP, the corresponding API must be enabled in the project. This is the most common source of first-time failures.

GKE Security and Identity

GKE Security and Identity#

GKE security covers identity (who can do what), workload isolation (sandboxing untrusted code), supply chain integrity (ensuring only trusted images run), and data protection (encryption at rest). These features layer on top of standard Kubernetes RBAC and network policies.

Workload Identity Federation#

Workload Identity Federation is the successor to the original Workload Identity. It removes the need for a separate workload-pool flag and uses the standard GCP IAM federation model. The concept is the same: bind a Kubernetes service account to a Google Cloud service account so pods get GCP credentials without exported keys.

GKE Setup and Configuration

GKE Setup and Configuration#

GKE is Google’s managed Kubernetes service. The two major decisions when creating a cluster are the mode (Standard vs Autopilot) and the networking model (VPC-native is now the default and the only option for new clusters). Everything else – node pools, release channels, Workload Identity – layers on top of those choices.

Standard vs Autopilot#

Standard mode gives you full control over node pools, machine types, and node configuration. You manage capacity, pay per node (whether pods are using the resources or not), and can run DaemonSets, privileged containers, and host-network pods.

Service Account Security: Tokens, RBAC Binding, and Workload Identity

Service Account Security#

Every pod in Kubernetes runs as a service account. By default, that is the default service account in the pod’s namespace, with an auto-mounted API token that never expires. This default configuration is overly permissive for most workloads. Hardening service accounts is one of the highest-impact security improvements you can make in a Kubernetes cluster.

The Default Problem#

When a pod starts without specifying a service account, Kubernetes does three things: