Secrets Management in CI/CD Pipelines: OIDC, Vault Integration, and Credential Hygiene

Secrets Management in CI/CD Pipelines#

Every CI/CD pipeline needs credentials: registry tokens, cloud provider keys, database passwords, API keys for third-party services. How you store, deliver, and scope those credentials determines whether a single compromised pipeline job can escalate into a full infrastructure breach. The difference between a mature and an immature pipeline is rarely in the build steps – it is in the secrets management.

The Problem with Static Secrets#

The default approach on every CI platform is storing secrets as encrypted variables: GitHub Actions secrets, GitLab CI variables, Jenkins credentials store. These work but create compounding risks:

Self-Hosted CI Runners at Scale: GitHub Actions Runner Controller, GitLab Runners on K8s, and Autoscaling

Self-Hosted CI Runners at Scale#

GitHub-hosted and GitLab SaaS runners work until they do not. You hit limits when you need private network access to deploy to internal infrastructure, specific hardware like GPUs or ARM64 machines, compliance requirements that prohibit running code on shared infrastructure, or cost control when you are burning thousands of dollars per month on hosted runner minutes.

Self-hosted runners solve these problems but introduce operational complexity: you now own runner provisioning, scaling, security, image updates, and cost management.

CI/CD Anti-Patterns and Migration Strategies: From Snowflakes to Scalable Pipelines

CI/CD Anti-Patterns and Migration Strategies#

CI/CD pipelines accumulate technical debt faster than application code. Nobody refactors a Jenkinsfile. Nobody reviews pipeline YAML with the same rigor as production code. Over time, pipelines become slow, fragile, inconsistent, and actively hostile to developer productivity. Recognizing the anti-patterns is the first step. Migrating to better tooling is often the second.

Anti-Pattern: Snowflake Pipelines#

Every repository has a unique pipeline that someone wrote three years ago and nobody fully understands. Repository A uses Makefile targets, B uses bash scripts, C calls Python, and D has inline shell commands across 40 pipeline steps. There is no shared structure, no reusable components, and no way to make organization-wide changes.

Advanced GitHub Actions Patterns: Matrix Builds, OIDC, Composite Actions, and Self-Hosted Runners

Advanced GitHub Actions Patterns#

Once you understand the basics of GitHub Actions, these patterns solve the real-world problems: testing across multiple environments, authenticating to cloud providers without static secrets, building reusable action components, and scaling runners.

Matrix Builds#

Test across multiple OS versions, language versions, or configurations in parallel:

jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest]
        go-version: ['1.22', '1.23']
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ matrix.go-version }}
      - run: go test ./...

This creates 4 jobs (2 OS x 2 Go versions) running in parallel. Set fail-fast: false so a failure in one combination does not cancel the others – you want to see all failures at once.

Building a Kubernetes Deployment Pipeline: From Code Push to Production

Building a Kubernetes Deployment Pipeline: From Code Push to Production#

A deployment pipeline connects a code commit to a running container in your cluster. This operational sequence walks through building one end-to-end, with decision points at each phase and verification steps to confirm the pipeline works before moving on.

Phase 1 – Source Control and CI#

Step 1: Repository Structure#

Every deployable service needs three things alongside its application code: a Dockerfile, deployment manifests, and a CI pipeline definition.

Debugging GitHub Actions: Triggers, Failures, Secrets, Caching, and Performance

Debugging GitHub Actions#

When a GitHub Actions workflow fails or does not behave as expected, the problem falls into a few predictable categories. This guide covers each one with the diagnostic steps and fixes.

Workflow Not Triggering#

The most common GitHub Actions “bug” is a workflow that never runs.

Check the event and branch filter. A push trigger with branches: [main] will not fire for pushes to feature/xyz. A pull_request trigger fires for the PR’s head branch, not the base branch:

GitHub Actions Fundamentals: Workflows, Triggers, Jobs, and Data Passing

GitHub Actions Fundamentals#

GitHub Actions is CI/CD built into GitHub. Workflows are YAML files in .github/workflows/. They run on GitHub-hosted or self-hosted machines in response to repository events. No external CI server required.

Workflow File Structure#

Every workflow has three levels: workflow (triggers and config), jobs (parallel units of work), and steps (sequential commands within a job).

name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.23'
      - run: go test ./...

  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: golangci/golangci-lint-action@v6

Jobs run in parallel by default. Steps within a job run sequentially. Each job gets a fresh runner – no state carries over between jobs unless you explicitly pass it via artifacts or outputs.

GitHub Actions Kubernetes Pipeline: From Git Push to Helm Deploy

GitHub Actions Kubernetes Pipeline#

This guide builds a complete pipeline: push code, build a container image, validate the Helm chart, and deploy to Kubernetes. Each stage gates the next, so broken images never reach your cluster.

Pipeline Overview#

The pipeline has four stages:

  1. Build and push the container image to GitHub Container Registry (GHCR).
  2. Lint and validate the Helm chart with helm lint and kubeconform.
  3. Deploy to dev automatically on pushes to main.
  4. Promote to staging and production via manual approval.

Complete Workflow File#

# .github/workflows/deploy.yml
name: Build and Deploy

on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        description: "Target environment"
        required: true
        type: choice
        options: [dev, staging, production]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    outputs:
      image-tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=
            type=ref,event=branch

      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

  validate:
    runs-on: ubuntu-latest
    needs: build
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v4

      - name: Helm lint
        run: helm lint ./charts/my-app -f charts/my-app/values.yaml

      - name: Install kubeconform
        run: |
          curl -sL https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz \
            | tar xz -C /usr/local/bin

      - name: Validate rendered templates
        run: |
          helm template my-app ./charts/my-app \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            | kubeconform -strict -summary \
              -kubernetes-version 1.29.0

  deploy-dev:
    runs-on: ubuntu-latest
    needs: [build, validate]
    if: github.ref == 'refs/heads/main'
    environment: dev
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v4

      - name: Set up kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG_DEV }}" | base64 -d > ~/.kube/config
          chmod 600 ~/.kube/config

      - name: Deploy with Helm
        run: |
          helm upgrade --install my-app ./charts/my-app \
            --namespace my-app-dev \
            --create-namespace \
            -f charts/my-app/values-dev.yaml \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            --wait --timeout 300s

      - name: Verify deployment
        run: kubectl rollout status deployment/my-app -n my-app-dev --timeout=120s

  deploy-staging:
    runs-on: ubuntu-latest
    needs: [build, validate, deploy-dev]
    environment: staging
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v4

      - name: Set up kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG_STAGING }}" | base64 -d > ~/.kube/config
          chmod 600 ~/.kube/config

      - name: Deploy with Helm
        run: |
          helm upgrade --install my-app ./charts/my-app \
            --namespace my-app-staging \
            --create-namespace \
            -f charts/my-app/values-staging.yaml \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            --wait --timeout 300s

  deploy-production:
    runs-on: ubuntu-latest
    needs: [build, validate, deploy-staging]
    environment: production
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v4

      - name: Set up kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG_PROD }}" | base64 -d > ~/.kube/config
          chmod 600 ~/.kube/config

      - name: Deploy with Helm
        run: |
          helm upgrade --install my-app ./charts/my-app \
            --namespace my-app-prod \
            --create-namespace \
            -f charts/my-app/values-production.yaml \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            --wait --timeout 300s

Key Design Decisions#

Image Tagging with Git SHA#

The docker/metadata-action generates tags from the git SHA. This creates immutable, traceable image tags – you can always identify exactly which commit produced a given deployment.

Running Terraform in CI/CD Pipelines

The Core Pattern#

The standard CI/CD pattern for Terraform: run terraform plan on every pull request and post the output as a PR comment. Run terraform apply only when the PR merges to main. This gives reviewers visibility into what will change before approving.

GitHub Actions Workflow#

name: Terraform
on:
  pull_request:
    paths: ["infra/**"]
  push:
    branches: [main]
    paths: ["infra/**"]

permissions:
  id-token: write    # OIDC
  contents: read
  pull-requests: write  # PR comments

jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infra
    steps:
      - uses: actions/checkout@v4

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/terraform-ci
          aws-region: us-east-1

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0
          terraform_wrapper: true  # captures stdout for PR comments

      - name: Terraform Init
        run: terraform init -input=false

      - name: Terraform Format Check
        run: terraform fmt -check -recursive

      - name: Terraform Validate
        run: terraform validate

      - name: Terraform Plan
        id: plan
        run: terraform plan -input=false -no-color -out=tfplan
        continue-on-error: true

      - name: Post Plan to PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const plan = `${{ steps.plan.outputs.stdout }}`;
            const truncated = plan.length > 60000
              ? plan.substring(0, 60000) + "\n\n... truncated ..."
              : plan;
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `#### Terraform Plan\n\`\`\`\n${truncated}\n\`\`\``
            });

      - name: Plan Status
        if: steps.plan.outcome == 'failure'
        run: exit 1

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -input=false tfplan

The plan step uses continue-on-error: true so the PR comment step still runs. A separate step checks the actual plan outcome. Apply only runs on pushes to main.

Choosing a GitOps Tool: ArgoCD vs Flux vs Jenkins vs GitHub Actions for Kubernetes Deployments

Choosing a GitOps Tool#

The term “GitOps” is applied to everything from a simple kubectl apply in a GitHub Actions workflow to a fully reconciled, pull-based deployment architecture with drift detection. These are fundamentally different approaches. Choosing between them depends on your team’s operational maturity, cluster count, and tolerance for running controllers in your cluster.

What GitOps Actually Means#

GitOps, as defined by the OpenGitOps principles (a CNCF sandbox project), has four requirements: declarative desired state, state versioned in git, changes applied automatically, and continuous reconciliation with drift detection. The last two are what separate true GitOps from “CI/CD that uses git.”