GitHub Actions Fundamentals: Workflows, Triggers, Jobs, and Data Passing

GitHub Actions Fundamentals

GitHub Actions is CI/CD built into GitHub. Workflows are YAML files in .github/workflows/. They run on GitHub-hosted or self-hosted machines in response to repository events. No external CI server required.

Workflow File Structure

Every workflow has three levels: workflow (triggers and config), jobs (parallel units of work), and steps (sequential commands within a job).

name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.23'
      - run: go test ./...

  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: golangci/golangci-lint-action@v6

Jobs run in parallel by default. Steps within a job run sequentially. Each job gets a fresh runner – no state carries over between jobs unless you explicitly pass it via artifacts or outputs.
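
As a rough sketch of the outputs route, the workflow below passes a value from one job to another. The workflow, job, step, and output names (version, report, sha, short-sha) are made up for illustration.

```yaml
name: Pass data between jobs

on: push

jobs:
  version:
    runs-on: ubuntu-latest
    outputs:
      # Re-export the step output at the job level so other jobs can read it
      short-sha: ${{ steps.sha.outputs.short }}
    steps:
      - uses: actions/checkout@v4
      - id: sha
        # Appending key=value to $GITHUB_OUTPUT defines a step output
        run: echo "short=$(git rev-parse --short HEAD)" >> "$GITHUB_OUTPUT"

  report:
    runs-on: ubuntu-latest
    # needs orders the jobs and makes the upstream outputs available
    needs: version
    steps:
      - run: echo "Built from ${{ needs.version.outputs.short-sha }}"
```

Outputs suit small strings such as versions or tags. Files (binaries, reports) are better passed as artifacts.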

GitHub Actions Kubernetes Pipeline: From Git Push to Helm Deploy

GitHub Actions Kubernetes Pipeline

This guide builds a complete pipeline: push code, build a container image, validate the Helm chart, and deploy to Kubernetes. Each stage gates the next, so broken images never reach your cluster.

Pipeline Overview

The pipeline has four stages:

  1. Build and push the container image to GitHub Container Registry (GHCR).
  2. Lint and validate the Helm chart with helm lint and kubeconform.
  3. Deploy to dev automatically on pushes to main.
  4. Promote to staging and production via manual approval.

Complete Workflow File

# .github/workflows/deploy.yml
name: Build and Deploy

on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        description: "Target environment"
        required: true
        type: choice
        options: [dev, staging, production]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    outputs:
      image-tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
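          # priority=1000 makes the SHA tag outrank the branch tag, so
          # steps.meta.outputs.version (exported as image-tag) is the commit SHA
          tags: |
            type=sha,prefix=,priority=1000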
            type=ref,event=branch

      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

  validate:
    runs-on: ubuntu-latest
    needs: build
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v4

      - name: Helm lint
        run: helm lint ./charts/my-app -f charts/my-app/values.yaml

      - name: Install kubeconform
        run: |
          # -f makes curl fail on HTTP errors; extract only the kubeconform binary
          curl -fsSL https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz \
            | sudo tar xz -C /usr/local/bin kubeconform

      - name: Validate rendered templates
        run: |
          helm template my-app ./charts/my-app \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            | kubeconform -strict -summary \
              -kubernetes-version 1.29.0

  deploy-dev:
    runs-on: ubuntu-latest
    needs: [build, validate]
    if: github.ref == 'refs/heads/main'
    environment: dev
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v4

      - name: Set up kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG_DEV }}" | base64 -d > ~/.kube/config
          chmod 600 ~/.kube/config

      - name: Deploy with Helm
        run: |
          helm upgrade --install my-app ./charts/my-app \
            --namespace my-app-dev \
            --create-namespace \
            -f charts/my-app/values-dev.yaml \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            --wait --timeout 300s

      - name: Verify deployment
        run: kubectl rollout status deployment/my-app -n my-app-dev --timeout=120s

  deploy-staging:
    runs-on: ubuntu-latest
    needs: [build, validate, deploy-dev]
    environment: staging
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v4

      - name: Set up kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG_STAGING }}" | base64 -d > ~/.kube/config
          chmod 600 ~/.kube/config

      - name: Deploy with Helm
        run: |
          helm upgrade --install my-app ./charts/my-app \
            --namespace my-app-staging \
            --create-namespace \
            -f charts/my-app/values-staging.yaml \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            --wait --timeout 300s

  deploy-production:
    runs-on: ubuntu-latest
    needs: [build, validate, deploy-staging]
    environment: production
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v4

      - name: Set up kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG_PROD }}" | base64 -d > ~/.kube/config
          chmod 600 ~/.kube/config

      - name: Deploy with Helm
        run: |
          helm upgrade --install my-app ./charts/my-app \
            --namespace my-app-prod \
            --create-namespace \
            -f charts/my-app/values-production.yaml \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            --wait --timeout 300s

Key Design Decisions

Image Tagging with Git SHA

The docker/metadata-action step tags each image with the short git SHA (type=sha). This creates immutable, traceable image tags – you can always identify exactly which commit produced a given deployment.

GitLab CI/CD Pipeline Patterns: Stages, DAG Pipelines, Includes, and Registry Integration

GitLab CI/CD Pipeline Patterns

GitLab CI/CD runs pipelines defined in a .gitlab-ci.yml file at the repository root. Every push, merge request, or tag triggers a pipeline consisting of stages that contain jobs. The pipeline configuration is version-controlled alongside your code, so the build process evolves with the application.

Basic .gitlab-ci.yml Structure

A minimal pipeline defines stages and jobs. Stages run sequentially; jobs within the same stage run in parallel:

stages:
  - build
  - test
  - deploy

build-app:
  stage: build
  image: golang:1.22
  script:
    - go build -o myapp ./cmd/myapp
  artifacts:
    paths:
      - myapp
    expire_in: 1 hour

unit-tests:
  stage: test
  image: golang:1.22
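  script:
    - go test ./... -v -coverprofile=coverage.out
    # Go's coverprofile is not Cobertura XML; convert it before uploading
    - go run github.com/boumenot/gocover-cobertura@latest < coverage.out > coverage.xml
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml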

deploy-staging:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl set image deployment/myapp myapp=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  environment:
    name: staging
    url: https://staging.example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

A job needs a script to run commands; stage is optional, and a job without one lands in the test stage. The image field specifies the Docker image the job runs inside. If omitted, it falls back to the pipeline-level default image or the runner’s default.
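
The image fallback can be sketched like this. The lint job and its golangci/golangci-lint image are assumptions added for illustration; they are not part of the pipeline above.

```yaml
default:
  image: golang:1.22   # used by every job that does not set its own image

stages:
  - build
  - test

build-app:
  stage: build
  # No image here, so the job runs in the default golang:1.22 container
  script:
    - go build -o myapp ./cmd/myapp

lint:
  stage: test
  # A job-level image overrides the pipeline default for this job only
  image: golangci/golangci-lint:latest
  script:
    - golangci-lint run
```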

GitOps and Infrastructure as Code: Reconciliation Patterns for Terraform, ArgoCD, and Crossplane

GitOps and Infrastructure as Code

GitOps says: the desired state is in Git. A controller continuously reconciles the real state to match. Infrastructure as Code says: the desired state is in code. A human (or agent) runs apply to push changes.

These two paradigms overlap but do not align perfectly. Kubernetes resources fit the GitOps model well – ArgoCD/Flux watch Git, detect differences, and apply changes continuously. Cloud infrastructure (VPCs, databases, IAM roles) fits the IaC model better – Terraform tracks state, computes diffs, and applies on command.
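
On the GitOps side, continuous reconciliation might look like the Argo CD Application below. This is a minimal sketch: the repository URL, path, and namespaces are hypothetical placeholders.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/my-app-config.git  # hypothetical repo
    targetRevision: main
    path: k8s/overlays/dev                                  # hypothetical path
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD runs in
    namespace: my-app-dev
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes that drift from Git
```

With automated sync, nobody runs an apply command: the controller keeps comparing the cluster against Git and corrects drift on its own. Terraform, by contrast, only acts when plan and apply are invoked.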

GKE Networking

GKE networking centers on VPC-native clusters, where pods and services get IP addresses from VPC subnet ranges. This integrates Kubernetes networking directly into Google Cloud’s VPC, enabling native routing, firewall rules, and load balancing without extra overlays.

VPC-Native Clusters and Alias IP Ranges

VPC-native clusters use alias IP ranges on the subnet. You allocate two secondary ranges: one for pods, one for services.

# Create subnet with secondary ranges
gcloud compute networks subnets create gke-subnet \
  --network my-vpc \
  --region us-central1 \
  --range 10.0.0.0/20 \
  --secondary-range pods=10.4.0.0/14,services=10.8.0.0/20

# Create cluster using those ranges
gcloud container clusters create my-cluster \
  --region us-central1 \
  --network my-vpc \
  --subnetwork gke-subnet \
  --cluster-secondary-range-name pods \
  --services-secondary-range-name services \
  --enable-ip-alias

The pod range needs to be large. A /14 holds 262,144 addresses, or 1,024 /24 blocks. Each node reserves a /24 from the pod range (256 IPs, sized for the default maximum of 110 pods per node), so 100 nodes consume 100 of those blocks and the range tops out at 1,024 nodes. Undersizing the pod range is a common cause of IP exhaustion – the cluster cannot add nodes even though VMs are available.

GKE Security and Identity

GKE security covers identity (who can do what), workload isolation (sandboxing untrusted code), supply chain integrity (ensuring only trusted images run), and data protection (encryption at rest). These features layer on top of standard Kubernetes RBAC and network policies.

Workload Identity Federation

Workload Identity Federation is the successor to the original Workload Identity. It removes the need for a separate workload-pool flag and uses the standard GCP IAM federation model. The concept is the same: bind a Kubernetes service account to a Google Cloud service account so pods get GCP credentials without exported keys.
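
A minimal sketch of the Kubernetes side of that binding, assuming a hypothetical project my-project, namespace my-app, Kubernetes service account app-ksa, and Google service account app-gsa:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-ksa        # hypothetical Kubernetes service account
  namespace: my-app    # hypothetical namespace
  annotations:
    # Names the Google service account this Kubernetes service account maps to
    iam.gke.io/gcp-service-account: app-gsa@my-project.iam.gserviceaccount.com
```

The annotation alone is not enough. The Google service account also needs an IAM binding that grants roles/iam.workloadIdentityUser to the member serviceAccount:my-project.svc.id.goog[my-app/app-ksa], and pods have to run with serviceAccountName: app-ksa.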

GKE Setup and Configuration

GKE is Google’s managed Kubernetes service. The two major decisions when creating a cluster are the mode (Standard vs Autopilot) and the networking model (VPC-native is now the default and the only option for new clusters). Everything else – node pools, release channels, Workload Identity – layers on top of those choices.

Standard vs Autopilot

Standard mode gives you full control over node pools, machine types, and node configuration. You manage capacity, pay per node (whether pods are using the resources or not), and can run DaemonSets, privileged containers, and host-network pods.

GKE Troubleshooting

GKE adds a layer of Google Cloud infrastructure on top of Kubernetes, which means some problems are pure Kubernetes issues and others are GKE-specific. This guide covers the GKE-specific problems that trip people up.

Autopilot Resource Adjustment

Autopilot automatically mutates pod resource requests to fit its scheduling model. If you request cpu: 100m and memory: 128Mi, Autopilot may bump the request to cpu: 250m and memory: 512Mi. This affects your billing (you pay per resource request) and can cause unexpected OOMKills if the limits were set relative to the original request.

GPU and ML Workloads on Kubernetes: Scheduling, Sharing, and Monitoring

GPU and ML Workloads on Kubernetes

Running GPU workloads on Kubernetes requires hardware-aware scheduling that the default scheduler does not provide out of the box. GPUs are expensive – an NVIDIA A100 node costs $3-12/hour on cloud providers – so efficient utilization matters far more than with CPU workloads. This article covers the full stack from device plugin installation through GPU sharing and monitoring.

The NVIDIA Device Plugin

Kubernetes has no native understanding of GPUs. The NVIDIA device plugin bridges that gap by exposing GPUs as a schedulable resource (nvidia.com/gpu). Without it, the scheduler has no idea which nodes have GPUs or how many are available.
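
Once the plugin is running, a pod asks for a GPU like any other resource. The pod below is an illustrative smoke test; the pod name and the CUDA image tag are assumptions, so substitute an image that matches your driver version.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test   # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed tag
      command: ["nvidia-smi"]   # prints the GPU visible to the container
      resources:
        limits:
          # Extended resource advertised by the NVIDIA device plugin;
          # GPUs are requested in whole units under limits
          nvidia.com/gpu: 1
```

If no node advertises nvidia.com/gpu, the pod stays Pending, which is a quick way to confirm whether the device plugin is working.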

Grafana Dashboards for Kubernetes Monitoring

Data Source Configuration

Grafana connects to backend data stores through data sources. For a complete Kubernetes observability stack, you need three: Prometheus for metrics, Loki for logs, and Tempo for traces.

Provision data sources declaratively so they survive Grafana restarts and are version-controlled:

# grafana/provisioning/datasources/observability.yml
apiVersion: 1
datasources:
  - name: Prometheus
    uid: prometheus  # fixed UID so datasourceUid: prometheus references resolve
    type: prometheus
    access: proxy
    url: http://prometheus-operated:9090
    isDefault: true
    jsonData:
      timeInterval: "15s"
      exemplarTraceIdDestinations:
        - name: traceID
          datasourceUid: tempo

  - name: Loki
    type: loki
    access: proxy
    url: http://loki-gateway:3100
    jsonData:
      derivedFields:
        - name: TraceID
          matcherRegex: '"traceID":"(\w+)"'
          url: "$${__value.raw}"
          datasourceUid: tempo

  - name: Tempo
    uid: tempo  # fixed UID so datasourceUid: tempo references resolve
    type: tempo
    access: proxy
    url: http://tempo:3100
    jsonData:
      tracesToMetrics:
        datasourceUid: prometheus
        tags: [{key: "service.name", value: "job"}]
      serviceMap:
        datasourceUid: prometheus
      nodeGraph:
        enabled: true

The cross-linking configuration lets you click from a metric data point to the trace that generated it, and extract trace IDs from log lines to link to Tempo.