Pipeline Observability: CI/CD Metrics, DORA, OpenTelemetry, and Grafana Dashboards

Pipeline Observability#

You cannot improve what you do not measure. Most teams have detailed monitoring for their production applications but treat their CI/CD pipelines as black boxes. When builds are slow, flaky, or failing, the response is anecdotal – “builds feel slow lately” – rather than data-driven. Pipeline observability turns CI/CD from a cost center you tolerate into infrastructure you actively manage.

Core CI/CD Metrics#

Build Duration#

Total time from pipeline trigger to completion. Track this as a histogram, not an average, because averages hide bimodal distributions. A pipeline that takes 5 minutes for code-only changes and 25 minutes for dependency updates averages 15 minutes, which describes neither case accurately.

CI/CD Anti-Patterns and Migration Strategies: From Snowflakes to Scalable Pipelines

CI/CD Anti-Patterns and Migration Strategies#

CI/CD pipelines accumulate technical debt faster than application code. Nobody refactors a Jenkinsfile. Nobody reviews pipeline YAML with the same rigor as production code. Over time, pipelines become slow, fragile, inconsistent, and actively hostile to developer productivity. Recognizing the anti-patterns is the first step. Migrating to better tooling is often the second.

Anti-Pattern: Snowflake Pipelines#

Every repository has a unique pipeline that someone wrote three years ago and nobody fully understands. Repository A uses Makefile targets, B uses bash scripts, C calls Python, and D has inline shell commands across 40 pipeline steps. There is no shared structure, no reusable components, and no way to make organization-wide changes.

Advanced Git Operations: Rebase, Cherry-Pick, Bisect, and Repository Maintenance

Advanced Git Operations#

These are the commands that separate someone who uses Git from someone who understands it. Each one solves a specific problem that basic Git workflows cannot handle.

Interactive Rebase#

Interactive rebase rewrites commit history. Use it to clean up a branch before merging.

# Rebase the last 4 commits interactively
git rebase -i HEAD~4

This opens an editor with your commits listed oldest-first:

pick a1b2c3d feat: add user export endpoint
pick e4f5g6h WIP: export formatting
pick i7j8k9l fix typo in export
pick m0n1o2p feat: add CSV download button

Change the commands to reshape history:

Advanced GitHub Actions Patterns: Matrix Builds, OIDC, Composite Actions, and Self-Hosted Runners

Advanced GitHub Actions Patterns#

Once you understand the basics of GitHub Actions, these patterns solve the real-world problems: testing across multiple environments, authenticating to cloud providers without static secrets, building reusable action components, and scaling runners.

Matrix Builds#

Test across multiple OS versions, language versions, or configurations in parallel:

jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest]
        go-version: ['1.22', '1.23']
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ matrix.go-version }}
      - run: go test ./...

This creates 4 jobs (2 OS x 2 Go versions) running in parallel. Set fail-fast: false so a failure in one combination does not cancel the others – you want to see all failures at once.

ArgoCD Image Updater: Automatic Image Tag Updates Without Git Commits

ArgoCD Image Updater#

ArgoCD Image Updater watches container registries for new image tags and automatically updates ArgoCD Applications to use them. In a standard GitOps workflow, updating an image tag requires a Git commit that changes the tag in a values file or manifest. Image Updater automates that step.

The Problem It Solves#

Standard GitOps image update flow:

CI builds image → pushes myapp:v1.2.3 to registry
    → Developer (or CI) commits "update image tag to v1.2.3" to Git
    → ArgoCD detects Git change
    → ArgoCD syncs new tag to cluster

That middle step – committing the tag update – is friction. CI pipelines need Git write access, commit messages are noise (“bump image to v1.2.4”, “bump image to v1.2.5”), and the delay between image push and deployment depends on how fast the commit pipeline runs.

ArgoCD Multi-Cluster Management: Hub-Spoke Patterns, Cluster Registration, and Fleet Operations

ArgoCD Multi-Cluster Management#

A single ArgoCD instance can manage deployments across dozens of Kubernetes clusters. This is one of ArgoCD’s strongest features and the standard approach for organizations with multiple environments, regions, or cloud providers.

Hub-Spoke Architecture#

The standard multi-cluster pattern runs ArgoCD on one “hub” cluster that deploys to multiple “spoke” clusters:

Hub Cluster (management)
├── ArgoCD control plane
├── Application/ApplicationSet definitions
├── RBAC policies
└── Cluster credentials (Secrets)
    │
    ├──→ Spoke Cluster: dev (us-east-1)
    ├──→ Spoke Cluster: staging (us-west-2)
    ├──→ Spoke Cluster: prod-us (us-east-1)
    ├──→ Spoke Cluster: prod-eu (eu-west-1)
    └──→ Spoke Cluster: prod-apac (ap-southeast-1)

ArgoCD on the hub cluster connects to each spoke cluster’s API server to apply manifests and check health. The spoke clusters do not need ArgoCD installed.

ArgoCD Notifications: Slack, Teams, Webhooks, and Custom Triggers

ArgoCD Notifications#

ArgoCD Notifications is a built-in component (since ArgoCD 2.5) that monitors applications and sends alerts when specific events occur – sync succeeded, sync failed, health degraded, new version deployed. Before notifications existed, teams polled the ArgoCD UI or built custom watchers. Notifications eliminates that.

Architecture#

ArgoCD Notifications runs as a controller alongside the ArgoCD application controller. It watches Application resources for state changes and matches them against triggers. When a trigger fires, it renders a template and sends it through a configured service (Slack, Teams, webhook, email, etc.).

ArgoCD Patterns: App of Apps, ApplicationSets, Multi-Environment Management, and Source Strategies

ArgoCD Patterns#

Once ArgoCD is running and you have a few applications deployed, you hit a scaling problem: managing dozens or hundreds of Application resources by hand is unsustainable. These patterns solve that.

App of Apps#

The App of Apps pattern uses one ArgoCD Application to manage other Application resources. You create a “root” application that points to a directory containing Application YAML files. When ArgoCD syncs the root app, it creates all the child applications.

ArgoCD Secrets Management: Sealed Secrets, External Secrets Operator, and SOPS

ArgoCD Secrets Management#

GitOps says everything should be in Git. Kubernetes Secrets are base64-encoded, not encrypted. Committing base64 secrets to Git is equivalent to committing plaintext – anyone with repo access can decode them. This is the fundamental tension of GitOps secrets management.

Three approaches solve this, each with different tradeoffs.

Approach 1: Sealed Secrets#

Sealed Secrets encrypts secrets client-side so the encrypted form can be safely committed to Git. Only the Sealed Secrets controller running in-cluster can decrypt them.

ArgoCD Setup and Basics: Installation, CLI, First Application, and Sync Policies

ArgoCD Setup and Basics#

ArgoCD is a declarative GitOps continuous delivery tool for Kubernetes. It watches Git repositories and ensures your cluster state matches what is declared in those repos. When someone changes a manifest in Git, ArgoCD detects the drift and either alerts you or automatically applies the change.

Installation with Plain Manifests#

The fastest path to a running ArgoCD instance:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

This installs the full ArgoCD stack: API server, repo server, application controller, Redis, and Dex (for SSO). For a minimal install without SSO and the UI, use namespace-install.yaml instead, which also scopes ArgoCD to a single namespace.