Prometheus and Grafana on Minikube: Production-Like Monitoring Without the Cost

Why Monitor a POC Cluster

Monitoring on minikube serves two purposes. First, it catches resource problems early: your app might pass its tests but get OOM-killed under load, and without metrics you will not know why. Second, it validates your monitoring configuration before you deploy it to production. If your ServiceMonitors, dashboards, and alert rules work on minikube, they should carry over to EKS or GKE with little or no change.
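As a concrete example of the configuration being validated, here is a minimal ServiceMonitor sketch. The app name, labels, and port name are hypothetical, and the release label assumes Prometheus was installed by a Helm release called kube-prometheus-stack that selects ServiceMonitors by that label; adjust it to match your own install.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp                          # hypothetical app name
  namespace: default
  labels:
    release: kube-prometheus-stack     # assumption: matches your Helm release name
spec:
  selector:
    matchLabels:
      app: myapp                       # must match the labels on the app's Service
  endpoints:
    - port: http-metrics               # named port on the Service (hypothetical)
      path: /metrics
      interval: 30s
```

The same manifest can be applied unchanged to a cloud cluster, which is what makes the local test meaningful.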

The Right Chart: kube-prometheus-stack

There are multiple Prometheus-related Helm charts. Use the right one:

ArgoCD on Minikube: GitOps Deployments from Day One

Why GitOps on a POC Cluster

Setting up ArgoCD on minikube is not about automating deployments for a local cluster; for that you could just run kubectl apply. The point is to prove the deployment workflow before production. If your Git repo structure, Helm values, and sync policies work on minikube, they should carry over to EKS or GKE with few changes. If you skip this and bolt on GitOps later, you risk spending days restructuring your repo and debugging sync failures under production pressure.
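The pieces named above (repo structure, Helm values, sync policy) all meet in a single ArgoCD Application manifest. The sketch below is illustrative only: the repository URL, chart path, values file name, and namespaces are hypothetical placeholders.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp                          # hypothetical
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/deployments.git   # hypothetical repo
    targetRevision: main
    path: charts/myapp                 # hypothetical chart location in the repo
    helm:
      valueFiles:
        - values-minikube.yaml         # swap for a production values file later
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated:
      prune: true                      # delete resources removed from Git
      selfHeal: true                   # revert manual drift in the cluster
    syncOptions:
      - CreateNamespace=true
```

Moving to a cloud cluster should then mostly be a matter of changing the values file and the destination.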

Builder Pool Naming: The (role, tier, replica) Coordinate Decouples Identity From Model

Builder Pool Naming: The (role, tier, replica) Coordinate

Naming agent pools after the model they run today (kimi-N, deepseek-N, flash-N, lite-N) felt natural when each pool ran one model. It stopped feeling natural the third time a pool’s model churned: the lite tier swapped through qwen → gemma → gemini in six weeks, and every rename cascaded through K8s manifests, secret names, MM bot accounts, Gitea identities, and Helm values. The fix was to make pool names model-independent: builder-lite-0 runs whatever model the pool config says it runs today.
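A sketch of what such a pool config might look like follows. The schema is entirely hypothetical (the key names pools, role, tier, replicas, and model are illustrative, not taken from the actual values files); the point is that the model appears in exactly one field and never in a name.

```yaml
# Hypothetical pool config: identity is (role, tier, replica); model is just a setting.
pools:
  - role: builder
    tier: lite
    replicas: 2        # yields builder-lite-0 and builder-lite-1
    model: gemini      # the only line that changes when the model churns
```

With this shape, a model swap is a one-line change, and secrets, bot accounts, and Git identities keyed on builder-lite-0 stay untouched.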

Golden Paths and Paved Roads

What Golden Paths Are

A golden path is a pre-built, opinionated workflow that gets a developer from zero to a production-ready artifact with minimal decisions. The term comes from Spotify’s internal platform work. Netflix calls them “paved roads.” The idea is the same: provide a well-maintained, well-tested default path that handles 80% of use cases, while allowing teams to go off-road when they have legitimate reasons.

A golden path is not a mandate. It is a recommendation backed by automation. Create a new Go microservice using the golden path and you get a repository with CI/CD, Kubernetes manifests, observability, and a Backstage catalog entry — working in minutes. The golden path removes the 40+ decisions a developer would otherwise need to make.
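For illustration, the Backstage catalog entry generated by such a path might look like the following catalog-info.yaml. The service name, owner, and repository slug are hypothetical.

```yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: orders-service                 # hypothetical service name
  description: Go microservice scaffolded from the golden path
  annotations:
    github.com/project-slug: example-org/orders-service   # hypothetical repo
spec:
  type: service
  lifecycle: production
  owner: team-orders                   # hypothetical owning team
```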

Running Temporal Server on Minikube

This guide deploys Temporal Server on a local Minikube cluster with PostgreSQL persistence. By the end, you will have the Temporal frontend, Web UI, and CLI all working against a real Kubernetes deployment.

If you need background on what Temporal is, start with Introduction to Temporal.

Prerequisites

| Tool | Minimum Version | Purpose |
|------|-----------------|---------|
| minikube | 1.32+ | Local Kubernetes cluster |
| kubectl | 1.28+ | Kubernetes CLI |
| helm | 3.14+ | Package manager for Kubernetes |
| temporal | 1.0+ | Temporal CLI |
| docker | 24+ | Container runtime (minikube driver) |

Your machine needs at least 4 CPU cores and 8 GB RAM available to Docker. For minikube driver details, see Minikube Setup and Drivers and Minikube Docker Driver.

Temporal High Availability: Multi-Component Cluster on Kubernetes

Temporal High Availability

A single-replica Temporal deployment works for development, but any pod going down takes the workflow engine offline. This guide configures a multi-replica cluster with proper resource allocation, Elasticsearch visibility, and health monitoring.

For the single-replica setup this builds on, see Running Temporal Server on Minikube.

Why HA Matters

| Component | What Breaks When It Goes Down |
|-----------|-------------------------------|
| Frontend | No client can start, signal, query, or cancel workflows. Workers cannot poll. |
| History | Running workflows stall. No state transitions. Timers do not fire. |
| Matching | Tasks queue up but never dispatch. Workflows appear frozen. |
| Worker | Internal system workflows stop (archival, replication). Application workflows unaffected. |

With multiple replicas, losing a pod triggers a brief rebalance (seconds), not an outage.
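In Helm terms, going multi-replica might look like the values sketch below. The key layout is an assumption based on the temporalio Helm chart (a per-service replicaCount under server), and the replica numbers are illustrative, so check both against the chart version you deploy.

```yaml
# Assumed values layout; verify key names against your chart version.
server:
  replicaCount: 1        # chart-wide default, overridden per service below
  frontend:
    replicaCount: 2
  history:
    replicaCount: 3
  matching:
    replicaCount: 2
  worker:
    replicaCount: 2
```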

Azure DevOps Pipelines: YAML Pipelines, Templates, Service Connections, and AKS Integration

Azure DevOps Pipelines

Azure DevOps Pipelines uses YAML files stored in your repository to define build and deployment workflows. The pipeline model has three levels: stages contain jobs, jobs contain steps. This hierarchy maps directly to how you think about CI/CD – build stage, test stage, deploy-to-staging stage, deploy-to-production stage – with each stage containing one or more parallel jobs.

Pipeline Structure

A complete pipeline in azure-pipelines.yml:

trigger:
  branches:
    include:
      - main
      - release/*
  paths:
    exclude:
      - docs/**
      - README.md

pool:
  vmImage: 'ubuntu-latest'

variables:
  - group: common-vars
  - name: buildConfiguration
    value: 'Release'

stages:
  - stage: Build
    jobs:
      - job: BuildApp
        steps:
          - task: GoTool@0
            inputs:
              version: '1.22'
          - script: |
              go build -o $(Build.ArtifactStagingDirectory)/myapp ./cmd/myapp
            displayName: 'Build binary'
          - publish: $(Build.ArtifactStagingDirectory)
            artifact: drop

  - stage: Test
    dependsOn: Build
    jobs:
      - job: UnitTests
        steps:
          - task: GoTool@0
            inputs:
              version: '1.22'
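          - script: go test ./... -v -coverprofile=coverage.out
            displayName: 'Run tests'
          # The publish task expects Cobertura or JaCoCo XML, not a Go cover profile,
          # so convert first (gocover-cobertura is one commonly used converter).
          - script: |
              go run github.com/boumenot/gocover-cobertura@latest < coverage.out > coverage.xml
            displayName: 'Convert coverage to Cobertura'
          - task: PublishCodeCoverageResults@2
            inputs:
              summaryFileLocation: coverage.xml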

  - stage: DeployStaging
    dependsOn: Test
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployToStaging
        environment: staging
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: drop
                - script: echo "Deploying to staging"

trigger controls which branches and paths start a pipeline run. dependsOn sets the stage ordering. condition adds logic: succeeded() checks that the stages this one depends on passed, and you can combine it with variable checks to restrict certain stages to specific branches.
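As a further sketch of combining succeeded() with a branch check, the fragment below adds a stage that runs only for release/* branches. The stage, job, and environment names are hypothetical. It depends on Test rather than DeployStaging because DeployStaging is skipped on release branches, and a skipped dependency would normally cause this stage to be skipped too.

```yaml
  # Appended under the existing `stages:` list.
  - stage: DeployProduction            # hypothetical stage name
    dependsOn: Test
    condition: and(succeeded(), startsWith(variables['Build.SourceBranch'], 'refs/heads/release/'))
    jobs:
      - deployment: DeployToProduction
        environment: production        # hypothetical environment
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: drop
                - script: echo "Deploying to production"
```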

Multiple Temporal Servers on Minikube: Multi-Cluster Setup

Multiple Temporal Servers on Minikube

Running two independent Temporal Server instances locally lets you develop and test cross-cluster patterns – worker bridges, namespace replication, and multi-region failover – without cloud infrastructure. This article walks through deploying two Temporal clusters on minikube using profiles and connecting them over Docker networking.

All configuration files and Makefile targets reference the companion repository at github.com/statherm/temporal-examples in the multi-cluster/ directory.

Why Multiple Clusters?

A single Temporal cluster handles most use cases. You need multiple clusters when:

Crossplane for Platform Abstractions

What Crossplane Does

Crossplane extends Kubernetes to provision and manage cloud infrastructure using the Kubernetes API. Instead of writing Terraform and running apply, you write Kubernetes manifests and kubectl apply them. Crossplane controllers reconcile the desired state with the actual cloud resources.

The real value is not replacing Terraform — it is building abstractions. Platform teams define custom resource types (like DatabaseClaim) that developers consume without knowing whether they are getting RDS, CloudSQL, or Azure Database. The composition layer maps the simple claim to the actual cloud resources.
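From the developer's side, a claim might look like the sketch below. The API group, parameter names, and labels are hypothetical: they depend on the XRD and Compositions your platform team defines. Only compositionSelector and writeConnectionSecretToRef are standard claim fields.

```yaml
apiVersion: platform.example.org/v1alpha1   # hypothetical API group defined by the platform team
kind: DatabaseClaim
metadata:
  name: orders-db
  namespace: orders
spec:
  parameters:                # hypothetical schema exposed by the XRD
    engine: postgres
    size: small
  compositionSelector:
    matchLabels:
      provider: aws          # picks the RDS-backed Composition; other labels could select CloudSQL or Azure
  writeConnectionSecretToRef:
    name: orders-db-conn     # Secret the application reads its credentials from
```

The developer never sees the RDS instance, subnet group, or parameter group that the Composition creates behind this claim.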

Self-Hosted CI Runners at Scale: GitHub Actions Runner Controller, GitLab Runners on K8s, and Autoscaling

Self-Hosted CI Runners at Scale

GitHub-hosted and GitLab SaaS runners work until they do not. You hit limits when you need private network access to deploy to internal infrastructure, specific hardware like GPUs or ARM64 machines, compliance requirements that prohibit running code on shared infrastructure, or cost control when you are burning thousands of dollars per month on hosted runner minutes.

Self-hosted runners solve these problems but introduce operational complexity: you now own runner provisioning, scaling, security, image updates, and cost management.
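To give a sense of what owning scaling looks like with GitHub's Actions Runner Controller, here is a minimal values sketch for a runner scale set. The organization URL and secret name are hypothetical, and the limits are illustrative rather than recommendations.

```yaml
# Values sketch for the gha-runner-scale-set Helm chart.
githubConfigUrl: https://github.com/example-org   # hypothetical organization
githubConfigSecret: arc-github-app                # name of a pre-created Secret holding GitHub credentials
minRunners: 0     # scale to zero when idle to control cost
maxRunners: 20    # upper bound on concurrent runner pods
```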