Minikube Production Profile: Configuring a Local Cluster That Behaves Like Production

Why Production Parity Matters Locally#

The most expensive bugs are the ones you find after deploying to production. A minikube cluster with default settings lacks ingress, metrics, and resource enforcement – so your app works locally and breaks in staging. The goal is to configure minikube so that anything that works on it has a high probability of working on a real cluster.

Choosing Your Local Kubernetes Tool#

Before configuring minikube, decide if it is the right tool.

Minikube Application Deployment Patterns: Production-Ready Manifests for Four Common Workloads

Choosing the Right Workload Type#

Every application fits one of four deployment patterns. Choosing the wrong one creates problems that are hard to fix later – a database deployed as a Deployment loses data on reschedule, a batch job deployed as a Deployment wastes resources running 24/7.

PatternKubernetes ResourceUse When
Stateless web appDeployment + Service + IngressHTTP APIs, frontends, microservices
Stateful appStatefulSet + Headless Service + PVCDatabases, caches with persistence, message brokers
Background workerDeployment (no Service)Queue consumers, event processors, stream readers
Batch processingCronJobScheduled reports, data cleanup, periodic syncs

Pattern 1: Stateless Web App#

A web API that can be scaled horizontally with no persistent state. Any pod can handle any request.

Running Temporal Server on Minikube

Running Temporal Server on Minikube#

This guide deploys Temporal Server on a local Minikube cluster with PostgreSQL persistence. By the end, you will have the Temporal frontend, Web UI, and CLI all working against a real Kubernetes deployment.

If you need background on what Temporal is, start with Introduction to Temporal.

Prerequisites#

ToolMinimum VersionPurpose
minikube1.32+Local Kubernetes cluster
kubectl1.28+Kubernetes CLI
helm3.14+Package manager for Kubernetes
temporal1.0+Temporal CLI
docker24+Container runtime (minikube driver)

Your machine needs at least 4 CPU cores and 8 GB RAM available to Docker. For minikube driver details, see Minikube Setup and Drivers and Minikube Docker Driver.

Multiple Temporal Servers on Minikube: Multi-Cluster Setup

Multiple Temporal Servers on Minikube#

Running two independent Temporal Server instances locally lets you develop and test cross-cluster patterns – worker bridges, namespace replication, and multi-region failover – without cloud infrastructure. This article walks through deploying two Temporal clusters on minikube using profiles and connecting them over Docker networking.

All configuration files and Makefile targets reference the companion repository at github.com/statherm/temporal-examples in the multi-cluster/ directory.

Why Multiple Clusters?#

A single Temporal cluster handles most use cases. You need multiple clusters when:

Building a Temporal Worker Bridge: Cluster A Jobs Executed in Cluster B

Building a Temporal Worker Bridge#

The architecture article evaluated three cross-cluster communication patterns and identified the worker bridge as the best fit for most open-source Temporal deployments. This article builds the bridge.

The worker bridge is a single binary that holds connections to two Temporal clusters. It polls Cluster A for tasks on a dedicated queue and executes those tasks using Cluster B’s resources – its Temporal client, databases, APIs, and services. From Cluster A’s perspective, the bridge is just another worker. From Cluster B’s perspective, the bridge is just another client starting workflows.

Operational Pitfalls: Running Local LLMs Alongside Dev Clusters

Decision-first: One model per GPU (cloud-main + local-wake-filter for multi-model); unload-and-verify before every load; never lower the Docker Desktop VM cap; tunnel to loopback to dodge macOS Local Network Privacy; serialize loads and don’t download during inference.

Scope & freshness: Apple-Silicon Mac + minikube/Docker Desktop and a single-GPU LLM host (GB10), as of 2026-05-25. Incident patterns are durable; specific recovery commands assume kubectl/minikube/Docker Desktop.

A field runbook of failure modes seen running local LLMs next to development Kubernetes clusters. Each is a real incident pattern, not a hypothetical. (This whole doc is effectively a “what didn’t work” catalog — that’s the point.)

Serving LLMs on an Apple Silicon Mac That Also Runs a Dev Cluster

Decision-first: A Mac running a dev cluster is a lite-tier LLM host only (~8 GB models). It can’t hold even one large (~24 GB-resident) model alongside the cluster. Standardize on GGUF (Ollama can’t do MLX); don’t lower the Docker VM cap to “free RAM.”

Scope & freshness: 64 GB Apple-Silicon Mac running minikube/Docker Desktop, as of 2026-05-25. Numbers scale with your RAM and cluster size — re-measure, but the shape (cluster + one big model exhausts the box) holds.

Building ARM64 Container Images When Upstream Doesn't Ship Them

A pod is CrashLoopBackOff with no application stack trace. The container manifest’s image field references an upstream tag that “should” work. The Docker pull succeeded. The container starts, exits with no log output, and restarts. The cause is almost always architecture: the image was published linux/amd64 only, the host is ARM64 (Apple Silicon, Graviton, Ampere), and the runtime is silently emulating — or failing to emulate — the binary. When upstream publishes ARM64 source artifacts but no ARM64 image, the fix is to build one.

Minikube docker-env: Building Images Directly into the Cluster Runtime

eval $(minikube docker-env) repoints the shell’s Docker client at the daemon running inside the minikube VM. A docker build afterwards lands the image directly in the cluster’s container store, so pods can pull it without a registry. The pattern is correct but unforgiving: every failure mode looks like a different problem (image pull error, runtime crash, stale pod) and only a handful of them actually point back to the env-var setup.

Running 7 Helm-Managed Services on One Kubernetes Cluster: A Cross-Cutting Survey

A single-node Kubernetes cluster running seven Helm-managed services concurrently — Gitea, Mattermost, PostgreSQL, kube-prometheus-stack, Jenkins, Temporal, and NATS — looks tractable on paper. The charts are all upstream-maintained. The hardware is modest but adequate. The operational reality is that zero of the seven ran cleanly on out-of-the-box values. Every chart needed at least one customization to coexist with the others, and several needed substantial rewrites of the helm-values surface. This survey catalogs what those customizations are, why each was necessary, and what the common failure modes look like across the fleet.