Operational Pitfalls: Running Local LLMs Alongside Dev Clusters

Decision-first: One model per GPU (cloud-main + local-wake-filter for multi-model); unload-and-verify before every load; never lower the Docker Desktop VM cap; tunnel to loopback to dodge macOS Local Network Privacy; serialize loads and don’t download during inference.

Scope & freshness: Apple-Silicon Mac + minikube/Docker Desktop and a single-GPU LLM host (GB10), as of 2026-05-25. Incident patterns are durable; specific recovery commands assume kubectl/minikube/Docker Desktop.

A field runbook of failure modes seen running local LLMs next to development Kubernetes clusters. Each is a real incident pattern, not a hypothetical. (This whole doc is effectively a “what didn’t work” catalog — that’s the point.)
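The "unload-and-verify before every load" and "serialize loads" rules from the summary above can be sketched as a small gate. This is a minimal illustration, not the runbook's actual tooling: `LoadGate` and its three callables (`unload`, `resident_models`, `load`) are hypothetical hooks that you would wire to whatever your serving stack exposes, and the timeout values are arbitrary.

```python
import threading
import time
from typing import Callable, List


class LoadGate:
    """Serialize model loads and verify the previous model is gone first.

    The three callables are hypothetical hooks into your serving stack;
    nothing here talks to a real server.
    """

    def __init__(
        self,
        unload: Callable[[], None],
        resident_models: Callable[[], List[str]],
        load: Callable[[str], None],
        verify_timeout_s: float = 30.0,
        poll_s: float = 0.5,
    ) -> None:
        self._unload = unload
        self._resident_models = resident_models
        self._load = load
        self._verify_timeout_s = verify_timeout_s
        self._poll_s = poll_s
        self._lock = threading.Lock()  # only one load may be in flight

    def swap_to(self, model: str) -> None:
        with self._lock:
            # Step 1: ask the server to drop whatever is loaded.
            self._unload()
            # Step 2: verify nothing is still resident before loading.
            deadline = time.monotonic() + self._verify_timeout_s
            while self._resident_models():
                if time.monotonic() >= deadline:
                    raise TimeoutError(
                        "previous model still resident; refusing to load"
                    )
                time.sleep(self._poll_s)
            # Step 3: load the new model only after verification passed.
            self._load(model)
```

Raising instead of loading anyway is the design choice here: a load stacked on a model that never released its memory is the failure the rule exists to prevent. The sketch does not cover the "don't download during inference" rule.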

Serving LLMs on an Apple Silicon Mac That Also Runs a Dev Cluster

Decision-first: A Mac running a dev cluster is a lite-tier LLM host only (~8 GB models). It can’t hold even one large (~24 GB-resident) model alongside the cluster. Standardize on GGUF (Ollama can’t do MLX); don’t lower the Docker VM cap to “free RAM.”

Scope & freshness: 64 GB Apple-Silicon Mac running minikube/Docker Desktop, as of 2026-05-25. Numbers scale with your RAM and cluster size — re-measure, but the shape (cluster + one big model exhausts the box) holds.
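The "cluster + one big model exhausts the box" shape is simple subtraction, sketched below. Only the 64 GB total and the ~8 GB and ~24 GB model sizes come from the text; the Docker VM cap and the host reserve are placeholder assumptions chosen for illustration, not measurements, so substitute your own numbers.

```python
def headroom_gb(total_ram_gb: float, docker_vm_cap_gb: float,
                host_reserve_gb: float) -> float:
    """RAM left for a model after the cluster VM and the host's own needs."""
    return total_ram_gb - docker_vm_cap_gb - host_reserve_gb


def fits(model_resident_gb: float, headroom: float) -> bool:
    """True if the model's resident size fits in the remaining headroom."""
    return model_resident_gb <= headroom


TOTAL_RAM_GB = 64              # from the text
ASSUMED_DOCKER_VM_CAP_GB = 40  # assumption: replace with your Docker Desktop setting
ASSUMED_HOST_RESERVE_GB = 12   # assumption: macOS, IDE, browser; re-measure

room = headroom_gb(TOTAL_RAM_GB, ASSUMED_DOCKER_VM_CAP_GB, ASSUMED_HOST_RESERVE_GB)

for label, size_gb in (("lite-tier", 8), ("large", 24)):
    verdict = "fits" if fits(size_gb, room) else "does not fit"
    print(f"{label} model (~{size_gb} GB): {verdict} in {room} GB of headroom")
```

Under these assumed inputs the lite-tier model fits and the large one does not. The calculation treats the VM cap as fixed, consistent with the advice not to lower it to "free RAM."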

Running Kubernetes on Apple Silicon: Setup, Gotchas, Recovery

A minikube cluster on Apple Silicon looks like a pure Kubernetes problem until the first Docker Desktop crash. The failure modes that bite hardest on M-series Macs live one layer below the cluster: in Docker Desktop’s memory allocator, in QEMU’s address-space layout, and in the destructive default of minikube delete. The standard minikube setup guide mentions none of them, and all three will eat real workload state when they fire. This is the operational layer that sits above minikube setup, drivers, and ARM64 K8s images: the host-side discipline that keeps the cluster alive.

Single-Node Kubernetes Disaster Recovery: Backups That Survive a Wiped Docker VM

A single-node minikube cluster on Docker Desktop runs the entire control plane, kubelet, every PVC, every Secret, and the container image cache inside one VM whose disk is a single file: ~/Library/Containers/com.docker.docker/Data/vms/0/data/Docker.raw on macOS. When that file is lost or corrupted, every piece of cluster state goes with it in a single event. There is no “node failure vs storage failure” distinction to design around, so any backup strategy that assumes the two can fail separately does not apply.
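Because everything lives in one file, a backup is only useful if it lands somewhere that file's loss cannot reach. The sketch below builds an export plan of `kubectl get ... -o yaml` commands and checks that the destination is not under Docker Desktop's own container directory. The function names, the `EXPORT_KINDS` selection, and the destination path are hypothetical, and the plan is only constructed here, not executed.

```python
from pathlib import Path
from typing import List, Tuple

# Docker Desktop's container directory and VM disk, as quoted in the text (macOS).
DOCKER_CONTAINER_DIR = Path.home() / "Library/Containers/com.docker.docker"
DOCKER_RAW = DOCKER_CONTAINER_DIR / "Data/vms/0/data/Docker.raw"

# Hypothetical selection of resource kinds; extend to match your cluster.
EXPORT_KINDS = ("secrets", "configmaps", "persistentvolumeclaims", "deployments")


def is_outside_docker_data(dest: Path) -> bool:
    """True if dest does not live under Docker Desktop's container directory."""
    root = DOCKER_CONTAINER_DIR.resolve()
    target = dest.resolve()
    return target != root and root not in target.parents


def export_plan(dest: Path) -> List[Tuple[List[str], Path]]:
    """Pair each kubectl export command with the host file that receives its output.

    This captures API objects only; the contents of volumes need a separate copy.
    """
    if not is_outside_docker_data(dest):
        raise ValueError(f"{dest} is inside Docker Desktop's data; pick another location")
    return [
        (["kubectl", "get", kind, "--all-namespaces", "-o", "yaml"], dest / f"{kind}.yaml")
        for kind in EXPORT_KINDS
    ]
```

Each `(argv, path)` pair could be passed to `subprocess.run(argv, stdout=open(path, "w"), check=True)` from a scheduled job on the host. Exported Secrets are only base64-encoded, so the destination needs the same protection as the cluster itself.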