Builder Pool Naming: The (role, tier, replica) Coordinate Decouples Identity From Model

Naming agent pools after the model they run today (kimi-N, deepseek-N, flash-N, lite-N) felt natural when each pool ran one model. It stopped feeling natural the third time a pool’s model churned: the lite tier swapped through qwen → gemma → gemini in six weeks, and every rename cascaded through K8s manifests, secret names, MM bot accounts, Gitea identities, and helm values. The fix was to make pool names model-independent: builder-lite-0 runs whatever model the pool config says it runs today.
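A minimal sketch of the coordinate idea, assuming the pool name is simply the three fields joined with hyphens (as in builder-lite-0). The `PoolCoordinate` class, the `POOL_MODELS` mapping, and the `model_for` helper are hypothetical names for illustration; the real pool config format is not shown in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolCoordinate:
    """Model-independent identity of one agent: (role, tier, replica)."""

    role: str
    tier: str
    replica: int

    @property
    def name(self) -> str:
        # The stable name used for manifests, secrets, and bot accounts.
        return f"{self.role}-{self.tier}-{self.replica}"

    @classmethod
    def parse(cls, name: str) -> "PoolCoordinate":
        # Split from the right so a role containing hyphens still parses.
        role, tier, replica = name.rsplit("-", 2)
        return cls(role, tier, int(replica))


# Hypothetical pool config: the only place a model name appears.
POOL_MODELS = {("builder", "lite"): "gemini"}


def model_for(coord: PoolCoordinate, pool_models=POOL_MODELS) -> str:
    """Look up the model a pool runs today; the pool name never changes."""
    return pool_models[(coord.role, coord.tier)]
```

Swapping the lite tier from gemma to gemini then becomes a one-line change to the mapping, while `PoolCoordinate.parse("builder-lite-0").name` stays the same everywhere the name is referenced.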

Operating prometheus-stack Alertmanager: Operator Validation, Native Receivers, and Silence Discipline

A receiver YAML passes static review and the helm release reports deployed. The alertmanager pod is Running 1/1. A real critical alert fires and goes nowhere. The alertmanager pod logs are clean. The receiver works fine for a hand-rolled curl to the webhook URL. The trap is that the prometheus-operator generated a Secret containing the rendered config but flagged a sync error in its own logs, and the alertmanager pod silently kept serving the previous-good rendering.

This article assumes familiarity with the basic alertmanager routing tree, receivers, inhibition rules, and templating covered in alertmanager-configuration. It extends that material with the Day-2 operations of the kube-prometheus-stack chart specifically: where errors actually surface, what the native receiver schemas allow (and don’t), and the silence discipline that keeps the alert pipeline trustworthy.
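One way to catch the stale-config trap described above is to compare the config you expect against the config the pod is actually serving. The sketch below covers only the comparison step and makes two assumptions: that you have already obtained the rendered config text, and that you have read the served config from Alertmanager's v2 status endpoint (`GET /api/v2/status`, field `config.original`). The function names are hypothetical, and a plain text comparison may need loosening if the rendering step rewrites the config.

```python
import hashlib


def config_digest(text: str) -> str:
    """Hash a config after normalizing line endings and trailing whitespace."""
    lines = [line.rstrip() for line in text.replace("\r\n", "\n").split("\n")]
    normalized = "\n".join(lines).strip("\n")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_serving_stale_config(rendered: str, served: str) -> bool:
    """Return True when the served config differs from the rendered one."""
    return config_digest(rendered) != config_digest(served)
```

Run as a periodic check, a `True` result points at the case where the pod looks healthy but never picked up the latest rendering.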