---
title: "Running 7 Helm-Managed Services on One Kubernetes Cluster: A Cross-Cutting Survey"
description: "Patterns and gotchas from operating Gitea, Mattermost, PostgreSQL, kube-prometheus-stack, Jenkins, Temporal, and NATS together via Helm on a single-node K8s cluster."
url: https://agent-zone.ai/knowledge/platform-engineering/helm-managed-services-on-single-node-k8s-survey/
section: knowledge
date: 2026-05-07
categories: ["platform-engineering"]
tags: ["helm","kubernetes","single-node","minikube","arm64","operations","homelab"]
skills: ["helm-multi-service-operation","single-node-capacity-planning","helm-values-customization","k8s-debugging"]
tools: ["helm","kubectl","minikube","docker"]
levels: ["intermediate"]
word_count: 3092
formats:
  json: https://agent-zone.ai/knowledge/platform-engineering/helm-managed-services-on-single-node-k8s-survey/index.json
  html: https://agent-zone.ai/knowledge/platform-engineering/helm-managed-services-on-single-node-k8s-survey/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Running+7+Helm-Managed+Services+on+One+Kubernetes+Cluster%3A+A+Cross-Cutting+Survey
---


A single-node Kubernetes cluster running seven Helm-managed services concurrently — Gitea, Mattermost, PostgreSQL, kube-prometheus-stack, Jenkins, Temporal, and NATS — looks tractable on paper. The charts are all upstream-maintained. The hardware is modest but adequate. The operational reality is that **zero of the seven** ran cleanly on out-of-the-box values. Every chart needed at least one customization to coexist with the others, and several needed substantial rewrites of the helm-values surface. This survey catalogs what those customizations are, why each was necessary, and what the common failure modes look like across the fleet.

The frame is "production-shape services on a small cluster" rather than "production cluster, scaled down." The trade-offs are different. HA dependencies that are free on a multi-node cluster are pure overhead on a single node. Backup discipline that is automatic in a managed offering is a per-service homework assignment. Resource budgets that nobody thinks about at scale dominate every chart-version decision.

## A survey of 7 Helm-managed services

The table is the article's anchor. Every column is load-bearing.

| # | Service | Helm chart | Chart version | ARM64-friendly? | Custom image needed? | Backup mechanism | Per-service customization required | Major gotchas hit |
|---|---|---|---|---|---|---|---|---|
| 1 | **Gitea** | `gitea-charts/gitea` | 12.5.3 | yes (rootless image is multi-arch) | no | dedicated cron-driven repo backup script | disable bundled redis-cluster + redis + postgresql + postgresql-ha; point at external PG; admin user inline in values | bundled HA deps light up by default and exhaust resources unless explicitly disabled |
| 2 | **Mattermost** | `mattermost/mattermost-team-edition` | 6.6.96 | **NO** — upstream is amd64-only | **YES** — built locally from MM ARM64 binary tarball; tag `your-registry/mattermost-team-edition:10.5.0-arm64` | none (PV snapshot only) | external PG via env vars + `configJSON`; bot accounts + access tokens enabled; `SiteURL` fixed | upstream image lacks ARM64 manifest; QEMU fallback runs but Go runtime crashes (`lfstack.push`); must bake own image |
| 3 | **PostgreSQL** | `bitnami/postgresql` | 18.6.2 (multiple revisions) | yes | no | **none yet** — backlog gap | `initdb.scripts.init-all-dbs.sql` creates 4 databases + 4 service roles on first boot; `architecture: standalone` | Bitnami chart wants HA by default; standalone + initdb is the multi-tenant pattern; PG15+ schema permissions trap |
| 4 | **kube-prometheus-stack** | `prometheus-community/kube-prometheus-stack` | 84.3.0 | yes | no | none (recreate from values) | disable `kubeEtcd` + `kubeControllerManager` (minikube static pods, no scrape endpoint); disable Grafana persistence (emptyDir for dev); disable `NodeClockNotSynchronising` + `AlertmanagerClusterCrashlooping` default rules; full alertmanager routing config inline | `mattermost_configs` receiver has no `title:` field — operator config validator silently rejects; chart-default rules produce single-node false-positives; `--reuse-values` silently ignores `-f` |
| 5 | **Jenkins** | `jenkins/jenkins` | 5.9.18 | yes | **YES** — `your-registry/jenkins:latest` with plugins pre-baked via `jenkins-plugin-cli` | PV snapshot only | `controller.installPlugins: []` to skip broken plugin-copy step; JCasC inline (Gitea server, credentials, shared library, organization folder, kubernetes cloud); Docker socket hostPath mount; runs as root for socket access | helm chart's `apply_config.sh` has a broken `yes n \| cp -i ...` line that crashes the pod when `installPlugins` is non-empty; pre-baking is the only stable path |
| 6 | **Temporal** | `temporalio/temporal` | 0.74.0 | yes | no | none (state in PG) | `configMapsToMount: "sprig"` + `setConfigFilePath: true` (chart default `dockerize` leaves `{{ .Env.* }}` unrendered); external PG with separate `temporal` + `temporal_visibility` DBs; disable bundled cassandra/mysql/postgresql/elasticsearch/prometheus/grafana | chart 0.74.0 ships server image 1.30.3 which uses sprig, not dockerize — default values produce `Cassandra.Hosts: zero value` error; chart 1.x restructures persistence keys (pinned to 0.74.0) |
| 7 | **NATS** | `nats/nats` | 2.12.6 | yes | no | none (ephemeral pub/sub) | `cluster.enabled: false`; `jetstream.enabled: false` (no persistence); single replica; tight memory cap | resource block lives under `container.merge.resources` not top-level `resources` — easy to miss |

Every customization in the rightmost column is mandatory for that chart to coexist with the others on a single node. The chart versions are pinned because the gotchas are version-specific — Temporal 1.x restructures persistence keys, the Jenkins `apply_config.sh` bug surfaces in 5.x, and the kube-prometheus-stack alertmanager schema validation tightened in the 80.x series. Re-verify against the chart you actually install.

## Cross-cutting patterns

### Every chart needs at least one customization

Of the seven surveyed, zero ran cleanly on default values. The reasons cluster into five categories:

- **HA dependencies defaulting on**: Gitea bundles redis-cluster + redis + postgresql + postgresql-ha. Mattermost bundles MySQL. Temporal bundles cassandra + mysql + elasticsearch + prometheus + grafana. Each one consumes 256-512 MiB of requests before doing any useful work (a values sketch of the disable sweep follows this list).
- **Resource defaults too aggressive for a shared single node**: kube-prometheus-stack at defaults requests over 4 GiB by itself. Jenkins requests 2 GiB. Helm-default `resources:` blocks across these seven charts sum to over 16 GiB requests — the cluster won't schedule a single application pod.
- **Architecture mismatch**: Mattermost ships only amd64. QEMU user-mode emulation runs the binary but the Go runtime crashes on `lfstack.push`. The fix is rebuilding from the ARM64 binary tarball — see [building ARM64 container images when upstream doesn't ship them](../../kubernetes/building-arm64-container-images-when-upstream-doesnt-ship-them/) and [Kubernetes on Apple Silicon setup gotchas](../../kubernetes/kubernetes-on-apple-silicon-setup-gotchas/).
- **Chart bugs and broken install paths**: Jenkins's plugin install in `apply_config.sh` crashes the pod. Temporal 0.74.0 ships a server image that uses sprig templates while the chart's defaults assume dockerize. Both are workarounds-required, not fixable from the outside.
- **Single-node false-positive alerts**: kube-prometheus-stack ships rules for VM clock drift and alertmanager cluster crashlooping that fire constantly on a single-node minikube cluster. They have to be disabled at install time or alert fatigue arrives within hours.
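
The sweep is mechanical once you know to look for it. A minimal sketch of the Gitea half, using the bundled-dependency toggles the survey table lists; the external-PG host is an assumed in-cluster service name, and key paths should be verified against the `values.yaml` of the chart version you actually install:

```yaml
# gitea values excerpt: every bundled datastore off, external PG in
redis-cluster:
  enabled: false
redis:
  enabled: false
postgresql:
  enabled: false
postgresql-ha:
  enabled: false
gitea:
  config:
    database:
      DB_TYPE: postgres
      HOST: postgresql:5432   # assumed service name of the shared PG
      NAME: gitea
      USER: gitea
```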

**The one-line takeaway: plan a values file before you `helm install`.** Treating helm-defaults as a starting point on a small cluster guarantees a wedge state.

### Helm defaults assume multi-node production

Every "free" HA dependency in a helm chart is a memory tax on a single-node cluster. Bundled redis is free on a 5-node cluster because nothing else needs that 256 MiB. On a single node sharing memory with PostgreSQL, Prometheus, Grafana, the application pods themselves, and the kubelet, that 256 MiB has to come from somewhere. The shape of the trade-off is identical for every chart surveyed: the bundled dependency is convenient, defaults to on, and is the first thing to disable.

The principle generalizes: **read the values.yaml top to bottom before installing any chart on a constrained cluster.** Look for `enabled: true` on anything labeled redis, postgresql, mysql, cassandra, elasticsearch, or prometheus. Most of them want to be off.
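
Applied to the heaviest offender in this fleet, the sweep for Temporal is six toggles. Names follow the survey table; verify them against chart 0.74.0's `values.yaml` before relying on this sketch:

```yaml
# temporal values excerpt: all bundled dependencies off, external PG only
cassandra:
  enabled: false
mysql:
  enabled: false
postgresql:
  enabled: false
elasticsearch:
  enabled: false
prometheus:
  enabled: false
grafana:
  enabled: false
```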

### Resource sizing is a budget, not a target

Helm `resources:` blocks across these seven charts default to numbers that assume infinite headroom. The actual budget is finite and shared. The discipline is to set requests and limits per service that sum to less than the cluster's allocatable memory, with headroom for application workloads. See the [resource budgeting](#resource-budgeting-under-a-memory-cap) section below for concrete numbers.

### Authentication has three layers

The seven services together use three distinct credential patterns, and treating them uniformly leads to mistakes:

- **Helm-values inline admin credentials** (Gitea, Jenkins, Grafana). Convenient. Leaks into git history. Fine for dev clusters with a `<dev-password>` placeholder; use `existingSecret` references for any cluster reachable from outside localhost (sketch after this list).
- **Per-service service users** (PostgreSQL roles, Mattermost user, Gitea user). Set via initdb scripts or chart `configJSON`. Survive helm upgrades. Don't change unless you intend to.
- **Per-app tokens** (Gitea API tokens, Mattermost bot tokens, Slack/Mattermost webhook URLs). Always live in K8s Secrets, mounted via `secrets:` (the alertmanager pattern) or env-from-secret. Never inline in helm-values, even in dev — they tend to outlive the dev cluster.
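
For the first layer, the fix is a one-line swap from inline password to Secret reference. A sketch using Gitea's key path (the path, and the keys the Secret must hold, vary per chart, so treat this as the shape, not the letter):

```yaml
# gitea values excerpt: admin credentials read from a pre-created Secret
gitea:
  admin:
    existingSecret: gitea-admin-credentials   # assumed Secret holding username/password
```

The Secret itself is created out-of-band with `kubectl create secret generic`, so the credential never lands in a values file or in git history.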

### Chart structure varies — read it before customizing

Each chart organizes its values differently, and the difference matters when overriding:

- **NATS** puts container resources under `container.merge.resources`, not top-level `resources`. Setting top-level does nothing (sketch after this list).
- **Mattermost** uses both `extraEnvVars` AND `configJSON` for the same SQL settings. Both must agree or the pod refuses to start.
- **Temporal** nests every server component (`frontend`, `history`, `matching`, `worker`) with its own `replicaCount` and `resources`. Setting one doesn't set the others.
- **kube-prometheus-stack** has `prometheus.prometheusSpec.resources` (the CRD-mode operator-managed pod) AND `prometheusOperator.resources` (the operator pod itself). They are separate budgets.
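
The NATS case deserves a concrete sketch because the failure is silent: helm accepts a top-level `resources:` block without complaint and simply never applies it. Paths per the survey table; verify against the chart version installed:

```yaml
# nats values excerpt: resources live under container.merge, not top level
container:
  merge:
    resources:
      requests:
        cpu: 50m
        memory: 64Mi
      limits:
        memory: 128Mi
```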

`helm get values <release>` after install is the only reliable way to confirm that an override took effect. See [helm gotchas: reuse-values, revisions, rollback](../../kubernetes/helm-gotchas-reuse-values-revisions-rollback/) for why this matters.

### Single-node-specific overrides

A consistent set of overrides applies across charts when the target is single-node:

- Disable `kubeEtcd` and `kubeControllerManager` scrapes (minikube runs them as static pods with no scrape endpoint).
- Disable `NodeClockNotSynchronising` rule (minikube/Docker Desktop VMs drift constantly; the alert is a false positive).
- Disable `AlertmanagerClusterCrashlooping` (single replica means no cluster, the rule fires forever).
- Set `imagePullPolicy: Never` for any locally-built image.
- Disable `initChownData` for Grafana — minikube hostPath PVs don't need it.

These are not generic best practices; they're single-node-specific. Carry them forward as a checklist for any chart added to the same cluster.
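
As a kube-prometheus-stack values fragment, the checklist translates roughly to the block below. `defaultRules.disabled` is the rule-suppression map in recent chart versions; confirm the key names against the 84.x `values.yaml` before trusting this sketch:

```yaml
# kube-prometheus-stack values excerpt: single-node noise suppression
kubeEtcd:
  enabled: false               # minikube static pod, no scrape endpoint
kubeControllerManager:
  enabled: false               # same
defaultRules:
  disabled:
    NodeClockNotSynchronising: true
    AlertmanagerClusterCrashlooping: true
grafana:
  persistence:
    enabled: false             # emptyDir for dev, per the survey table
  initChownData:
    enabled: false             # hostPath PVs don't need the chown init container
```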

## Resource budgeting under a memory cap

Pick a memory cap that matches the hardware. The example below uses 24 GiB — appropriate for a Mac mini-class workstation running Docker Desktop. The arithmetic is the same for any cap.

```text
service              requests (cpu / mem)   limits (mem)
gitea                100m / 128Mi           512Mi
mattermost           250m / 512Mi           1Gi
postgresql           250m / 512Mi           2Gi
prometheus           200m / 512Mi           1Gi
grafana              100m / 128Mi           512Mi
alertmanager          50m /  64Mi           256Mi
prom-operator        100m / 256Mi           512Mi
jenkins              250m / 1Gi             2Gi
temporal (4 svc)     400m / 768Mi (sum)     1.5Gi (sum)
nats                  50m /  64Mi           128Mi
TOTAL requests:    ~1750m CPU / ~3.9 GiB memory
TOTAL limits:      ~9.4 GiB memory
```

The ~3.9 GiB requests sum is what the cluster scheduler reserves before any application workload arrives. With a 24 GiB cap, that leaves roughly 20 GiB for application pods plus minikube node overhead — enough headroom for a meaningful workload. **Going to helm-defaults on every chart blows past 16 GiB of requests alone**, before any application pod is scheduled. The cluster wedges.

The arithmetic forces three decisions early:

- **Requests must be tight.** The number that gets reserved is `requests`, not `limits`. Chart-default `requests` carry production-sized safety margins that read as pure waste on a single node. Halve them and watch behavior under load before halving again.
- **Limits should reflect peak**, not average. PostgreSQL at idle uses 100 MiB; under a backfill query it uses 1.5 GiB. The `limits` slot exists to allow that peak without OOMKilling.
- **Multi-tenant > N instances.** A single Bitnami PostgreSQL with `initdb` creating four databases + four roles uses 512 MiB. Four chart-bundled PostgreSQL instances use 4 × 512 MiB. The math forces consolidation.

The multi-tenant PG pattern looks like:

```sql
-- initdb.scripts.init-all-dbs.sql (excerpt)
SELECT 'CREATE DATABASE temporal'
  WHERE NOT EXISTS (SELECT FROM pg_database WHERE datname = 'temporal')\gexec
-- ...repeated for temporal_visibility, mattermost, gitea, etc.

DO $$
BEGIN
  IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = 'temporal') THEN
    CREATE ROLE temporal WITH LOGIN PASSWORD '<dev-password>';
  END IF;
  -- ...repeated per service
END $$;

GRANT ALL PRIVILEGES ON DATABASE temporal TO temporal;
-- ...repeated
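-- PG15+ trap (see the survey table): CREATE on schema public is no longer
-- granted to all roles. Connect to each new database and grant explicitly:
--   \c temporal
--   GRANT ALL ON SCHEMA public TO temporal;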
```

The trade-off accepted: shared PostgreSQL is also the single point of failure. That's acceptable for dev/homelab and unacceptable for any production posture. Plan the migration to per-service or HA PG before the cluster carries production traffic.

## Common failure modes and what they tell you

The same five failure signatures recur across the surveyed charts. Recognizing the signature shortcuts diagnosis.

### CrashLoopBackOff

```text
Back-off restarting failed container <name> in pod <pod>
```

Three common causes, each with a distinctive log signature:

**Image arch mismatch** (Mattermost-class). The pod starts. The runtime aborts deep in Go's lock-free stack:

```text
runtime: failed to create new OS thread (have N already; errno=22)
fatal error: lfstack.push
```

There is no fix from the outside. Build a native ARM64 image and reference it.

**OOMKilled**. The limit is too low for steady-state:

```text
State: Terminated, Reason: OOMKilled, ExitCode: 137
```

`kubectl describe pod` confirms the reason. Either raise the limit or find what's consuming the unexpected memory. PostgreSQL after a schema-change run, Jenkins after a long build queue, Prometheus after a series-cardinality spike — all common culprits.

**Config error from a chart bug**. Jenkins shows:

```text
apply_config.sh: line N: cp: cannot stat ...
```

— the broken plugin-copy path triggered by a non-empty `installPlugins`. Fix: empty the list, pre-bake plugins into the image. Temporal shows:

```text
Persistence.DataStores[default].Cassandra.Hosts: zero value
```

— the chart-default `dockerize` leaves `{{ .Env.CASSANDRA_HOSTS }}` unrendered because the server image uses sprig. Fix: `configMapsToMount: "sprig"` and `setConfigFilePath: true`.
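
As a values fragment, the Temporal fix is two keys. They are shown here exactly as the survey table names them; their nesting inside chart 0.74.0's `values.yaml` should be double-checked rather than assumed:

```yaml
# temporal values excerpt: render config with sprig instead of dockerize
configMapsToMount: "sprig"
setConfigFilePath: true
```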

### ImagePullBackOff and ErrImageNeverPull

```text
Failed to pull image "your-registry/mattermost-team-edition:10.5.0-arm64": ... not found
```

For locally-built images on minikube, two causes dominate:

- `imagePullPolicy` not set to `Never` (or `IfNotPresent`). Kubernetes tries the registry, gets nothing, fails (values sketch after this list).
- `eval $(minikube docker-env)` was not run before `docker build`. The image landed in the host Docker daemon, not minikube's. `docker images` from the wrong context confirms it.
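
The values-side half of the fix, for any chart exposing a standard image block (Mattermost's shown; field names vary per chart):

```yaml
# values excerpt: reference the locally built image and never hit a registry
image:
  repository: your-registry/mattermost-team-edition
  tag: 10.5.0-arm64
  pullPolicy: Never   # image must already exist in minikube's Docker daemon
```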

### Pending pods, no schedulable node

```text
0/1 nodes are available: 1 Insufficient memory.
```

Single node + every chart at helm-default = wedge state. Diagnose with:

```bash
kubectl describe nodes | grep -A 10 "Allocated resources"
```

The fix is to lower requests, not raise the cap. Raising the cap pushes the same problem out by one dependency.

### Operator silent rejection of alertmanager config

The smoking gun is in the **operator** logs, not the alertmanager logs. Alerts never deliver, alertmanager looks healthy, the chart shows `deployed`. The operator is rejecting the config:

```text
Sync error: failed to apply alertmanager config: unknown field "title" in mattermost_configs
```

The alertmanager pod runs the *previous* valid config and accepts no updates. The fix is to remove the offending field — `title` does not exist in `mattermost_configs`; fold any title into the `text:` body. See [Prometheus stack alertmanager operations](../../observability/prometheus-stack-alertmanager-operations/) for the deeper dive.
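
A hedged sketch of the corrected receiver, kept to the fields the error message and the fix above actually name (endpoint and credential fields omitted; validate against your alertmanager's schema):

```yaml
# alertmanager receivers excerpt: no title: field; the title folds into text:
receivers:
  - name: mattermost
    mattermost_configs:
      - text: "[{{ .Status | toUpper }}] {{ .CommonAnnotations.summary }}"
```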

### `helm upgrade --reuse-values` silently ignoring `-f`

```bash
helm upgrade <release> ... --reuse-values -f values.yaml   # WRONG: -f is silently ignored
helm upgrade <release> ... -f values.yaml                  # CORRECT
```

No warning printed. No error. The chart redeploys with the previous values. Always verify with:

```bash
helm get values <release> -n <namespace>
```

This trap accounts for a disproportionate share of "I changed the values and nothing happened" debugging sessions. See [helm gotchas: reuse-values, revisions, rollback](../../kubernetes/helm-gotchas-reuse-values-revisions-rollback/).

## Backup discipline as a per-service problem

Backup posture across the seven services is uneven, and the unevenness is itself worth naming.

| Service | Backup status |
|---|---|
| Gitea | dedicated cron-driven script (best in fleet) |
| PostgreSQL | **gap** — no scheduled dumps; PV snapshot only |
| Mattermost | gap — file uploads on PV, no off-cluster copy |
| Jenkins | gap — `JENKINS_HOME` on PV, plugins re-bakeable but jobs are not |
| Prometheus | acceptable — TSDB recoverable from rules |
| Temporal | partial — workflow state in PG (covered when PG is) |
| NATS | n/a — ephemeral |

The PostgreSQL gap is the most consequential. Three of the seven services (Gitea, Mattermost, and Temporal), plus any application databases the initdb script creates, keep their state in the shared PostgreSQL. A PG loss takes Temporal workflow history, Mattermost messages, Gitea metadata, and all of that application data with it. **The best-case backup posture across the fleet is exactly as good as PostgreSQL's**, and PG has no scheduled dumps yet.

The general lesson: **backups are a per-service discipline, not a per-cluster one.** "We snapshot the volumes" papers over the question. Per-service it becomes "what is the recovery procedure for THIS service's state?" The PV-snapshot answer rarely survives that translation. See [single-node Kubernetes disaster recovery](../../sre/single-node-kubernetes-disaster-recovery/) for the recovery-procedure side of the same problem.
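
Closing the PG gap needs no tooling beyond a CronJob running `pg_dumpall` into a dedicated PVC. A minimal sketch, where the release name `postgresql`, the Secret name/key, and the `pg-backups` PVC are all assumptions to adjust:

```yaml
# Sketch: nightly logical dump of the shared PostgreSQL to a dedicated PVC
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pg-dump
spec:
  schedule: "0 3 * * *"                       # 03:00 daily
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: bitnami/postgresql:latest    # same image family as the server
              command: ["/bin/sh", "-c"]
              args:
                - pg_dumpall -h postgresql -U postgres > /backup/all-$(date +%F).sql
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgresql            # assumed: Bitnami release Secret
                      key: postgres-password
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: pg-backups             # assumed: pre-created PVC
```

A dump PVC on the same node is still on-cluster, so this is a floor rather than a finish line, but a restorable SQL file is a categorically better artifact than a PV snapshot of a live data directory.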

## When to vendor your own image

Five of the seven services run upstream images. Two — Mattermost and Jenkins — required vendoring. The decision pattern:

| Service | Choice | Why |
|---|---|---|
| Gitea | upstream (rootless) | publishes ARM64; rootless avoids permission grief on hostPath PVs |
| Mattermost | **vendor own** | no ARM64 image upstream; QEMU emulation crashes Go runtime; only path is rebuild from binary tarball |
| PostgreSQL | upstream (Bitnami) | publishes multi-arch; chart is mature; standalone mode well-supported |
| kube-prometheus-stack | upstream | massive chart with deep CRD coupling; forking would mean fork-forever |
| Jenkins | **vendor own** | bundled plugin-install step in `apply_config.sh` is broken; pre-baking via `jenkins-plugin-cli` is upstream-recommended for prod anyway |
| Temporal | upstream (pinned to 0.74.0) | chart works after `configMapsToMount: sprig` flip; 1.x major restructure deferred |
| NATS | upstream | small, simple, just works |

The vendor-own decision criterion has two halves: (a) upstream doesn't ship the architecture you need, OR (b) the chart's runtime install path is broken in a way that's not fixable from the outside. Mattermost is case (a). Jenkins is case (b). Both produce a Dockerfile that's measured in tens of lines, not hundreds:

```dockerfile
# Mattermost ARM64 (sketch)
FROM ubuntu:22.04
ARG MM_VERSION=10.5.0
# ubuntu base image ships without curl; install it and CA certs before fetching
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates
RUN curl -fsSL https://releases.mattermost.com/${MM_VERSION}/mattermost-${MM_VERSION}-linux-arm64.tar.gz \
    | tar xz -C /opt/
# ...user, entrypoint, etc.
```

```dockerfile
# Jenkins with pre-baked plugins
FROM jenkins/jenkins:lts
COPY plugins.txt /usr/share/jenkins/ref/
RUN jenkins-plugin-cli --plugin-file /usr/share/jenkins/ref/plugins.txt
```
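
The values-side counterpart points the chart at the pre-baked image and empties the runtime plugin list so `apply_config.sh` never takes its broken path. Key paths per the 5.x chart; verify before relying on them:

```yaml
# jenkins values excerpt: pre-baked image, runtime plugin install disabled
controller:
  image:
    repository: your-registry/jenkins
    tag: latest
  imagePullPolicy: Never    # locally built image, assumed to be in minikube's daemon
  installPlugins: []        # empty list skips the broken plugin-copy step
```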

**The decision NOT to fork the helm chart matters as much as the decision to vendor the image.** Every service except Mattermost and Jenkins fits in fewer than 60 lines of `values.yaml`. Forking trades a 50-line values file for a chart you now maintain. Chart-version drift outpaces a fork's value within two or three upstream releases.

## Anti-patterns

A handful of patterns recur often enough to be worth naming as anti-patterns:

- **"Use the helm chart's bundled PostgreSQL."** Fine for one service. Deadly across seven. Multi-tenant single PG with `initdb` creating per-service databases halves storage requests and gives a single backup target.
- **"Set `--reuse-values` because `-f` should be additive."** Silent override. Always verify with `helm get values`.
- **"Skip the ARM64 check, QEMU will handle it."** Works for shell utilities. Fails on Go binaries. The crash signature is `lfstack.push` deep in the Go runtime; there is no application-level fix.
- **"Install Jenkins plugins at runtime via the helm chart's `installPlugins:`."** The chart's `apply_config.sh` is broken. Pre-bake plugins into the image.
- **"Trust the alertmanager config validator."** The operator silently rejects unknown fields. Verify by tailing operator logs after every config change.
- **"Helm-default `resources:` are sane defaults."** They're sane *for production multi-node clusters*. On a single node they sum to a wedge state.

## Quotable lessons

- **Every Helm chart needs at least one customization on a single-node cluster.** Plan a values file before you `helm install`.
- **Helm defaults are written for production multi-node clusters.** On a small cluster every "free" HA dependency is a memory tax.
- **If a service publishes no ARM64 image, you'll be vendoring your own.** There is no QEMU shortcut for Go binaries.
- **A multi-tenant PostgreSQL with `initdb` scripts beats N bundled PG instances**: one instance's memory footprint instead of N of them.
- **Backups are a per-service discipline, not a per-cluster one.** Track each service's plan separately or it slips.
- **When `helm upgrade` doesn't take effect, check `helm get values` first.** `--reuse-values` silently overrides `-f`.
- **Pre-bake Jenkins plugins.** The Helm chart's runtime install path is fragile and reduces every deploy to a coin flip.

## Where this article fits

This is the meta-survey: seven services side-by-side, the cross-cutting patterns that show up only when they're operated together, and the failure modes that span charts. For per-service depth:

- [Self-hosting Gitea on Kubernetes](../../cicd/self-hosting-gitea-on-kubernetes/) — chart 1, the rootless image and external-PG pattern.
- [Building ARM64 container images when upstream doesn't ship them](../../kubernetes/building-arm64-container-images-when-upstream-doesnt-ship-them/) — chart 2, the Mattermost custom-image build.
- [Prometheus stack alertmanager operations](../../observability/prometheus-stack-alertmanager-operations/) — chart 4, the alertmanager routing and validator-rejection trap.
- [Helm gotchas: reuse-values, revisions, rollback](../../kubernetes/helm-gotchas-reuse-values-revisions-rollback/) — the cross-cutting helm operational patterns.
- [Kubernetes on Apple Silicon setup gotchas](../../kubernetes/kubernetes-on-apple-silicon-setup-gotchas/) — the substrate this whole survey runs on.
- [Single-node Kubernetes disaster recovery](../../sre/single-node-kubernetes-disaster-recovery/) — the backup-and-recovery posture the gap analysis above demands.

Read this first to understand the shape of the problem; read the per-service articles when a specific chart needs depth.

