---
title: "Helm Gotchas: --reuse-values, Revisions, Rollback, and Disaster Recovery"
description: "How --reuse-values silently ignores -f, how to inspect any past revision, how to rollback safely, and the snapshot-before-upgrade habit that turns Helm revisions into a real DR backstop."
url: https://agent-zone.ai/knowledge/kubernetes/helm-gotchas-reuse-values-revisions-rollback/
section: knowledge
date: 2026-05-07
categories: ["kubernetes"]
tags: ["helm","rollback","revisions","upgrade","disaster-recovery","debugging"]
skills: ["helm-upgrade-debugging","helm-rollback","values-snapshot","release-history-inspection"]
tools: ["helm","kubectl"]
levels: ["intermediate"]
word_count: 2407
formats:
  json: https://agent-zone.ai/knowledge/kubernetes/helm-gotchas-reuse-values-revisions-rollback/index.json
  html: https://agent-zone.ai/knowledge/kubernetes/helm-gotchas-reuse-values-revisions-rollback/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Helm+Gotchas%3A+--reuse-values%2C+Revisions%2C+Rollback%2C+and+Disaster+Recovery
---


A Helm operator runs an upgrade with `--reuse-values -f new-values.yaml`. Helm reports success, increments the revision counter, and returns `STATUS: deployed`. The cluster behavior does not change. The new values file might as well not exist. This is a silent no-op upgrade — the load-bearing failure mode of `--reuse-values` — and it is one of several Day-2 Helm operations where the verbs look correct but the semantics are not what most operators assume. This article covers the flag combinations that bite, how to inspect any past revision, how rollback actually works, and the snapshot-before-upgrade discipline that turns Helm's revision storage into a real disaster-recovery backstop.

## Flag semantics

Six common variants, three distinct behaviors. The differences are not visible on the command line; they only surface in what ends up rendered.

| Command | Effect on values | When values come from `-f` |
|---|---|---|
| `helm upgrade R C` (no flags) | Resets to chart defaults; loses all prior overrides | n/a (no `-f` passed) |
| `helm upgrade R C -f my.yaml` | Chart defaults + `my.yaml`. Anything previously set via `--set` or earlier `-f` is lost unless re-specified. | Used (the explicit re-render path) |
| `helm upgrade R C --reuse-values` | Re-uses the *last release's computed values*; ignores chart updates to default values | n/a (no `-f` in this form) |
| `helm upgrade R C --reuse-values -f my.yaml` | **TRAP**: `-f my.yaml` does NOT replace existing keys; values from prior release win for any key already set there | Effectively no-op for already-set keys |
| `helm upgrade R C --reset-then-reuse-values -f my.yaml` (helm 3.14+) | Reset to chart defaults, re-apply prior values, then layer `-f` on top — what most operators *think* `--reuse-values -f` does | Used correctly |
| `helm upgrade R C --reset-values -f my.yaml` | Identical to plain `-f my.yaml`; throws out prior overrides explicitly | Used |

The trap row is the one that matters. `--reuse-values -f new.yaml` reads as "use the previous values plus this new file," but Helm interprets it as "use the previous computed values; the file only provides keys that didn't exist before." For a `kube-prometheus-stack` chart whose values file already contains every key the operator might want to change, `-f` becomes a no-op.
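The merge direction is easier to see with a toy example. The snippet below is a jq simulation of the ordering described above, with made-up values; it illustrates the precedence, not Helm's actual code path:

```bash
# Prior release's computed values (revision N-1)
prior='{"replicas":3,"image":{"tag":"v1"}}'
# What the operator passes via -f on the next upgrade
newfile='{"replicas":5,"logLevel":"debug"}'

# --reuse-values -f as described above: the prior release wins on every
# already-set key; only the brand-new key (logLevel) survives from the file
echo "$newfile $prior" | jq -s '.[0] * .[1]'
# replicas stays 3 (not the file's 5); logLevel comes through because it was new
```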

## The correct upgrade

Plain `-f` makes the values file the source of truth. Anything not in the file reverts to chart defaults; anything in the file lands as written.

```bash
# CORRECT — values file is the source of truth, prior overrides discarded
helm upgrade monitoring-stack prometheus-community/kube-prometheus-stack \
  -n prod \
  -f helm-values/prometheus-stack.yaml \
  --version 84.3.0
```

```bash
# WRONG — looks reasonable, silently ignores -f for any key already in revision N-1
helm upgrade monitoring-stack prometheus-community/kube-prometheus-stack \
  -n prod \
  --reuse-values \
  -f helm-values/prometheus-stack.yaml \
  --version 84.3.0
```

If the goal genuinely is "previous overrides plus this new file" — for example, when CI injects secret refs via `--set` and the values file should layer on top — use `--reset-then-reuse-values -f` (helm 3.14+). It does what `--reuse-values -f` reads as.
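The reset-then-reuse layering can be sketched the same way with jq's object merge, where the right-hand side wins on conflicts. Again a toy model with made-up values, not Helm's code path:

```bash
defaults='{"replicas":1,"image":{"tag":"v0"},"logLevel":"info"}'  # chart values.yaml
prior='{"replicas":3}'                                            # earlier overrides
newfile='{"logLevel":"debug"}'                                    # this upgrade's -f

# reset-then-reuse: chart defaults, then prior overrides, then the file on top
echo "$defaults $prior $newfile" | jq -s '.[0] * .[1] * .[2]'
# replicas 3 survives from the prior release; logLevel becomes debug from the file
```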

## The silent no-op trap

The hardest part of the bug is that Helm reports success. There is no error, no warning, no log line indicating that values were dropped:

```
$ helm upgrade monitoring-stack prometheus-community/kube-prometheus-stack \
    -n prod --reuse-values \
    -f helm-values/prometheus-stack.yaml --version 84.3.0
Release "monitoring-stack" has been upgraded. Happy Helming!
NAME: monitoring-stack
LAST DEPLOYED: Tue May  5 10:30:11 2026
NAMESPACE: prod
STATUS: deployed
REVISION: 4
```

`helm history` shows a new revision. `helm status` reports `deployed`. But `helm get values --revision 4` is byte-identical to `--revision 3`, and the cluster continues to behave as it did before the "upgrade."

The only reliable way to confirm an upgrade actually changed something is to diff:

```bash
# Diff the intended values against what's deployed.
# -o yaml drops the "USER-SUPPLIED VALUES:" header; helm also re-sorts keys
# alphabetically, so expect ordering noise against a hand-written file.
diff <(helm get values monitoring-stack -n prod -o yaml) \
     helm-values/prometheus-stack.yaml

# Diff revision N-1 against revision N (N = the revision helm just reported)
diff <(helm get values monitoring-stack -n prod --revision $((N-1))) \
     <(helm get values monitoring-stack -n prod --revision $N)
# Empty diff => no-op upgrade
```

**Helm reports success on a no-op upgrade. The only way to confirm the upgrade landed is to diff `helm get values` between the new revision and the previous one.** Make this diff a CI step on any pipeline that runs `helm upgrade`.
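A few lines of shell make that guard concrete. This is a sketch (the function name is invented); it fails the pipeline when two values snapshots are byte-identical:

```bash
# Exits non-zero when the two snapshots are identical, i.e. a silent no-op upgrade
assert_upgrade_changed_values() {
  local before=$1 after=$2
  if diff -q "$before" "$after" >/dev/null; then
    echo "ERROR: values unchanged between revisions (silent no-op upgrade?)" >&2
    return 1
  fi
}

# Usage in CI, after the upgrade reports revision N:
#   helm get values R -n NS --revision $((N-1)) > before.yaml
#   helm get values R -n NS --revision $N       > after.yaml
#   assert_upgrade_changed_values before.yaml after.yaml
```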

## Inspecting past revisions

Every successful release is preserved as a Kubernetes Secret. `helm get` is the time-machine interface to it:

```bash
# What revisions exist?
helm history monitoring-stack -n prod
# REVISION  UPDATED                  STATUS    CHART                              APP    DESCRIPTION
# 1         Mon May  4 14:02:11 2026 superseded kube-prometheus-stack-84.3.0     v0.79  Install complete
# 2         Mon May  4 18:30:22 2026 superseded kube-prometheus-stack-84.3.0     v0.79  Upgrade complete
# 3         Tue May  5 09:11:04 2026 superseded kube-prometheus-stack-84.3.0     v0.79  Upgrade complete
# 4         Tue May  5 10:45:00 2026 failed     kube-prometheus-stack-84.3.0     v0.79  Upgrade "monitoring-stack" failed
# 5         Tue May  5 11:02:18 2026 deployed   kube-prometheus-stack-84.3.0     v0.79  Upgrade complete

# USER-SUPPLIED values for a revision (overrides only)
helm get values monitoring-stack -n prod --revision 2

# COMPLETE values for a revision (chart defaults + overrides)
helm get values monitoring-stack -n prod --revision 2 --all

# Rendered manifest actually applied at that revision
helm get manifest monitoring-stack -n prod --revision 2

# Hooks attached to the revision
helm get hooks monitoring-stack -n prod --revision 2

# Post-install banner text for the revision
helm get notes monitoring-stack -n prod --revision 2
```

**`helm get values --revision N` is the time machine: every past release of every Helm-managed thing in the cluster is one command away, until someone runs `helm uninstall`.**

For comparing two revisions, the [`helm-diff` plugin](https://github.com/databus23/helm-diff) is the cleanest UX, but plain `diff <(...)` works everywhere and is usually enough:

```bash
# Plugin (install: helm plugin install https://github.com/databus23/helm-diff)
helm diff revision monitoring-stack 4 5 -n prod

# Plain diff fallback
diff <(helm get values monitoring-stack -n prod --revision 4) \
     <(helm get values monitoring-stack -n prod --revision 5)

diff <(helm get manifest monitoring-stack -n prod --revision 4) \
     <(helm get manifest monitoring-stack -n prod --revision 5)
```

## Snapshotting before upgrade

Helm's revision storage is a feature, but it dies with the cluster. A `helm uninstall` without `--keep-history` deletes every revision Secret. A namespace deletion or cluster rebuild takes them with it. For shared infrastructure (monitoring stacks, ingress, cert-manager), a snapshot to disk costs one `helm get values` plus one `helm get manifest` and gives a known-good restore target that survives anything Kubernetes can do to itself.

```bash
# Snapshot dir per release, keyed by revision
SNAPSHOT_DIR=_helm-backups
mkdir -p "$SNAPSHOT_DIR"

CURRENT_REV=$(helm history monitoring-stack -n prod -o json | \
  jq -r 'map(select(.status=="deployed")) | last | .revision')

helm get values   monitoring-stack -n prod --revision "$CURRENT_REV" \
  > "$SNAPSHOT_DIR/monitoring-stack-rev${CURRENT_REV}-values.yaml"
helm get manifest monitoring-stack -n prod --revision "$CURRENT_REV" \
  > "$SNAPSHOT_DIR/monitoring-stack-rev${CURRENT_REV}-manifest.yaml"

# Now run the upgrade
helm upgrade monitoring-stack ... -f new-values.yaml
```

**Snapshot `helm get values` AND `helm get manifest` to a file before every prod upgrade — Helm's revision storage dies with the cluster, the snapshot doesn't.** For dev releases that can be recreated from chart + checked-in values, the snapshot is optional. For anything where rebuilding is more expensive than two `helm get` invocations, it's free insurance.
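The snapshot steps above fold naturally into a reusable helper. A sketch (the function name and default directory are made up; same `jq` dependency as the block above):

```bash
# Snapshot the currently deployed revision's values + manifest to disk.
# Usage: snapshot_release <release> <namespace> [dir]
snapshot_release() {
  local release=$1 ns=$2 dir=${3:-_helm-backups}
  mkdir -p "$dir"
  local rev
  # latest revision whose status is "deployed"
  rev=$(helm history "$release" -n "$ns" -o json |
    jq -r 'map(select(.status=="deployed")) | last | .revision')
  helm get values   "$release" -n "$ns" --revision "$rev" \
    > "$dir/${release}-rev${rev}-values.yaml"
  helm get manifest "$release" -n "$ns" --revision "$rev" \
    > "$dir/${release}-rev${rev}-manifest.yaml"
  echo "$rev"  # so callers can log which revision was captured
}

# snapshot_release monitoring-stack prod && helm upgrade monitoring-stack ...
```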

## Rollback mechanics

`helm rollback` does not rewind state in place. It creates a NEW revision whose contents equal the target revision:

```bash
# Rollback to a specific revision
helm rollback monitoring-stack 3 -n prod

# helm history will then show:
# 6  ... pending-rollback -> deployed  Rollback to 3
# Revision 6 IS the rollback; revisions 4 and 5 stay in history as "superseded"

# Rollback with --wait (block until resources are Ready) and a bounded timeout.
# Note: helm rollback has no --atomic flag; it does support --cleanup-on-fail.
helm rollback monitoring-stack 3 -n prod --timeout 5m --wait

# Auto-discover the last successful revision. After a failed upgrade the new
# revision is marked "failed" and the last good one keeps status "deployed".
LAST_GOOD=$(helm history monitoring-stack -n prod -o json | \
  jq -r 'map(select(.status=="deployed")) | last | .revision')
helm rollback monitoring-stack "$LAST_GOOD" -n prod
```

Because rollback creates a forward revision, the history is append-only and a "rollback of a rollback" is just another rollback to a still-earlier revision. There is no destructive history rewriting.

### Where revisions live

- Default backend: Kubernetes Secrets in the release namespace.
- Inspect: `kubectl get secret -n prod -l owner=helm,name=monitoring-stack`
- Naming: `sh.helm.release.v1.<release-name>.v<revision-number>` (accurate as of helm 3.x)
- Each Secret holds the full gzipped, base64-encoded release object: chart + values + rendered manifest.
- Configurable via `HELM_DRIVER` env: `secret` (default), `configmap`, `sql`.
- `helm uninstall <release>` deletes ALL revision Secrets unless `--keep-history` is passed.
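The gzip-plus-base64 layering can be peeled by hand when `helm get` is unavailable (say, a Secret recovered from an etcd backup). The commented kubectl pipeline is the commonly used live-cluster form; the runnable part simulates the same encoding locally on a stand-in payload, so the decode direction is visible:

```bash
# Live-cluster form (requires the revision Secret to exist):
#   kubectl get secret sh.helm.release.v1.monitoring-stack.v3 -n prod \
#     -o jsonpath='{.data.release}' | base64 -d | base64 -d | gunzip | jq .

# Local simulation of the same layering: gzip, base64 (Helm's encoding),
# then base64 again (the Kubernetes Secret data layer).
# GNU coreutils base64; on macOS drop -w0.
payload='{"name":"monitoring-stack","version":3}'
stored=$(printf '%s' "$payload" | gzip | base64 -w0 | base64 -w0)

# Decoding reverses the layers
printf '%s' "$stored" | base64 -d | base64 -d | gunzip
# prints {"name":"monitoring-stack","version":3}
```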

### Failed rollback signature

```
$ helm rollback monitoring-stack 3 -n prod
Error: cannot patch "..." with kind StatefulSet:
  StatefulSet.apps "..." is invalid: spec: Forbidden:
  updates to statefulset spec for fields other than ...
```

This means the rollback target had immutable fields that differ from the current state — usually a StatefulSet's `volumeClaimTemplates` or selector labels. Recovery: `kubectl delete statefulset --cascade=orphan` followed by `helm rollback` again. The orphan flag preserves the underlying PVCs and Pods so they keep serving while the StatefulSet object is recreated. This is exactly the case where a manifest snapshot (`helm get manifest --revision N` saved to disk) becomes the recovery path of last resort.

## Recovering from "another operation in progress"

> **Test this recovery in a non-prod environment first.** It manipulates Helm's release Secrets directly and a wrong delete can leave a release unmanageable. The safer path is `helm rollback` to the last good revision; only use Secret manipulation when rollback itself reports the same lock error.

A previous `helm` operation that crashed before completing leaves the release in a `pending-*` status, and subsequent operations fail:

```
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
```

The lock is the `pending-*` status recorded on the most recent revision Secret (in its `status` label and inside the encoded release payload). Two recovery options:

```bash
# Find the most recent revision Secret. Sort by creation time: plain name
# order is lexical, so v10 would sort before v2.
kubectl get secret -n NS -l owner=helm,name=R \
  --sort-by=.metadata.creationTimestamp -o name | tail -1

# Option 1: delete the latest revision Secret if the operation hadn't finished
# (this restores the prior 'deployed' revision as the active one)
kubectl delete secret -n NS sh.helm.release.v1.R.v<latest>

# Option 2: patch the label from pending-upgrade back to deployed
# (only if the underlying objects actually reached a healthy state)

# Confirm by reading history
helm history R -n NS  # latest revision should now be the previous 'deployed' one
```

The label-patch path is folklore: it often works in practice, but it lies about what actually happened. The delete path is more honest: it tells Helm the failed operation never landed, which matches reality more often than not.

## Useful upgrade flag combos

```bash
# --atomic = roll back automatically if the upgrade fails (implies --wait).
# A sensible default for any prod upgrade that doesn't have its own runbook.
helm upgrade R C -f v.yaml --atomic --timeout 10m

# --wait waits for resources to reach Ready. NOTE: --wait does NOT mean correct config.
# A manifest that rendered fine but carries a broken config (unknown CRD field, etc.)
# can still pass --wait if the pods come up healthy.
helm upgrade R C -f v.yaml --wait --timeout 10m

# --force = recreate resources that have immutable field changes. DESTRUCTIVE on
# StatefulSets, PVCs. Almost always wrong; prefer surgical kubectl delete.
helm upgrade R C -f v.yaml --force

# --cleanup-on-fail = delete newly-created resources on failure (instead of leaving
# orphans). Pairs with --atomic.
helm upgrade R C -f v.yaml --atomic --cleanup-on-fail
```

**`--atomic` plus `--cleanup-on-fail` turns a broken upgrade from a forensic puzzle into a single automatic rollback.** The only argument against `--atomic` by default is timeout sensitivity: it waits for `--timeout` and rolls back EVERYTHING on any single resource not reaching Ready. For long-cold-start workloads, lengthen `--timeout 15m` rather than dropping `--atomic`.

## Sizing revision history

Helm keeps the last 10 revisions by default. Older revisions are pruned silently — no warning, no log entry.

```bash
# Default is 10
helm upgrade R C -f v.yaml

# 30 — typical for CI-driven releases (every merge → upgrade)
helm upgrade R C -f v.yaml --history-max 30

# 0 = keep all revisions forever. Not recommended; etcd grows without bound.
helm upgrade R C -f v.yaml --history-max 0
```

Each revision Secret is small (the release payload is gzipped, and Secrets are capped at 1 MiB); keeping 50 revisions of a typical chart costs well under 50MB of etcd. **For any release upgraded by CI, bump `--history-max` to 30+ so there's rollback room when something lands wrong on a Friday.** Setting it to 0 (unbounded) eventually fills etcd; some upper bound is always healthier than none.
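To see how much history each release actually holds, count revision Secrets per release. The jq grouping below runs against sample JSON standing in for real `kubectl get secret -A -l owner=helm -o json` output; swap the sample for the live command:

```bash
# Live form:
#   kubectl get secret -A -l owner=helm -o json | jq -r '<program below>'

sample='{"items":[
  {"metadata":{"labels":{"name":"monitoring-stack"}}},
  {"metadata":{"labels":{"name":"monitoring-stack"}}},
  {"metadata":{"labels":{"name":"ingress"}}}]}'

echo "$sample" | jq -r '
  .items
  | group_by(.metadata.labels.name)
  | map("\(.[0].metadata.labels.name)  \(length) revisions")
  | .[]'
# one line per release; group_by sorts alphabetically
```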

## Operator-managed config: silent reject

When the values DO make it into the release but contain a schema bug — a separate failure mode from the no-op upgrade — operator-managed configs surface it in the operator logs, not the workload's logs. The prometheus-operator pattern:

```
# In the prometheus-operator pod logs (NOT in alertmanager logs):
level=error ts=... msg="provisioning alertmanager configuration failed" \
  err="alertmanagerconfig validation failed: \
       field title not found in type config.plain"
```

The signature is repeated every reconcile loop (~30s). The managed workload (here, the alertmanager Pod) looks healthy because the operator stops syncing rather than pushing a known-broken config. This is the common shape across operator-driven projects: the workload appears fine, the operator pod is the one screaming. When debugging "Helm reports success but the cluster doesn't reflect the change," check the operator pod's logs as well as the workload's.

## Debugging checklist

When a Helm upgrade does not produce the expected behavior, run through these in order:

1. **Compare the deployed values to the intended values.** `diff <(helm get values R -n NS -o yaml) values.yaml` — empty output means a no-op upgrade or the file already matches reality.
2. **Compare consecutive revisions.** `diff <(helm get values R -n NS --revision N-1) <(helm get values R -n NS --revision N)` — empty diff confirms a no-op.
3. **Check the rendered manifest.** `helm get manifest R -n NS --revision N` shows the actual YAML applied. Useful when values changed but the resulting manifest didn't (template logic ate the change).
4. **Inspect operator logs if the chart deploys CRDs.** Operators silently reject invalid configs and only log to their own pod. Workload logs may look clean.
5. **Confirm the chart version.** `helm history` shows the chart version per revision; an unintended chart bump can change defaults that override values.
6. **Snapshot before retrying.** Even when debugging, snapshot the current values + manifest before another upgrade attempt. The next attempt may break something the snapshot can restore.

## When `--reuse-values` is the right call

It has one legitimate use: surgical key changes where everything else should stay byte-identical to the previous release.

```bash
# Image bump only — values file may have drifted, intentionally not consulted
helm upgrade R C --reuse-values --set image.tag=v1.2.4
```

This pattern is safe when the operator explicitly does not want the on-disk values file consulted. It is wrong any time the values file is the intended source of truth — which is most of the time. **If you'd be surprised that `-f` is ignored, you wanted `--reset-then-reuse-values -f` or plain `-f`, not `--reuse-values -f`.**

