Helm Chart Development: Templates, Helpers, and Testing

Helm Chart Development#

Writing your own Helm charts turns static YAML into reusable, configurable packages. The learning curve is in Go’s template syntax and Helm’s conventions, but once you internalize the patterns, chart development is fast.

Chart Structure#

Create a new chart scaffold:

helm create my-app

This generates:

my-app/
  Chart.yaml              # chart metadata (name, version, dependencies)
  values.yaml             # default configuration values
  charts/                 # dependency charts (populated by helm dependency update)
  templates/              # Kubernetes manifest templates
    deployment.yaml
    service.yaml
    ingress.yaml
    serviceaccount.yaml
    hpa.yaml
    NOTES.txt             # post-install instructions (printed after helm install)
    _helpers.tpl          # named template definitions
    tests/
      test-connection.yaml # helm test pod

Chart.yaml#

The Chart.yaml defines your chart’s identity and dependencies:
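
A minimal sketch of what that looks like (the dependency, repository, and version numbers below are illustrative, not required):

apiVersion: v2
name: my-app
description: A Helm chart for my-app
type: application
version: 0.1.0            # chart version; bump whenever the chart changes
appVersion: "1.0.0"       # version of the application being packaged
dependencies:             # example dependency; adjust to your own needs
  - name: redis
    version: "17.x.x"
    repository: https://charts.bitnami.com/bitnami
    condition: redis.enabled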

Helm Values and Overrides: Precedence, Inspection, and Environment Patterns

Helm Values and Overrides#

Every Helm chart has a values.yaml file that defines defaults. When you install or upgrade a release, you override those defaults through values files (-f) and inline flags (--set). Getting the precedence wrong leads to silent misconfigurations where you think you set something but the chart used a different value.
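
As a quick illustration (release name, chart path, file names, and keys here are hypothetical): the chart's values.yaml is the lowest layer, -f files override it in the order given, and --set overrides everything passed through files:

helm upgrade my-release ./my-app \
  -f base.yaml \
  -f prod.yaml \
  --set image.tag=1.2.3
# effective precedence: chart values.yaml < base.yaml < prod.yaml < --set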

Inspecting Chart Defaults#

Before overriding anything, look at what the chart provides. helm show values dumps the full default values.yaml for any chart:
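
For example (the chart names and version are placeholders for whatever charts you have added):

helm show values ingress-nginx/ingress-nginx
helm show values bitnami/redis --version 17.11.3 > redis-defaults.yaml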

Incident Management Lifecycle

Incident Lifecycle Overview#

An incident is an unplanned disruption to a service requiring coordinated response. The lifecycle has six phases: detection, triage, communication, mitigation, resolution, and review. Each has defined actions, owners, and exit criteria.

Phase 1: Detection#

Incidents are detected through three channels. Automated monitoring is best – alerts fire on SLO violations or error thresholds before users notice. Internal reports come from other teams noticing issues with dependencies. Customer reports are the worst case – if users detect your incidents first, your observability has gaps.

Infrastructure Capacity Planning: Measurement, Projection, and Scaling

What Capacity Planning Solves#

Running out of capacity during a traffic spike causes outages. Over-provisioning wastes money continuously. Capacity planning is the process of measuring what you use now, projecting what you will need, and ensuring resources are available before demand arrives. Without it, you are either constantly firefighting resource exhaustion or explaining to finance why your cloud bill doubled.

Capacity planning is not a one-time exercise. It is a recurring process – monthly for fast-growing services, quarterly for stable ones.

Infrastructure Knowledge Scoping for Agents

Infrastructure Knowledge Scoping for Agents#

An agent working on infrastructure tasks needs to operate at the right level of specificity. Giving generic Kubernetes advice when the user runs EKS with IRSA is unhelpful – the agent misses the IAM integration that will make or break the deployment. Giving EKS-specific advice when the user runs minikube on a laptop is equally unhelpful – the agent references services and configurations that do not exist.

Ingress Controllers and Routing Patterns

Ingress Controllers and Routing Patterns#

An Ingress resource defines HTTP routing rules – which hostnames and paths map to which backend Services. But an Ingress resource does nothing on its own. You need an Ingress controller running in the cluster to watch for Ingress resources and configure the actual reverse proxy.
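
As a rough sketch (hostname, Service name, and ingress class are placeholders), a rule that maps a hostname and path to a backend Service looks like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  ingressClassName: nginx          # must match the controller installed in the cluster
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app       # existing Service in the same namespace
                port:
                  number: 80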

Ingress Controllers#

The two most common controllers are nginx-ingress and Traefik.

nginx-ingress (ingress-nginx):

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx --namespace ingress-nginx --create-namespace

Note: there are two different nginx ingress projects: kubernetes/ingress-nginx (community) and nginxinc/kubernetes-ingress (NGINX Inc). The community version is far more common. Make sure you install from https://kubernetes.github.io/ingress-nginx, not the NGINX Inc chart.

Istio Security: mTLS, Authorization Policies, and Egress Control

Istio Security#

Istio provides three security capabilities that are difficult to implement without a service mesh: automatic mutual TLS between services, fine-grained authorization policies, and egress traffic control. These features work at the infrastructure layer, meaning applications do not need any code changes.

Automatic mTLS with PeerAuthentication#

Istio’s sidecar proxies can automatically encrypt all pod-to-pod traffic with mutual TLS. The key resource is PeerAuthentication. There are three modes:

  • PERMISSIVE – Accepts both plaintext and mTLS traffic. This is the default and exists for migration. Do not leave it in production.
  • STRICT – Requires mTLS for all traffic. Plaintext connections are rejected.
  • DISABLE – Turns off mTLS entirely.

Enable strict mTLS across the entire mesh:
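
A minimal manifest for that (assuming istio-system is the mesh's root namespace, which is the default):

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system          # applying it in the root namespace makes it mesh-wide
spec:
  mtls:
    mode: STRICT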

Istio Service Mesh: Traffic Management, Security, and Observability

Istio Service Mesh#

Istio adds a proxy sidecar (Envoy) to every pod in the mesh. These proxies handle traffic routing, mutual TLS, retries, circuit breaking, and telemetry without changing application code. The control plane (istiod) pushes configuration to all sidecars.
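
For example, automatic sidecar injection is typically enabled per namespace with a label (the namespace name is a placeholder):

kubectl label namespace my-namespace istio-injection=enabled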

When You Actually Need a Service Mesh#

You need Istio when you have multiple services requiring mTLS, fine-grained traffic control (canary releases, fault injection), or consistent observability across service-to-service communication. If you have fewer than five services, standard Kubernetes Services and NetworkPolicies are sufficient. A service mesh adds operational complexity – more moving parts, higher memory usage per sidecar, and a learning curve for proxy-level debugging.

Jenkins Debugging: Diagnosing Stuck Builds, Pipeline Failures, Performance Issues, and Kubernetes Agent Problems

Jenkins Debugging#

Jenkins failures fall into a few categories: builds stuck waiting, cryptic pipeline errors, performance degradation, and Kubernetes agent pods that refuse to launch.

Builds Stuck in Queue#

When a build sits in the queue and never starts, check the queue tooltip in the UI – it tells you why. Common causes:

No agents with matching labels. The pipeline requests agent { label 'docker-arm64' } but no agent has that label. Check Manage Jenkins > Nodes to see available labels.
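
The offending declaration usually looks like the sketch below (label and build step are illustrative); the fix is either to add the label to an agent or change the pipeline to request one that exists:

pipeline {
  agent { label 'docker-arm64' }    // stays queued until some agent advertises this label
  stages {
    stage('Build') {
      steps {
        sh 'docker build -t my-app .'   // placeholder build step
      }
    }
  }
}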

Jenkins Kubernetes Integration: Dynamic Pod Agents, Pod Templates, and In-Cluster Builds

Jenkins Kubernetes Integration#

The kubernetes plugin gives Jenkins elastic build capacity. Each build spins up a pod, runs inside it, and the pod is deleted when the build finishes. No idle agents, no capacity planning, no snowflake build servers.

The Kubernetes Plugin#

The plugin creates agent pods on demand. When a pipeline requests an agent, a pod is created from a template, its JNLP container connects back to Jenkins, the build runs, and the pod is deleted.
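
A minimal scripted-pipeline sketch of that flow (container image and commands are placeholders), using the plugin's podTemplate step:

podTemplate(containers: [
  containerTemplate(name: 'maven', image: 'maven:3.9-eclipse-temurin-17',
                    command: 'sleep', args: '99d')   // keep the container alive for build steps
]) {
  node(POD_LABEL) {                  // POD_LABEL is generated by the plugin for this template
    container('maven') {
      sh 'mvn -B package'            // placeholder build command
    }
  }
}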