Gateway API: The Modern Replacement for Ingress in Kubernetes

The Ingress resource has been the standard way to expose HTTP services in Kubernetes since the early days. It works, but it has fundamental limitations: it only supports HTTP(S), its routing capabilities are minimal (host and path matching only), and every controller extends it through non-standard annotations that are not portable. Gateway API is the official successor – a set of purpose-built resources that provide richer routing, protocol support beyond HTTP, and a role-oriented design that cleanly separates infrastructure concerns from application concerns.
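
A minimal sketch of that role split, assuming the Gateway API CRDs are installed and your controller provides a GatewayClass (here called example-lb); all names and namespaces are illustrative:

cat <<'EOF' | kubectl apply -f -
# Owned by the platform team: one shared entry point per environment.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: infra
spec:
  gatewayClassName: example-lb      # assumed GatewayClass from your controller
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: All                   # application namespaces may attach routes
---
# Owned by the application team: routing rules live next to the app.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: web-route
  namespace: app
spec:
  parentRefs:
  - name: shared-gateway
    namespace: infra
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    backendRefs:
    - name: web-backend             # assumed Service in the app namespace
      port: 8080
EOF

The Gateway belongs to the cluster operator; application teams expose services by creating HTTPRoutes that reference it – the role separation Ingress never had.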

GitOps for Kubernetes: Patterns, Tools, and Workflow Design

GitOps is a deployment model where git is the source of truth for your cluster’s desired state. A controller running inside the cluster watches a git repository and continuously reconciles the live state to match what is declared in git. When you want to change something, you commit to git. The controller detects the change and applies it.

This replaces kubectl apply from laptops and CI pipelines with a pull-based model where the cluster pulls its own configuration. The benefits are an audit trail in git history, easy rollback via git revert, and drift detection when someone makes manual changes.
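
As a concrete sketch of the pull-based loop, assuming Flux is the controller (the repository URL, path, and intervals are illustrative):

cat <<'EOF' | kubectl apply -f -
# Tells Flux which repository to watch and how often to poll it.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: platform-config
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example/platform-config   # assumed repository
  ref:
    branch: main
---
# Tells Flux which path in that repository to apply and keep reconciled.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: platform-config
  path: ./clusters/prod
  prune: true        # delete cluster objects whose manifests were removed from git
EOF

Argo CD expresses the same loop with an Application resource; the mechanics – watch git, diff, apply – are the same.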

Helm Release Naming Gotchas: How Resource Names Actually Work

Helm charts derive Kubernetes resource names from the release name, but every chart does it differently. If you assume a consistent pattern, you will get bitten by DNS resolution failures, broken connection strings, and mysterious “service not found” errors.

Bitnami PostgreSQL: Names Are Not What You Expect

The Bitnami PostgreSQL chart uses the standard fullname helper: when the release name already contains the chart name ("postgresql"), resources are named with the release name directly, not {release-name}-postgresql. This catches nearly everyone.

# You deploy like this:
helm upgrade --install dt-postgresql bitnami/postgresql \
  --namespace dream-team \
  --set auth.database=mattermost \
  --set auth.username=mmuser

# You expect these resource names:
#   Pod:     dt-postgresql-postgresql-0   <-- WRONG
#   Service: dt-postgresql-postgresql     <-- WRONG

# Actual names:
#   Pod:     dt-postgresql-0
#   Service: dt-postgresql

This means your application connection string should reference dt-postgresql, not dt-postgresql-postgresql. If you chose release name postgresql, your service is just postgresql – which might collide with other things in your namespace.
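
Rather than guessing, list what the release actually created. Bitnami charts label everything with the release name via app.kubernetes.io/instance, so this should show the real pod and service names:

kubectl get pods,svc -n dream-team -l app.kubernetes.io/instance=dt-postgresql

# Or render the release's manifests and read the names directly:
helm get manifest dt-postgresql -n dream-team | grep -E '^\s*name:'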

Init Containers and Sidecar Patterns: Sequential Setup and Co-located Services

A pod can contain more than one container. Init containers run sequentially before the main application starts. Sidecars run alongside the main container for the lifetime of the pod. Together, they enable patterns where setup logic and cross-cutting concerns are separated from application code.

Init Containers

Init containers are defined in spec.initContainers[] and run in order. Each must exit 0 before the next one starts. If an init container fails, the kubelet restarts it until it succeeds – unless the pod's restartPolicy is Never, in which case the pod is marked failed. The main application containers do not start until every init container has completed successfully.
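
A minimal sketch, assuming the application waits on a Service named db; the images and probe command are illustrative:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
  - name: wait-for-db                 # runs to completion before the app starts
    image: busybox:1.36
    command: ['sh', '-c', 'until nc -z db 5432; do echo waiting for db; sleep 2; done']
  containers:
  - name: app
    image: nginx:1.27                 # stand-in for the real application image
    ports:
    - containerPort: 80
EOF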

Jobs and CronJobs: Batch Workloads, Retry Logic, and Scheduling

Deployments manage long-running processes. Jobs manage work that finishes. A Job creates one or more pods, runs them to completion, and tracks whether they succeeded. CronJobs run Jobs on a schedule. Both are essential for database migrations, report generation, data pipelines, and any workload that is not a continuously running server.

Job Basics

A Job runs a pod until it exits successfully (exit code 0). The simplest case is a single pod that runs once:
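
# A minimal, illustrative Job: the name, image, and command are placeholders
# for whatever batch work you actually run.
cat <<'EOF' | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 3            # retry the pod up to 3 times before marking the Job failed
  template:
    spec:
      restartPolicy: Never   # Job pods must use Never or OnFailure
      containers:
      - name: migrate
        image: example/migrations:1.0        # assumed image
        command: ['./run-migrations.sh']     # assumed entrypoint
EOF

# Block until it succeeds (or time out):
kubectl wait --for=condition=complete job/db-migrate --timeout=300s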

kubectl debug and Ephemeral Containers: Non-Invasive Production Debugging

Production containers should be minimal. Distroless images, scratch-based Go binaries, and hardened base images strip out shells, package managers, and debugging tools. This is good for security and image size, but it means kubectl exec gives you nothing to work with. Ephemeral containers solve this problem.

The Problem

A typical distroless container has no shell:

$ kubectl exec -it payments-api-7f8b9c6d4-x2k9m -- /bin/sh
OCI runtime exec failed: exec failed: unable to start container process:
exec: "/bin/sh": stat /bin/sh: no such file or directory

You cannot install tools, you cannot inspect files, and you cannot run any diagnostic commands. The application is returning 500 errors and you have nothing but logs.
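
kubectl debug injects a throwaway container into the running pod without restarting it. A sketch, assuming the application container inside the pod is named payments-api; any debug image that carries a shell and your preferred tools will do:

kubectl debug -it payments-api-7f8b9c6d4-x2k9m \
  --image=busybox:1.36 \
  --target=payments-api      # share the app container's process namespace

# From the debug shell you can inspect the shared processes, filesystem
# (via /proc/<pid>/root), and network of the distroless container.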

Kubernetes Cost Optimization: Rightsizing, Resource Efficiency, and Waste Reduction

Most Kubernetes clusters run at 15-30% actual CPU utilization but are billed for the full provisioned capacity. The gap between what you reserve and what you use is pure waste. This article covers the practical workflow for finding and eliminating that waste.

The Cost Problem: Requests vs Actual Usage

Kubernetes resource requests are the foundation of cost. When a pod requests 4 CPUs, the scheduler reserves 4 CPUs on a node regardless of whether the pod ever uses more than 0.1 CPU. The node is sized (and billed) based on what is reserved, not what is consumed.
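
To see the gap on a live cluster, compare requested resources with measured usage. The namespace here is illustrative, and kubectl top requires metrics-server to be installed:

# Actual usage right now:
kubectl top pods -n production --sort-by=cpu

# What each pod reserves:
kubectl get pods -n production \
  -o custom-columns='NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'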

Kubernetes Disaster Recovery: Runbooks for Common Incidents

These runbooks cover the incidents you will encounter in production Kubernetes environments. Each follows the same structure: detection, diagnosis, recovery, and prevention. Print these out, bookmark them, put them in your on-call wiki. When the alert fires at 2 AM, you want a checklist, not a tutorial.

Incident Response Framework

Every incident follows the same cycle:

  1. Detect – monitoring alert, user report, or kubectl showing unhealthy state
  2. Assess – determine scope and severity. Is it one pod, one node, or the entire cluster?
  3. Contain – stop the bleeding. Prevent the issue from spreading
  4. Recover – restore normal operation
  5. Post-mortem – document what happened, why, and how to prevent it

Runbook 1: Node Goes NotReady

Detection: Node condition changes to Ready=False or Unknown. Pods on the node are evicted and rescheduled once the default 5-minute not-ready toleration expires (if they are managed by a Deployment or other controller). Monitoring alerts on node status.
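
First diagnosis commands, with an illustrative node name:

kubectl get nodes                               # which node, and for how long
kubectl describe node worker-3                  # Conditions and recent Events explain why
kubectl get pods -A --field-selector spec.nodeName=worker-3 -o wide    # what is stranded there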

Kubernetes Operator Development: Patterns, Frameworks, and Best Practices

Operators are custom controllers that manage custom resources defined by CRDs. They encode operational knowledge – the kind of tasks a human operator would perform – into software that runs inside the cluster. An operator watches for changes to its custom resources and reconciles the actual state to match the desired state, creating, updating, or deleting child resources as needed.
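
For a sense of the moving parts, this is roughly how a new operator is scaffolded with Kubebuilder, one of the common frameworks; the domain, repository, and kind are illustrative:

# Generate the project layout, a Widget CRD, and a controller stub:
kubebuilder init --domain example.com --repo github.com/example/widget-operator
kubebuilder create api --group apps --version v1alpha1 --kind Widget --resource --controller

# Install the CRD into the current cluster and run the controller locally:
make manifests install run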

Operator Maturity Model

The Operator Framework defines five maturity levels:

Level   Capability          Example
1       Basic install       Helm operator deploys the application
2       Seamless upgrades   Operator handles version migrations
3       Full lifecycle      Backup, restore, failure recovery
4       Deep insights       Exposes metrics, fires alerts, generates dashboards
5       Auto-pilot          Auto-scaling, auto-healing, auto-tuning without human input

Most custom operators target Level 2-3. Levels 4-5 are typically reached by mature projects like the Prometheus Operator or Rook/Ceph.

Kubernetes Resource Management: QoS Classes, Eviction, OOM Scoring, and Capacity Planning

Resource management in Kubernetes is the mechanism that decides which pods get scheduled, which pods get killed when the node runs low, and how much CPU and memory each container is actually allowed to use. The surface-level concept of requests and limits is straightforward. The underlying mechanics – QoS classification, CFS CPU quotas, kernel OOM scoring, kubelet eviction thresholds – are where misconfigurations cause production outages.
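
As a starting point, the QoS class a pod was assigned is visible directly on its status; the pod name here is illustrative:

kubectl get pod payments-api-7f8b9c6d4-x2k9m -o jsonpath='{.status.qosClass}{"\n"}'
# Guaranteed: requests == limits for every container
# Burstable:  at least one request or limit is set, but not Guaranteed
# BestEffort: no requests or limits anywhere in the pod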