Running Temporal Server on Minikube

February 22, 2026

Temporal-Deployment, Kubernetes-Helm, Minikube-Management

Temporal, Minikube, Kubernetes, Helm, Setup, Postgresql

Temporal, Minikube, Helm, Kubectl, Docker

Running Temporal Server on Minikube

This guide deploys Temporal Server on a local Minikube cluster with PostgreSQL persistence. By the end, you will have the Temporal frontend, Web UI, and CLI all working against a real Kubernetes deployment.

If you need background on what Temporal is, start with Introduction to Temporal.

Prerequisites

Tool	Minimum Version	Purpose
minikube	1.32+	Local Kubernetes cluster
kubectl	1.28+	Kubernetes CLI
helm	3.14+	Package manager for Kubernetes
temporal	1.0+	Temporal CLI
docker	24+	Container runtime (minikube driver)

Your machine needs at least 4 CPU cores and 8 GB RAM available to Docker. For minikube driver details, see Minikube Setup and Drivers and Minikube Docker Driver.
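The version minimums in the table can be checked with a small script before starting the cluster. The following is a minimal sketch, not part of the original guide: the helper names `version_ge` and `check_min` are hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS versions do).

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on `sort -V` (version sort) to order the two strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether a tool meets its minimum version.
# $1 = tool name, $2 = reported version (e.g. "v3.14.2+gabc123"), $3 = minimum.
check_min() {
  local tool="$1" actual min="$3"
  actual="${2#v}"          # drop a leading "v"
  actual="${actual%%+*}"   # drop build metadata such as "+gabc123"
  if version_ge "$actual" "$min"; then
    echo "ok: $tool $actual (>= $min)"
  else
    echo "too old: $tool $actual (< $min)"
    return 1
  fi
}

# Example calls (commented out so the sketch has no side effects):
#   check_min minikube "$(minikube version --short)" 1.32
#   check_min helm     "$(helm version --short)"     3.14
```

Once the checks pass, a cluster sized to the stated requirements could be started with something like `minikube start --driver=docker --cpus=4 --memory=8192`; the exact values are an assumption based on the 4 CPU / 8 GB figure above.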

Temporal High Availability: Multi-Component Cluster on Kubernetes

February 22, 2026

Workflow-Orchestration

Intermediate

Temporal-Ha-Deployment, Production-Temporal, Kubernetes-Resource-Management

Temporal, High-Availability, Kubernetes, Helm, Postgresql, Elasticsearch, Production

Temporal, Helm, Kubectl, Postgresql, Elasticsearch

Temporal High Availability

A single-replica Temporal deployment works for development, but it has no redundancy: when the only Frontend, History, or Matching pod goes down, the workflow engine goes offline with it. This guide configures a multi-replica cluster with proper resource allocation, Elasticsearch visibility, and health monitoring.

For the single-replica setup this builds on, see Running Temporal Server on Minikube.

Why HA Matters

Component	What Breaks When It Goes Down
Frontend	No client can start, signal, query, or cancel workflows. Workers cannot poll.
History	Running workflows stall. No state transitions. Timers do not fire.
Matching	Tasks queue up but never dispatch. Workflows appear frozen.
Worker	Internal system workflows stop (archival, replication). Application workflows unaffected.

With multiple replicas, losing a pod triggers a brief rebalance (seconds), not an outage.

PostgreSQL Disaster Recovery

February 22, 2026

Databases

Intermediate, Advanced

Disaster-Recovery-Planning, Postgres-Replication, Backup-Management, Failover-Execution

Postgresql, Disaster-Recovery, Streaming-Replication, Wal-Archiving, Pitr, Pgbackrest, Patroni, Failover, S3, Backup

Postgresql, Pg_basebackup, Pgbackrest, Patroni, Pg_ctl, Psql, Aws-Cli

PostgreSQL Disaster Recovery

A DR plan for PostgreSQL has three layers: streaming replication for fast failover, WAL archiving for point-in-time recovery, and a backup tool like pgBackRest for managing retention. Each layer covers a different failure mode – replication for server crashes, WAL archiving for data corruption that replicates, full backups for when everything goes wrong.

Streaming Replication for DR

Synchronous vs Asynchronous – The Core Tradeoff

Asynchronous replication is the default. The primary streams WAL to the standby but does not wait for confirmation before committing. This keeps the primary fast, but the standby can be seconds behind. If the primary dies, any transactions that were committed on the primary but had not yet reached the standby are lost.
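How far behind the standby is can be expressed in bytes of WAL. The sketch below is an illustration, not part of the original article: it converts two LSNs (for example, the output of `pg_current_wal_lsn()` on the primary and `pg_last_wal_replay_lsn()` on the standby in PostgreSQL 10+) into byte offsets and subtracts them. The function names are hypothetical; PostgreSQL's own `pg_wal_lsn_diff()` performs the same calculation server-side.

```shell
# Hypothetical helper: convert an LSN such as "16/B374D848" to a byte offset.
# An LSN is two hex numbers: the high 32 bits, a slash, then the low 32 bits.
lsn_to_bytes() {
  local hi="${1%%/*}" lo="${1##*/}"
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Hypothetical helper: bytes of WAL the standby still has to replay.
# $1 = primary LSN, $2 = standby replay LSN.
wal_lag_bytes() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Example with made-up LSNs: the standby is 0x1000 (4096) bytes behind.
wal_lag_bytes "0/3001000" "0/3000000"
```

Anything in that gap at the moment the primary fails is the data an asynchronous setup can lose.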

Database Cross-Region Replication Patterns

February 22, 2026

Databases

Intermediate, Advanced

Replication-Design, Cross-Region-Architecture, Monitoring-Setup, Failover-Management

Replication, Cross-Region, Disaster-Recovery, Postgresql, Mysql, Monitoring, Prometheus, Replication-Lag, Logical-Replication, Group-Replication

Postgresql, Mysql, Prometheus, Grafana, Psql, Mysqld

Database Cross-Region Replication Patterns

Cross-region replication exists because regions fail. AWS us-east-1 has had multiple multi-hour outages. If your database runs in a single region, a regional failure takes your application down entirely. Cross-region replication gives you a copy of the data somewhere else so you can recover.

The fundamental problem is physics. Light through fiber between US East and US West takes about 30ms one way. Every replication strategy is a different answer to the question: do you wait for the remote region to confirm it has the data before telling the client the write succeeded?
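The cost of waiting can be worked out from the one-way figure. The sketch below is back-of-the-envelope arithmetic under stated assumptions: it takes the roughly 30ms one-way latency quoted above, ignores disk flush and processing time on both sides, and uses hypothetical function names.

```shell
# Hypothetical helper: minimum latency added to a synchronous commit,
# i.e. one network round trip (WAL out, acknowledgement back).
# $1 = one-way latency in milliseconds.
sync_commit_floor_ms() {
  echo $(( $1 * 2 ))
}

# Hypothetical helper: upper bound on back-to-back commits per second
# on a single connection when every commit waits for that round trip.
max_serial_commits_per_sec() {
  echo $(( 1000 / ($1 * 2) ))
}

echo "sync commit floor: $(sync_commit_floor_ms 30) ms"
echo "serial commits per connection: about $(max_serial_commits_per_sec 30)/s"
```

With a 30ms one-way path, each synchronous commit waits at least 60ms, which caps a single connection at about 16 sequential commits per second. Concurrent connections raise total throughput, but not the per-commit latency.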

Backup Verification and Restore Testing: Proving Your Backups Actually Work

February 22, 2026

Infrastructure

Intermediate, Advanced

Backup-Validation, Restore-Testing, Backup-Monitoring, Database-Recovery

Backup, Restore-Testing, Backup-Verification, Postgresql, Mysql, Etcd, Monitoring, Data-Integrity, Automation

Pg_restore, Pg_dump, Mysql, Mysqldump, Etcdctl, Aws-Cli, Prometheus, Cron, Bash

Backup Verification and Restore Testing

An untested backup is not a backup. It is a file that might contain your data and might be restorable. Teams discover the difference during an actual incident, when the database backup turns out to be corrupted, the restore takes 6 hours instead of the expected 30 minutes, or the backup process silently stopped running three weeks ago.

Backup verification is the practice of regularly proving that your backups contain valid data and can be restored within your required RTO.
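Two of the failure modes above (a backup job that silently stopped, and a restore that overruns its window) can be caught with very little scripting. The following is a minimal sketch with hypothetical function names; the file paths, the 1800-second RTO, and the `restore_test` scratch database in the usage comments are assumptions for illustration only.

```shell
# Hypothetical helper: succeeds if the backup file exists and was
# modified less than $2 minutes ago (catches a stalled backup job).
backup_is_fresh() {
  [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}

# Hypothetical helper: run a restore command, time it, and fail if it
# errors or takes longer than the RTO.
# $1 = RTO in seconds, remaining arguments = the restore command.
timed_restore() {
  local rto="$1" start end
  shift
  start=$(date +%s)
  "$@" || return 1
  end=$(date +%s)
  echo "restore took $(( end - start ))s (RTO ${rto}s)"
  [ $(( end - start )) -le "$rto" ]
}

# Example usage (commented out; needs a real dump and a scratch database):
#   backup_is_fresh /backups/myapp.dump 1440 || echo "backup older than 24h"
#   timed_restore 1800 pg_restore -d restore_test /backups/myapp.dump
```

Running both checks on a schedule and alerting on a non-zero exit status turns "the backup probably works" into a measured result.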

Database High Availability Patterns

February 22, 2026

Databases

Intermediate, Advanced

Ha-Architecture, Failover-Management, Replication-Design, Disaster-Recovery-Planning

High-Availability, Replication, Patroni, Group-Replication, Failover, Rpo, Rto, Postgresql, Mysql, Multi-Master

Patroni, Etcd, Haproxy, Mysql-Router, Pg_autoctl, Keepalived

Database High Availability Patterns

Every database HA decision starts with two numbers: RPO (Recovery Point Objective – how much data you can afford to lose) and RTO (Recovery Time Objective – how long the database can be unavailable). These numbers dictate the pattern, and each pattern carries specific operational tradeoffs.

Core Concepts

RPO = 0 means zero data loss. Every committed transaction must survive a failure. This requires synchronous replication, which adds latency to every write.

Database Performance Investigation Runbook

February 22, 2026

Databases

Intermediate, Advanced

Performance-Diagnosis, Query-Optimization, Database-Troubleshooting, Capacity-Analysis

Performance, Postgresql, Mysql, Slow-Queries, Execution-Plans, Locking, Connection-Pool, Io, Buffer-Cache, Runbook

Psql, Mysql, Pg_stat_statements, Performance_schema, Pt-Query-Digest, Iostat, Vmstat

Database Performance Investigation Runbook

When a database is slow, resist the urge to immediately tune configuration parameters. Follow this sequence: identify what is slow, understand why, then fix the specific bottleneck. Most performance problems are caused by missing indexes or a single bad query, not global configuration issues.

Phase 1 – Identify Slow Queries

The first step is always finding which queries are consuming the most time.

PostgreSQL: pg_stat_statements

Enable the extension if not already loaded:

Devcontainer Sandbox Templates: Zero-Cost Validation Environments for Infrastructure Development

February 22, 2026

Developer-Workflows

Beginner, Intermediate

Devcontainer-Configuration, Cloud-Development-Environments, Infrastructure-Development

Devcontainers, Codespaces, Gitpod, Kubernetes, Terraform, Kind, Helm, Postgresql, Redis, Zero-Cost, Sandbox

Devcontainers, Github-Codespaces, Gitpod, Kind, Kubectl, Helm, Kustomize, Terraform, Tflint, Checkov

Devcontainer Sandbox Templates

Devcontainers provide disposable, reproducible development environments that run in a container. You define the tools, extensions, and configuration in a .devcontainer/ directory, and any compatible host – GitHub Codespaces, Gitpod, VS Code with Docker, or the devcontainer CLI – builds and launches the environment from that definition.

For infrastructure validation, devcontainers solve a specific problem: giving every developer and every CI run the exact same set of tools at the exact same versions, without requiring them to install anything on their local machine. A Kubernetes devcontainer includes kind, kubectl, helm, and kustomize at pinned versions. A Terraform devcontainer includes terraform, tflint, checkov, and cloud CLIs. The environment is ready to use the moment it starts.

Planning and Executing Database Migrations: Schema Changes, Data Migrations, and Zero-Downtime Patterns

February 22, 2026

Databases

Intermediate

Migration-Planning, Schema-Management, Zero-Downtime-Operations, Rollback-Execution

Migrations, Schema-Changes, Zero-Downtime, Expand-Contract, Postgresql, Mysql, Rollback, Data-Migration

Postgresql, Mysql, Psql, Pg_dump, Pgloader, Flyway, Liquibase

Planning and Executing Database Migrations

Database migrations are the highest-risk routine operations most teams perform. A bad migration can cause downtime, data loss, or application errors that cascade across every service that touches the affected tables. This operational sequence walks through the assessment, planning, execution, and rollback of database migrations from simple column additions to full platform changes.

Phase 1 – Assessment

Step 1: Classify the Migration

Every migration falls into one of three categories, each with a different risk profile:

PostgreSQL Backup and Recovery

February 22, 2026

Databases

Intermediate

Postgres-Backup, Postgres-Recovery, Disaster-Recovery

Postgresql, Backup, Recovery, Pitr, Wal, Pgbackrest, Pg_dump

Postgresql, Pg_dump, Pg_dumpall, Pg_basebackup, Pgbackrest, Cron

PostgreSQL Backup and Recovery

A backup you have never tested restoring is not a backup. This guide covers the main backup tools, when to use each, point-in-time recovery, and automation.

Logical Backups: pg_dump and pg_dumpall

pg_dump exports a single database as SQL or a compressed binary format. It takes a consistent snapshot without blocking writes.

# Custom format (compressed, supports parallel restore)
pg_dump -U postgres -Fc -d myapp -f myapp.dump

# Directory format (parallel dump)
pg_dump -U postgres -Fd -j 4 -d myapp -f myapp_dir/

pg_dumpall exports every database plus cluster-wide objects. In practice, dump roles separately and per-database for flexibility:
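One way to sketch that split, as an illustration rather than the original article's script: dump roles once with `pg_dumpall --roles-only`, then dump each database in custom format with `pg_dump -Fc`. The function name `plan_backup`, the output directory, and the database names are hypothetical, and the function prints the commands instead of running them so the plan can be reviewed first.

```shell
# Hypothetical helper: print (not run) the commands for a roles dump
# plus one custom-format dump per database.
# $1 = output directory, remaining arguments = database names.
# Note: names containing spaces or shell metacharacters would need quoting.
plan_backup() {
  local outdir="$1" db
  shift
  echo "pg_dumpall -U postgres --roles-only -f $outdir/roles.sql"
  for db in "$@"; do
    echo "pg_dump -U postgres -Fc -d $db -f $outdir/$db.dump"
  done
}

# Example: review the plan for two databases.
plan_backup /var/backups/pg myapp analytics
```

Piping the reviewed output to `sh` would execute it. Keeping roles in their own file means a single database can be restored onto another cluster without replaying every other database's data.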