Database High Availability Patterns

Database High Availability Patterns#

Every database HA decision starts with two numbers: RPO (Recovery Point Objective – how much data you can afford to lose) and RTO (Recovery Time Objective – how long the database can be unavailable). These numbers dictate the pattern, and each pattern carries specific operational tradeoffs.

Core Concepts#

RPO = 0 means zero data loss. Every committed transaction must survive a failure. This requires synchronous replication, which adds latency to every write.

Multi-Cloud vs Single-Cloud Strategy Decisions

Multi-Cloud vs Single-Cloud Strategy#

Multi-cloud is one of the most oversold strategies in infrastructure. Vendors, consultants, and conference speakers promote it as the default approach, but the reality is that most organizations are better served by a single cloud provider used well. This framework helps you determine whether multi-cloud is actually worth the cost for your situation.

The Default Answer Is Single-Cloud#

Start with single-cloud unless you have a specific, concrete reason to go multi-cloud. Here is why.

Velero Backup and Restore: Disaster Recovery for Kubernetes

Velero Backup and Restore#

Velero backs up Kubernetes resources and persistent volume data to object storage. It handles scheduled backups, on-demand snapshots, and restores to the same or a different cluster. It is the standard tool for Kubernetes disaster recovery.

Velero captures two things: Kubernetes API objects (stored as JSON) and persistent volume data (via cloud volume snapshots or file-level backup with Kopia).

Installation#

You need an object storage bucket (S3, GCS, Azure Blob, or MinIO) and write credentials.

Choosing a Kubernetes Backup Strategy: Velero vs Native Snapshots vs Application-Level Backups

Choosing a Kubernetes Backup Strategy#

Kubernetes clusters contain two fundamentally different types of state: cluster state (the Kubernetes objects themselves – Deployments, Services, ConfigMaps, Secrets, CRDs) and application data (the contents of Persistent Volumes). A complete backup strategy must address both. Most backup failures happen because teams back up one but not the other, or because they never test the restore process.

What Needs Backing Up#

Before choosing tools, inventory what your cluster contains: