PostgreSQL Disaster Recovery


A DR plan for PostgreSQL has three layers: streaming replication for fast failover, WAL archiving for point-in-time recovery, and a backup tool like pgBackRest for managing retention. Each layer covers a different failure mode – replication for server crashes, WAL archiving for data corruption that replicates, full backups for when everything goes wrong.

Streaming Replication for DR

Synchronous vs Asynchronous – The Core Tradeoff

Asynchronous replication is the default. The primary streams WAL to the standby but does not wait for confirmation before committing. Commits on the primary stay fast, but the standby can be seconds behind. If the primary dies, any transactions that committed on the primary but had not yet reached the standby are lost.
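One way to see this gap on a running primary is the pg_stat_replication view, which reports each standby's sync_state and WAL positions. The following is a minimal sketch, assuming psql can connect locally as the postgres role; the function names and the 16 MiB threshold are illustrative choices, not part of any standard tooling.

```shell
#!/usr/bin/env bash
# Sketch: report per-standby replay lag in bytes, measured on the primary.
# Assumptions: psql connects as "postgres"; the threshold is arbitrary.

# One row per standby: name|sync_state|lag_bytes
LAG_QUERY="SELECT application_name, sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)
FROM pg_stat_replication;"

# Classify a lag value (bytes) against a threshold (bytes).
lag_status() {
  local lag_bytes=$1 threshold=$2
  if [ "$lag_bytes" -gt "$threshold" ]; then
    echo "LAGGING"
  else
    echo "OK"
  fi
}

# Run on the primary; prints one status line per connected standby.
check_standbys() {
  local threshold=${1:-16777216}   # 16 MiB, hypothetical default
  psql -U postgres -X -At -F '|' -c "$LAG_QUERY" |
    while IFS='|' read -r name sync lag; do
      echo "$name ($sync): $(lag_status "${lag:-0}" "$threshold")"
    done
}
```

Calling `check_standbys` on the primary prints a line per standby. A standby that shows `async` with a non-zero lag is holding exactly the window of transactions that would be lost if the primary failed at that moment.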

PostgreSQL Backup and Recovery


A backup you have never tested restoring is not a backup. This guide covers the main backup tools, when to use each, point-in-time recovery, and automation.

Logical Backups: pg_dump and pg_dumpall

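pg_dump exports a single database as a plain SQL script or as a compressed archive (custom or directory format). It takes a consistent snapshot without blocking reads or writes, although the locks it holds conflict with schema changes such as ALTER TABLE or DROP TABLE while the dump runs.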

# Custom format (compressed, supports parallel restore)
pg_dump -U postgres -Fc -d myapp -f myapp.dump

# Directory format (parallel dump)
pg_dump -U postgres -Fd -j 4 -d myapp -f myapp_dir/
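Because an untested dump proves nothing, it helps to restore the custom-format file into a throwaway database. The following is a minimal sketch, assuming the postgres role may create databases; the scratch database name `myapp_restore_test` and the `run`/`DRY_RUN` helper are hypothetical conveniences, not pg_dump features.

```shell
#!/usr/bin/env bash
# Sketch: restore a custom-format dump into a throwaway database.
# Assumptions: "postgres" can create databases; names are illustrative.

# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

restore_test() {
  local dump_file=$1 scratch_db=${2:-myapp_restore_test}
  # Recreate the scratch database so each test starts empty.
  run dropdb -U postgres --if-exists "$scratch_db" || return 1
  run createdb -U postgres "$scratch_db" || return 1
  # -j 4 restores with four parallel jobs (custom or directory format only).
  run pg_restore -U postgres -j 4 -d "$scratch_db" "$dump_file" || return 1
}
```

Running `DRY_RUN=1 restore_test myapp.dump` prints the three commands without touching the server, which is a cheap way to review them before running the restore for real.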

pg_dumpall exports every database plus cluster-wide objects such as roles and tablespaces. In practice, dump the roles separately and dump each database on its own for flexibility: