---
title: "PostgreSQL Debugging"
description: "Step-by-step workflows for diagnosing and fixing common PostgreSQL problems including connection issues, slow queries, lock contention, disk space, OOM kills, and transaction ID wraparound."
url: https://agent-zone.ai/knowledge/databases/postgres-debugging/
section: knowledge
date: 2026-02-22
categories: ["databases"]
tags: ["postgresql","debugging","troubleshooting","locks","vacuum","oom","connections"]
skills: ["postgres-debugging","incident-response","database-troubleshooting"]
tools: ["postgresql","psql","pg_stat_activity","pg_locks","journalctl"]
levels: ["intermediate"]
word_count: 626
formats:
  json: https://agent-zone.ai/knowledge/databases/postgres-debugging/index.json
  html: https://agent-zone.ai/knowledge/databases/postgres-debugging/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=PostgreSQL+Debugging
---


# PostgreSQL Debugging

When PostgreSQL breaks, it usually falls into a handful of patterns. This is a reference for diagnosing each one with specific queries and commands.

## Connection Refused

Work through these in order:

**1. Is PostgreSQL running?**

```bash
sudo systemctl status postgresql-16
```

**2. Is it listening on the right address?**

```bash
ss -tlnp | grep 5432
```

If it shows `127.0.0.1:5432` but you need remote access, set `listen_addresses = '*'` in `postgresql.conf` and restart the server (this setting requires a full restart, not a reload).

**3. Does pg_hba.conf allow the connection?** Check logs for `no pg_hba.conf entry for host`:

```bash
sudo tail -50 /var/log/postgresql/postgresql-16-main.log
```
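If the log shows that error, add a matching rule to `pg_hba.conf`. A hypothetical entry allowing an application subnet (the database name, user, and CIDR are placeholders; adjust to your environment):

```
# TYPE  DATABASE  USER     ADDRESS       METHOD
host    mydb      appuser  10.0.0.0/24   scram-sha-256
```

Changes to `pg_hba.conf` take effect after a reload: `SELECT pg_reload_conf();` or `sudo systemctl reload postgresql-16`.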

**4. Firewall?** Check `iptables -L -n | grep 5432` or `firewall-cmd --list-ports`.

## Too Many Connections

**Check current connections:**

```sql
SELECT count(*) AS connections, state, usename
FROM pg_stat_activity
GROUP BY state, usename
ORDER BY connections DESC;
```

**Find and kill stale idle connections:**

```sql
SELECT pid, usename, state, now() - state_change AS idle_duration
FROM pg_stat_activity WHERE state = 'idle' ORDER BY idle_duration DESC;

SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE state = 'idle' AND state_change < now() - interval '10 minutes' AND pid <> pg_backend_pid();
```

**Prevent it:** Set `idle_in_transaction_session_timeout = '60s'` in `postgresql.conf` (this only covers sessions idle *inside* an open transaction; PostgreSQL 14+ adds `idle_session_timeout` for plain idle sessions) and use PgBouncer for connection pooling.
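For the pooling side, a minimal PgBouncer sketch. The database name and pool sizes are illustrative placeholders, not recommendations:

```ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
```

Note that `pool_mode = transaction` breaks session-level features such as advisory locks and session-scoped prepared statements, so verify your application tolerates it before switching.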

## Slow Queries

**Find currently running slow queries:**

```sql
SELECT pid, now() - query_start AS duration, state, query
FROM pg_stat_activity WHERE state = 'active' AND query_start < now() - interval '5 seconds'
ORDER BY duration DESC;
```

**Cancel or terminate:**

```sql
SELECT pg_cancel_backend(12345);     -- cancels the query
SELECT pg_terminate_backend(12345);  -- kills the connection
```

**Find historically slow queries with pg_stat_statements:**

```sql
SELECT calls, round(mean_exec_time::numeric, 2) AS mean_ms,
  round(total_exec_time::numeric, 2) AS total_ms, substr(query, 1, 120) AS query
FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;
```

Then use `EXPLAIN (ANALYZE, BUFFERS)` on the offending query. Look for sequential scans on large tables and high "Rows Removed by Filter" counts.
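For example, on a hypothetical `orders` table (the table, column, and index names are illustrative):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

-- If the plan shows "Seq Scan on orders" with a large "Rows Removed by
-- Filter", an index is the usual fix; CONCURRENTLY avoids blocking writes.
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);
```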

## Lock Contention

Queries hang without returning. Find blocked and blocking queries:

```sql
SELECT blocked.pid AS blocked_pid, blocked.query AS blocked_query,
  blocking.pid AS blocking_pid, blocking.query AS blocking_query, blocking.state
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid AND NOT bl.granted
JOIN pg_locks gl ON gl.locktype = bl.locktype
  AND gl.database IS NOT DISTINCT FROM bl.database
  AND gl.relation IS NOT DISTINCT FROM bl.relation
  AND gl.pid <> bl.pid AND gl.granted
JOIN pg_stat_activity blocking ON blocking.pid = gl.pid;
```
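The join above only matches relation-level locks. On PostgreSQL 9.6+, the built-in `pg_blocking_pids()` function covers all lock types and is simpler:

```sql
SELECT pid, pg_blocking_pids(pid) AS blocked_by,
  now() - query_start AS waiting_for, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```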

Common causes: `ALTER TABLE` waiting for a long-running query to finish, or an `idle in transaction` session holding a lock. Fix by terminating the blocker: `SELECT pg_terminate_backend(<blocking_pid>);`

For DDL in production, always set a lock timeout:

```sql
SET lock_timeout = '5s';
ALTER TABLE orders ADD COLUMN new_col text;
```

## Disk Space Issues

**Find largest tables:**

```sql
SELECT schemaname || '.' || relname AS table_name,
  pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables ORDER BY pg_total_relation_size(relid) DESC LIMIT 10;
```

**Check WAL accumulation:**

```bash
du -sh /var/lib/postgresql/16/main/pg_wal/
```

Excessive WAL usually means inactive replication slots or failing `archive_command`. Check `pg_replication_slots` and drop unused slots.
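To see how much WAL each slot is retaining, and to drop a dead one (substitute the actual slot name, and drop only if the consumer is truly gone):

```sql
SELECT slot_name, slot_type, active,
  pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

SELECT pg_drop_replication_slot('dead_slot_name');
```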

## OOM Kills

```bash
journalctl -k | grep -i "oom\|killed process"
```

Common causes: `shared_buffers` too high, `work_mem` too high with many concurrent queries, too many connections. Fix by reducing these values and using PgBouncer.
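As a rough sizing sketch: peak memory is approximately `shared_buffers` plus up to `work_mem` per sort/hash node per active query, so conservative values look like the following (illustrative starting points for a dedicated 16 GB box, not recommendations):

```
# postgresql.conf
shared_buffers = 4GB            # ~25% of RAM
work_mem = 16MB                 # allocated per sort/hash node, per query
maintenance_work_mem = 512MB
max_connections = 100           # keep low; pool with PgBouncer instead
```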

## Corrupted Indexes

Symptoms: wrong query results, or different results with `SET enable_indexscan = off`.

```sql
REINDEX INDEX idx_orders_customer;                  -- locks table
REINDEX INDEX CONCURRENTLY idx_orders_customer;     -- PostgreSQL 12+, does not block writes
```
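To confirm corruption before rebuilding, the `amcheck` extension (shipped in contrib) verifies B-tree structural invariants:

```sql
CREATE EXTENSION IF NOT EXISTS amcheck;
SELECT bt_index_check('idx_orders_customer');         -- lighter check, allows writes
SELECT bt_index_parent_check('idx_orders_customer');  -- stricter check, blocks writes
```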

## Transaction ID Wraparound

PostgreSQL uses 32-bit transaction IDs. If tables are not vacuumed (frozen) in time, the server stops accepting new transactions as the counter approaches ~2 billion, to prevent data corruption.

**Check proximity:**

```sql
SELECT datname, age(datfrozenxid) AS xid_age,
  round(age(datfrozenxid)::numeric / 2000000000 * 100, 1) AS pct_to_wraparound
FROM pg_database ORDER BY age(datfrozenxid) DESC;
```

Above 50% is a problem. Above 80% is an emergency.

**Find the blocking table and fix it:**

```sql
SELECT relname, age(relfrozenxid) AS xid_age FROM pg_stat_user_tables ORDER BY age(relfrozenxid) DESC LIMIT 5;
VACUUM FREEZE VERBOSE <table_name>;
```

## Stuck VACUUM

A VACUUM running for hours is usually blocked by a long-running transaction that prevents dead tuple cleanup.

```sql
SELECT pid, xact_start, now() - xact_start AS duration, state, query
FROM pg_stat_activity WHERE xact_start IS NOT NULL ORDER BY xact_start ASC LIMIT 5;
```

The oldest transaction sets the `xmin` horizon (inactive replication slots and prepared transactions can hold it back too). If it is an `idle in transaction` session from hours ago, terminate it; VACUUM will then make progress.

