Database Performance Investigation Runbook

Database Performance Investigation Runbook#

When a database is slow, resist the urge to immediately tune configuration parameters. Follow this sequence: identify what is slow, understand why, then fix the specific bottleneck. Most performance problems are caused by missing indexes or a single bad query, not global configuration issues.

Phase 1 – Identify Slow Queries#

The first step is always finding which queries are consuming the most time.

PostgreSQL: pg_stat_statements#

Enable the extension if not already loaded:

MySQL 8.x Setup and Configuration

MySQL 8.x Setup and Configuration#

MySQL 8.x is the current production series. It introduced caching_sha2_password as the default auth plugin, CTEs, window functions, and a redesigned data dictionary. Getting it installed is straightforward; getting it configured correctly for production takes more thought.

Installation#

Package Managers#

On Ubuntu/Debian, the MySQL APT repository gives you the latest 8.x:

# Add the MySQL APT repo
wget https://dev.mysql.com/get/mysql-apt-config_0.8.30-1_all.deb
sudo dpkg -i mysql-apt-config_0.8.30-1_all.deb
sudo apt update
sudo apt install mysql-server

On RHEL/Rocky/AlmaLinux:

MySQL Debugging: Common Problems and Solutions

MySQL Debugging: Common Problems and Solutions#

When MySQL breaks, it falls into a handful of failure modes. Here are the diagnostic workflows, in order of frequency.

Access Denied Errors#

Access denied for user 'appuser'@'10.0.1.5' (using password: YES) means wrong password, user does not exist for that host, or missing privileges.

Diagnosis:

-- 1. Does the user exist for that host?
SELECT user, host, plugin FROM mysql.user WHERE user = 'appuser';
-- MySQL matches user+host pairs. 'appuser'@'localhost' != 'appuser'@'%'.

-- 2. Check grants
SHOW GRANTS FOR 'appuser'@'%';

-- 3. Auth plugin mismatch? Old clients can't handle caching_sha2_password:
ALTER USER 'appuser'@'%' IDENTIFIED WITH mysql_native_password BY 'password';

To reset a lost root password:

MySQL Performance Tuning

MySQL Performance Tuning#

Performance tuning comes down to three things: making queries touch fewer rows (indexes), keeping hot data in memory (buffer pool), and finding the slow queries (slow query log, Performance Schema).

Reading EXPLAIN Output#

EXPLAIN shows MySQL’s query execution plan. Always use EXPLAIN ANALYZE (MySQL 8.0.18+) for actual runtime stats, not just estimates.

EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
GROUP BY u.id;

Key columns:

Planning and Executing Database Migrations: Schema Changes, Data Migrations, and Zero-Downtime Patterns

Planning and Executing Database Migrations#

Database migrations are the highest-risk routine operations most teams perform. A bad migration can cause downtime, data loss, or application errors that cascade across every service that touches the affected tables. This operational sequence walks through the assessment, planning, execution, and rollback of database migrations from simple column additions to full platform changes.

Phase 1 – Assessment#

Step 1: Classify the Migration#

Every migration falls into one of three categories, each with a different risk profile:

Choosing a Database Strategy: On Kubernetes vs Managed Service, and PostgreSQL vs MySQL vs CockroachDB

Choosing a Database Strategy#

Every Kubernetes-based platform eventually faces two questions: should the database run inside the cluster or as a managed service, and which database engine fits the workload? These decisions are difficult to reverse. A database migration is one of the highest-risk operations in production. Getting the initial decision roughly right saves months of future pain.

Where to Run: Kubernetes vs Managed Service#

This is not a technology question. It is an organizational question about who owns database operations and what tradeoffs the team will accept.

Database Connection Pooling: PgBouncer, ProxySQL, and Application-Level Patterns

Database Connection Pooling: PgBouncer, ProxySQL, and Application-Level Patterns#

Database connections are expensive resources. PostgreSQL forks a new OS process for every connection. MySQL creates a thread. Both allocate memory for session state, query buffers, and sort areas. When your application scales horizontally in Kubernetes – 10 pods, then 20, then 50 – the connection count multiplies, and most databases buckle long before your application pods do.

Connection pooling solves this by maintaining a smaller set of persistent connections to the database and sharing them across many application clients. Understanding pooling options, deployment patterns, and sizing is essential for any production database workload on Kubernetes.