kubectl Debugging: A Practical Command Reference

kubectl Debugging#

When something breaks in Kubernetes, you need to move through a specific sequence of commands. Here is every debugging command you will reach for, plus a step-by-step workflow for a pod that will not start.

Logs#

kubectl logs <pod-name> -n <namespace>                           # basic
kubectl logs <pod-name> -c <container-name> -n <namespace>       # specific container
kubectl logs <pod-name> --previous -n <namespace>                # previous crash (essential for CrashLoopBackOff)
kubectl logs -f <pod-name> -n <namespace>                        # stream in real-time
kubectl logs --since=5m <pod-name> -n <namespace>                # last 5 minutes
kubectl logs -l app=payments-api -n payments-prod --all-containers  # all pods matching label

The --previous flag is critical for crash-looping pods where the current container has no logs yet. The --all-containers flag captures init containers and sidecars.

Advanced Kubernetes Debugging: CrashLoopBackOff, ImagePullBackOff, OOMKilled, and Stuck Pods

Advanced Kubernetes Debugging#

Every Kubernetes failure follows a pattern, and every pattern has a diagnostic sequence. This guide covers the most common failure modes you will encounter in production, with the exact commands and thought process to move from symptom to resolution.

Systematic Debugging Methodology#

Before diving into specific scenarios, internalize this sequence. It applies to nearly every pod issue:

# Step 1: What state is the pod in?
kubectl get pod <pod> -n <ns> -o wide

# Step 2: What does the full pod spec and event history show?
kubectl describe pod <pod> -n <ns>

# Step 3: What did the application log before it failed?
kubectl logs <pod> -n <ns> --previous --all-containers

# Step 4: Can you get inside the container?
kubectl exec -it <pod> -n <ns> -- /bin/sh

# Step 5: Is the node healthy?
kubectl describe node <node-name>
kubectl top node <node-name>

Each failure mode below follows this pattern, with specific things to look for at each step.

Kubernetes Troubleshooting Decision Trees: Symptom to Diagnosis to Fix

Kubernetes Troubleshooting Decision Trees#

Troubleshooting Kubernetes in production is about eliminating possibilities in the right order. Every symptom maps to a finite set of causes, and each cause has a specific diagnostic command. The decision trees below encode that mapping. Start at the symptom, follow the branches, run the commands, and the output tells you which branch to take next.

These trees are designed to be followed mechanically. No intuition required – just execute the commands and interpret the results.