---
title: "Node Drain and Cordon: Safe Node Maintenance"
description: "How to safely remove workloads from Kubernetes nodes using cordon and drain, including flag reference, PDB interactions, and common maintenance scenarios."
url: https://agent-zone.ai/knowledge/kubernetes/node-drain-and-cordon/
section: knowledge
date: 2026-02-22
categories: ["kubernetes"]
tags: ["drain","cordon","node-maintenance","pod-eviction","pdb"]
skills: ["node-maintenance","workload-migration","drain-troubleshooting"]
tools: ["kubectl"]
levels: ["intermediate"]
word_count: 982
formats:
  json: https://agent-zone.ai/knowledge/kubernetes/node-drain-and-cordon/index.json
  html: https://agent-zone.ai/knowledge/kubernetes/node-drain-and-cordon/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Node+Drain+and+Cordon%3A+Safe+Node+Maintenance
---


# Node Drain and Cordon

Node maintenance is a routine part of cluster operations: kernel patches, instance type changes, Kubernetes upgrades, hardware replacement. The tools are `kubectl cordon` (stop scheduling new pods) and `kubectl drain` (evict existing pods). Getting the flags and sequence right is the difference between a seamless operation and a production incident.

## Cordon: Mark Unschedulable

Cordon sets the `spec.unschedulable` field on a node to `true`. The scheduler will not place new pods on it, but existing pods continue running undisturbed.

```bash
kubectl cordon node-1

# Verify
kubectl get node node-1
# NAME     STATUS                     ROLES    AGE   VERSION
# node-1   Ready,SchedulingDisabled   worker   90d   v1.31.0

# Reverse it
kubectl uncordon node-1
```

Cordon is non-disruptive. Use it when you want to stop new work from landing on a node before you drain it, or when investigating a node issue without immediately evicting workloads.
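Cordon is just sugar for a one-field change on the Node object. The strategic-merge patch it effectively applies amounts to this fragment:

```yaml
# What `kubectl cordon` effectively sets on the Node object
# (uncordon sets it back to false)
spec:
  unschedulable: true
```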

## Drain: Evict Pods Safely

`kubectl drain` does two things in sequence: it cordons the node, then evicts all pods from it. Eviction goes through the Kubernetes Eviction API, which means PodDisruptionBudgets are respected.

```bash
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
```

### What Drain Does Step by Step

1. **Cordons the node** -- marks it unschedulable.
2. **Identifies all pods on the node** -- excluding mirror pods (static pods managed by kubelet) and DaemonSet pods (if `--ignore-daemonsets` is set).
3. **Sends eviction requests** through the Eviction API for each pod. This is not a delete -- it is a polite request that respects PDBs.
4. **Waits for pods to terminate.** Each pod gets its `terminationGracePeriodSeconds` to shut down cleanly.
5. **Reports completion** once all pods are gone or the timeout is reached.
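The "polite request" in step 3 is a concrete API object: drain POSTs an `Eviction` to the pod's `eviction` subresource. A minimal body looks like the following (pod name hypothetical):

```yaml
# POSTed to /api/v1/namespaces/default/pods/my-app-abc123/eviction
apiVersion: policy/v1
kind: Eviction
metadata:
  name: my-app-abc123    # hypothetical pod name
  namespace: default
deleteOptions:
  gracePeriodSeconds: 30
```

If granting the eviction would violate a PDB, the API server rejects it with HTTP 429, which is why drain retries rather than failing outright.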

### The Flags That Matter

```bash
# Flag reference (combine only the flags you need):
#   --ignore-daemonsets              Skip DaemonSet pods (they will be recreated anyway)
#   --delete-emptydir-data           Evict pods using emptyDir volumes (data is lost)
#   --force                          Allow eviction of pods not managed by a controller (bare pods)
#   --grace-period=30                Override the pod's terminationGracePeriodSeconds
#   --timeout=300s                   Give up after 5 minutes
#   --pod-selector='app!=critical'   Only drain pods matching this selector
#   --disable-eviction               Use DELETE instead of the Eviction API (skips PDB checks)
kubectl drain node-1 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --force \
  --grace-period=30 \
  --timeout=300s \
  --pod-selector='app!=critical' \
  --disable-eviction
```

**`--ignore-daemonsets`**: Almost always required. DaemonSet pods run on every node by definition. Drain cannot evict them (they would just be rescheduled back), so without this flag, drain errors out when it encounters them.

**`--delete-emptydir-data`**: Required if any pod uses `emptyDir` volumes. Drain refuses to evict these pods by default because eviction destroys the data. If the data is ephemeral (caches, temp files), this flag is safe.

**`--force`**: Required for pods not managed by a ReplicaSet, Deployment, StatefulSet, or Job. These "bare pods" will not be recreated after eviction. Drain warns you and refuses without this flag.

**`--grace-period`**: Overrides the pod's configured `terminationGracePeriodSeconds`. Useful when you need to speed up a drain, but be aware that pods may not shut down cleanly if the grace period is too short.

**`--timeout`**: How long drain waits for all pods to be evicted. If exceeded, drain exits with an error but the node remains cordoned. Default is no timeout (waits forever).

**`--disable-eviction`**: Bypasses the Eviction API entirely and issues direct DELETE requests. This ignores PDBs. Use only as a last resort when PDBs are blocking a drain you must complete.

## PodDisruptionBudgets Blocking Drains

The most common drain problem is a PDB that will not allow any more disruptions. Drain sends eviction requests through the API server, and the API server rejects evictions that would violate a PDB.

Symptoms: drain hangs indefinitely, printing messages like `evicting pod default/my-app-abc123` but never completing.
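Each blocked retry logs a distinctive error string. A quick way to confirm a PDB is the blocker from a captured drain log (log contents assumed here, but the error message is what kubectl prints):

```shell
# Sample of what a PDB-blocked drain prints, captured for illustration
cat > /tmp/drain.log <<'EOF'
evicting pod default/my-app-abc123
error when evicting pods/"my-app-abc123" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
EOF

# Count evictions rejected by a PodDisruptionBudget
grep -c "disruption budget" /tmp/drain.log
```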

Diagnose:

```bash
# Find PDBs with zero allowed disruptions
kubectl get pdb --all-namespaces
# NAMESPACE   NAME         MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
# default     my-app-pdb   2               N/A               0                     30d

# Check why disruptions are zero
kubectl describe pdb my-app-pdb
# Look at currentHealthy vs desiredHealthy
```
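The ALLOWED DISRUPTIONS column is simple arithmetic: currently healthy pods minus the number the PDB requires to stay up, floored at zero. A sketch with hypothetical numbers:

```shell
# Values you would read from `kubectl describe pdb` (hypothetical)
current_healthy=2
desired_healthy=2   # with minAvailable: 2, desiredHealthy is 2

# allowedDisruptions = currentHealthy - desiredHealthy, floored at zero
allowed=$(( current_healthy - desired_healthy ))
if [ "$allowed" -lt 0 ]; then allowed=0; fi
echo "$allowed"   # -> 0: no eviction may proceed
```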

Common causes:
- **Single-replica deployment with `minAvailable: 1`** -- there is never room to evict the one pod. Fix: use `maxUnavailable: 1` instead, or scale up before draining.
- **Pods already unhealthy** -- if `currentHealthy` is already at or below `desiredHealthy`, no evictions are allowed. Fix the unhealthy pods first.
- **Multiple nodes draining simultaneously** -- the first drain consumed all allowed disruptions. Drain nodes one at a time.
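The `maxUnavailable` fix for the single-replica case can be sketched as follows (names hypothetical):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 1        # always leaves room for one eviction, even at one replica
  selector:
    matchLabels:
      app: my-app
```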

When it is safe to override:

```bash
# Nuclear option: bypass PDB checks entirely
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data --disable-eviction
```

Only do this when you have confirmed the workload can tolerate the disruption, or when the node is already dead and pods are not running anyway.

## Special Pod Types During Drain

**DaemonSet pods**: Ignored with `--ignore-daemonsets`. They keep running on the node until the node is removed or the DaemonSet is deleted. If you are decommissioning the node, they will be cleaned up automatically.

**Static pods**: Managed directly by kubelet via manifest files in `/etc/kubernetes/manifests/`. Drain does not touch them. To remove them, delete the manifest file on the node.

**Pods with `hostPath` volumes**: Drain evicts these like any other pod; the local-storage safeguard that requires `--delete-emptydir-data` applies only to `emptyDir`. Because `hostPath` data lives on the node's filesystem, it survives eviction, but a replacement pod scheduled onto a different node will not see it.

## Common Scenarios

### Node Replacement

```bash
# 1. Cordon to stop new scheduling
kubectl cordon node-old

# 2. Verify new node is ready
kubectl get nodes

# 3. Drain the old node
kubectl drain node-old --ignore-daemonsets --delete-emptydir-data --timeout=600s

# 4. Verify pods rescheduled
kubectl get pods --all-namespaces --field-selector spec.nodeName=node-old

# 5. Delete the node object (after decommissioning the VM)
kubectl delete node node-old
```

### Kernel Patching

```bash
# Drain, patch, reboot, uncordon
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
# SSH to node, apply patches, reboot
kubectl uncordon node-1
```

### Rolling Node Upgrades

When upgrading multiple nodes, drain one at a time. Wait for all evicted pods to be Running on other nodes before draining the next:

```bash
for node in node-1 node-2 node-3; do
  echo "Draining $node..."
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=300s

  echo "Waiting for pods to stabilize..."
  sleep 30
  kubectl get pods --all-namespaces | grep -v Running | grep -v Completed

  # Perform maintenance on the node here

  kubectl uncordon "$node"
  echo "Uncordoned $node, waiting before next..."
  sleep 60
done
```
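A stricter stabilization check keys on the STATUS column and skips the header line, so status words appearing in pod names cannot produce false matches. Demonstrated here against a captured sample of `kubectl get pods` output (contents assumed):

```shell
# Captured sample of `kubectl get pods --all-namespaces` output (hypothetical)
cat > /tmp/pods.txt <<'EOF'
NAMESPACE     NAME            READY   STATUS      RESTARTS   AGE
default       my-app-abc123   1/1     Running     0          5m
default       my-app-def456   0/1     Pending     0          10s
kube-system   helper-xyz      0/1     Completed   0          1h
EOF

# Count pods that are neither Running nor Completed, skipping the header
tail -n +2 /tmp/pods.txt | awk '$4 != "Running" && $4 != "Completed"' | wc -l
```

Here it counts 1 (the Pending pod); drain the next node only once this count reaches zero.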

