Kubernetes Scheduler: How Pods Get Placed on Nodes#

The scheduler (kube-scheduler) watches for newly created pods that have no node assignment. For each unscheduled pod, the scheduler selects the best node and writes a binding back to the API server. The kubelet on that node then starts the pod. If no node is suitable, the pod stays Pending until conditions change.
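
Concretely, the node assignment is the pod's spec.nodeName field, which the scheduler fills in through the pod's binding subresource. A pod created with that field already set never goes through the scheduler at all; the kubelet on the named node just runs it. A minimal sketch, assuming a node called worker-1 exists:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod
spec:
  # Pre-assigning the node skips kube-scheduler entirely; normally this
  # field is empty at creation and filled in by the scheduler's binding.
  nodeName: worker-1
  containers:
  - name: app
    image: nginx:1.25
```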

The scheduler is the reason pods run where they do. Understanding its internals is essential for diagnosing Pending pods, designing placement constraints, and managing cluster utilization.

Jobs and CronJobs#

Deployments manage long-running processes. Jobs manage work that finishes. A Job creates one or more pods, runs them to completion, and tracks whether they succeeded. CronJobs run Jobs on a schedule. Both are essential for database migrations, report generation, data pipelines, and any workload that is not a continuously running server.

Job Basics#

A Job runs a pod until it exits successfully (exit code 0). The simplest case is a single pod that runs once:
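
A sketch of such a manifest, with an illustrative name, image, and command:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-once
spec:
  template:
    spec:
      restartPolicy: Never   # Jobs require Never or OnFailure
      containers:
      - name: hello
        image: busybox:1.36
        command: ["sh", "-c", "echo hello from a Job"]
```

Once the container exits 0, the Job is marked Complete. A failed pod is retried automatically, up to spec.backoffLimit times (6 by default).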

Pod Affinity and Anti-Affinity#

Node affinity controls which nodes a pod can run on. Pod affinity and anti-affinity go further – they control whether a pod should run near or away from other specific pods. This is how you co-locate a frontend with its cache for low latency, or spread database replicas across failure domains for high availability.

Pod Affinity: Schedule Near Other Pods#

Pod affinity tells the scheduler “place this pod in the same topology domain as pods matching a label selector.” The topology domain is defined by topologyKey – it could be the same node, the same zone, or any other node label.
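
As a sketch (the app names and labels here are invented for illustration), a web pod that must land in the same zone as a Redis cache could express that like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  affinity:
    podAffinity:
      # Hard requirement: only schedule into a zone that already runs
      # at least one pod labeled app=redis.
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: redis
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: web
    image: nginx:1.25
```

Changing topologyKey to kubernetes.io/hostname would require the same node rather than the same zone, and preferredDuringSchedulingIgnoredDuringExecution turns the rule into a weighted preference instead of a hard requirement.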

Pod Topology Spread Constraints#

Pod anti-affinity gives you binary control: either a pod avoids another pod’s topology domain or it does not. But it does not give you even distribution. If you have 6 replicas and 3 zones, anti-affinity cannot express “put exactly 2 in each zone.” Topology spread constraints solve this by letting you specify the maximum allowed imbalance between any two topology domains.
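
To see the limitation concretely, here is the anti-affinity version of zone spreading as a sketch (names are illustrative). The hard rule below permits at most one app=db pod per zone, so with 6 replicas and 3 zones only three pods schedule and the rest stay Pending; there is no way to say "two per zone" in this form.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: db
spec:
  replicas: 6
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never share a zone with another app=db pod,
          # so at most one replica can run per zone.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: db
            topologyKey: topology.kubernetes.io/zone
      containers:
      - name: db
        image: postgres:16
```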

How Topology Spread Works#

A topology spread constraint defines four things: a topologyKey naming the domain to spread across (for example topology.kubernetes.io/zone), a labelSelector picking which pods are counted, a maxSkew capping how far any one domain's count of matching pods may exceed the domain with the fewest, and a whenUnsatisfiable policy that decides whether a pod that would violate the constraint stays Pending (DoNotSchedule) or gets placed anyway (ScheduleAnyway).
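
A sketch tying these fields together, with illustrative names: a 6-replica Deployment whose zone counts may differ by at most one, which over three zones works out to 2/2/2.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                               # zone counts may differ by at most 1
        topologyKey: topology.kubernetes.io/zone # one domain per zone value
        whenUnsatisfiable: DoNotSchedule         # hard: leave pods Pending rather than skew
        labelSelector:
          matchLabels:
            app: web                             # count only this workload's pods
      containers:
      - name: web
        image: nginx:1.25
```

With whenUnsatisfiable: ScheduleAnyway the skew becomes a scoring preference rather than a hard rule.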

Taints, Tolerations, and Node Affinity#

Pod scheduling in Kubernetes defaults to “run anywhere there is room.” In production, that is rarely what you want. GPU workloads should land on GPU nodes. System components should not compete with application pods. Nodes being drained should stop accepting new work. Taints, tolerations, and node affinity give you control over where pods run and where they do not.

Taints: Repelling Pods from Nodes#

A taint is applied to a node and tells the scheduler “do not place pods here unless they explicitly tolerate this taint.” Taints have three parts: a key, a value, and an effect.
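
As a sketch, suppose a GPU node pool is tainted with an illustrative key of gpu=true and effect NoSchedule (for example with kubectl taint nodes <node> gpu=true:NoSchedule). A pod that is allowed onto those nodes declares a matching toleration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-trainer
spec:
  tolerations:
  - key: gpu              # taint key
    operator: Equal       # match key and value exactly (Exists ignores the value)
    value: "true"         # taint value
    effect: NoSchedule    # taint effect: NoSchedule, PreferNoSchedule, or NoExecute
  containers:
  - name: trainer
    image: training-image:latest   # hypothetical image
```

A toleration only lifts the repulsion; it does not attract the pod to GPU nodes. Pairing the toleration with node affinity or a nodeSelector is what keeps the pod off everything else, which is why the two mechanisms are usually used together.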