Vertical Pod Autoscaler (VPA): Right-Sizing Resource Requests Automatically

Vertical Pod Autoscaler (VPA)#

Horizontal scaling adds more pod replicas. Vertical scaling gives each pod more (or fewer) resources. VPA automates the vertical side by watching actual CPU and memory usage over time and adjusting resource requests to match reality. Without it, teams guess at resource requests during initial deployment and rarely revisit them, leading to either waste (over-provisioned) or instability (under-provisioned).

What VPA Does#

VPA monitors historical and current resource usage for pods in a target Deployment (or StatefulSet, DaemonSet, etc.) and produces recommendations for CPU and memory requests. Depending on the configured mode, it either reports these recommendations passively or actively applies them by evicting and recreating pods with updated requests.