Kubernetes can scale. But configuring autoscaling so that it responds to real load, doesn't waste resources, and doesn't collapse under peak traffic is an art.
Three layers of autoscaling
- HPA (Horizontal Pod Autoscaler): adds and removes pods; the fit for stateless services (a baseline sketch follows this list)
- VPA (Vertical Pod Autoscaler): adjusts pods' CPU/RAM requests and limits; useful for workloads that can't scale horizontally, such as monoliths
- Cluster Autoscaler: adds and removes nodes as pod demand changes
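To make the first layer concrete, here is a minimal sketch of the baseline: a CPU-based HPA for a hypothetical `api` Deployment. The name, replica bounds, and 70% target are illustrative, not production values.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api               # hypothetical Deployment name
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add pods when average CPU exceeds 70% of requests
```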
Custom metrics instead of CPU
By default, HPA scales on CPU utilization, which is often a poor proxy for real load. Through Prometheus Adapter we added requests/sec, p95 latency, and queue depth as scaling metrics. Now HPA scales on what actually matters.
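As an illustration, a sketch of such an HPA, assuming Prometheus Adapter already exposes a per-pod `http_requests_per_second` metric through the custom.metrics.k8s.io API. The metric name, the target value, and the adapter rule in the comments are assumptions that depend entirely on how your adapter config is written.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical name produced by an adapter rule
      target:
        type: AverageValue
        averageValue: "100"              # aim for roughly 100 req/s per pod
# Illustrative Prometheus Adapter rule that could derive the metric above:
# rules:
# - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
#   resources:
#     overrides:
#       namespace: {resource: namespace}
#       pod: {resource: pod}
#   name:
#     matches: "^(.*)_total$"
#     as: "${1}_per_second"
#   metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```

Several metrics can be listed side by side; HPA takes the highest replica count any of them asks for, which is what lets latency or queue depth override a quiet CPU.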
Overprovisioning for fast scale-up
Provisioning a new AKS node takes 3-5 minutes. Our solution: keep a "spare" node occupied only by low-priority pause containers, so capacity is immediately available to real workloads, while Cluster Autoscaler adds a replacement node in the background.
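The usual way to build this is a placeholder Deployment of pause containers under a negative PriorityClass: the scheduler preempts them the moment real pods need the room. A sketch, with illustrative names and sizes:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10                  # lower than any real workload, so placeholders are preempted first
globalDefault: false
description: "Placeholder pods that reserve headroom for fast scale-up"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
spec:
  replicas: 2               # how much headroom to keep warm; tune to your node size
  selector:
    matchLabels:
      app: overprovisioning
  template:
    metadata:
      labels:
        app: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
        resources:
          requests:
            cpu: "1"        # size requests so placeholders occupy roughly one node
            memory: 2Gi
```

Sizing the placeholder requests to roughly one node's allocatable capacity means evicting them frees a whole node's worth of headroom at once.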
Spot instances: 60-80% savings
For fault-tolerant workloads (batch jobs, CI/CD, dev environments) we use Azure Spot VMs in a dedicated node pool. Production always runs on on-demand capacity.
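Steering workloads onto that pool comes down to matching the taint and label AKS puts on spot node pools (`kubernetes.azure.com/scalesetpriority=spot`). A sketch for a hypothetical batch Job; the name and image are illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-batch      # hypothetical fault-tolerant workload
spec:
  template:
    spec:
      restartPolicy: OnFailure
      tolerations:
      - key: kubernetes.azure.com/scalesetpriority
        operator: Equal
        value: "spot"
        effect: NoSchedule          # tolerate the taint AKS applies to spot pools
      nodeSelector:
        kubernetes.azure.com/scalesetpriority: spot   # and run only on the spot pool
      containers:
      - name: worker
        image: myregistry.azurecr.io/batch-worker:latest   # illustrative image
        resources:
          requests:
            cpu: "500m"
            memory: 1Gi
```

Because the taint keeps everything else off the spot pool, production pods can never land on a node that may be evicted.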
Biggest mistake: wrong resource requests
Developers were requesting 2 CPU and 4 GB RAM "just to be safe" while real utilization sat around 15%, so Cluster Autoscaler kept adding nodes for capacity nobody used. The fix: run VPA in recommendation mode and right-size requests from its data.
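A minimal sketch of that setup, assuming a Deployment named `api`: with `updateMode: "Off"`, VPA only publishes recommendations and never evicts or restarts pods.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api          # hypothetical Deployment name
  updatePolicy:
    updateMode: "Off"  # recommendation only: compute targets, never touch running pods
```

The recommendations then appear in the object's status (e.g. via `kubectl describe vpa api`), and teams adjust their requests from that data instead of guessing.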
Autoscaling requires investment
It's not "set it and forget it". It takes proper metrics, realistic requests, and continuous tuning, but the reward is a system that rides out traffic peaks automatically.