
Kubernetes autoscaling in practice — HPA, VPA and Cluster Autoscaler

15. 09. 2020 · Updated: 27. 03. 2026 · CORE SYSTEMS
This article was published in 2020. Some information may be outdated.

Kubernetes can scale. But configuring autoscaling so that it responds to real load, does not waste resources, and does not collapse under peak traffic is an art.

Three layers of autoscaling

  • HPA: adds or removes pods (for stateless services)
  • VPA: adjusts the CPU and memory requests of pods, with limits scaled to match (for workloads that cannot scale out, such as monoliths)
  • Cluster Autoscaler: adds or removes nodes

Custom metrics instead of CPU

Out of the box, HPA scales on CPU utilization, but for many services that is a poor proxy for load. Through Prometheus Adapter we exposed requests per second, p95 latency and queue depth as custom metrics, so HPA now scales on what really matters.
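
To make the scaling decision concrete, here is a minimal sketch of the replica calculation documented for HPA: desired = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1 (0.1 by default). The function name, the example numbers and the replica bounds are illustrative assumptions, not values from our cluster, and the real controller also handles stabilization windows and missing metrics.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA replica calculation for a single per-pod metric."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    # Close enough to the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds configured on the HPA object (hypothetical values here).
    return max(min_replicas, min(max_replicas, desired))
```

For example, with 4 pods each serving 180 requests/sec against a target of 100 requests/sec per pod, the sketch asks for 8 pods; the same arithmetic applies to latency or queue depth once the metric is exposed to HPA.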

Overprovisioning for fast scale-up

Provisioning a new AKS node takes 3-5 minutes. Solution: we keep one spare node occupied only by placeholder pods running pause containers, so its capacity is immediately available for real workloads. Meanwhile, Cluster Autoscaler adds a replacement node in the background.
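
One common way to build such a spare node is a low-priority Deployment of pause containers that the scheduler preempts as soon as real pods need room. The sketch below generates the two manifests as plain dictionaries. The resource names, the pause image tag, the priority value and the request sizes are assumptions for illustration, not our production values. The priority is negative so that ordinary pods (priority 0) outrank the placeholders; if it falls below Cluster Autoscaler's expendable-pod cutoff (-10 by default, to the best of our knowledge), the evicted placeholders would no longer trigger a new node.

```python
import json


def overprovisioning_manifests(replicas: int = 1, cpu: str = "2", memory: str = "4Gi",
                               namespace: str = "default") -> list[dict]:
    """Build a PriorityClass and a placeholder Deployment (hypothetical names)."""
    priority_class = {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": "overprovisioning"},
        "value": -10,  # assumption: below normal pods, not below the autoscaler cutoff
        "globalDefault": False,
        "description": "Placeholder pods that real workloads may preempt.",
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "overprovisioning", "namespace": namespace},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": "overprovisioning"}},
            "template": {
                "metadata": {"labels": {"app": "overprovisioning"}},
                "spec": {
                    "priorityClassName": "overprovisioning",
                    "terminationGracePeriodSeconds": 0,
                    "containers": [{
                        "name": "pause",
                        "image": "registry.k8s.io/pause:3.9",
                        # The requests reserve the headroom; pause itself uses almost nothing.
                        "resources": {"requests": {"cpu": cpu, "memory": memory}},
                    }],
                },
            },
        },
    }
    return [priority_class, deployment]


def to_kubectl_json(items: list[dict]) -> str:
    """Wrap the manifests in a v1 List so they can be piped to `kubectl apply -f -`."""
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)
```

Sizing the placeholder requests close to one node's allocatable capacity keeps roughly one node of headroom; smaller requests buy less headroom at lower cost.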

Spot instances — 60-80% savings

For fault-tolerant workloads (batch, CI/CD, dev) we use Azure Spot VMs in a dedicated node pool. Production always runs on on-demand nodes.
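
AKS marks Spot node pools with the label kubernetes.azure.com/scalesetpriority=spot and a matching NoSchedule taint, so a workload lands there only if it opts in. As a rough sketch of that opt-in, the hypothetical helper below adds the toleration and a node selector to a pod spec dictionary; the helper name and the sample spec are assumptions for illustration.

```python
import copy

# Label and taint key that AKS applies to Spot node pools.
SPOT_KEY = "kubernetes.azure.com/scalesetpriority"


def pin_to_spot(pod_spec: dict) -> dict:
    """Return a copy of pod_spec that tolerates the Spot taint and selects Spot nodes."""
    spec = copy.deepcopy(pod_spec)  # leave the caller's spec untouched
    spec.setdefault("tolerations", []).append({
        "key": SPOT_KEY,
        "operator": "Equal",
        "value": "spot",
        "effect": "NoSchedule",
    })
    # The toleration only permits Spot nodes; the selector actually forces them.
    spec.setdefault("nodeSelector", {})[SPOT_KEY] = "spot"
    return spec
```

Production pods simply omit the toleration, and the taint keeps them off the Spot pool.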

Biggest mistake: wrong resource requests

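Developers requested 2 CPU and 4 GB RAM “just to be safe”, while real utilization was 15%. Because Cluster Autoscaler sizes the cluster by requested resources, not actual usage, it kept adding nodes unnecessarily. Solution: run VPA in recommendation mode and use its suggestions to right-size the requests.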
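
Recommendation mode corresponds to a VerticalPodAutoscaler object with updateMode set to "Off": VPA computes suggested requests but never restarts or resizes pods. The sketch below builds such an object for a Deployment. It assumes the VPA components are installed in the cluster, and the naming scheme is hypothetical.

```python
def vpa_recommendation_only(deployment_name: str, namespace: str = "default") -> dict:
    """Build a VPA manifest that only records recommendations for a Deployment."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": f"{deployment_name}-vpa", "namespace": namespace},
        "spec": {
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment_name,
            },
            # "Off" means recommend only; pods are never evicted or mutated.
            "updatePolicy": {"updateMode": "Off"},
        },
    }
```

Once the object has observed some traffic, `kubectl describe vpa <name>` lists a lower bound, target and upper bound per container, which can then be copied into the Deployment's requests by hand.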

Autoscaling requires investment

It’s not “set it and forget it”. It takes proper metrics, realistic requests and continuous tuning, but the reward is a system that handles peaks automatically.
