kubernetes-autoscaling

Contents

Roadmap info from roadmap website

Autoscaling

Autoscaling in Kubernetes adjusts the resources allocated to a deployment or set of pods based on demand. It includes Horizontal Pod Autoscaling (HPA), which increases or decreases the number of replicas, and Vertical Pod Autoscaling (VPA), which adjusts the resource requests and limits of pods. Both can be combined with the Cluster Autoscaler, which adds or removes nodes, to allocate resources efficiently and keep applications responsive. Autoscaling is useful for handling variable workloads or sudden spikes in traffic.
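To make the replica adjustment concrete, here is a minimal Python sketch of the scaling rule described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric). The function name, the min/max defaults, and the simplified tolerance handling are assumptions for illustration. The real controller also accounts for pod readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     tolerance=0.1, min_replicas=1, max_replicas=10):
    """Approximate the HPA rule: ceil(current_replicas * current / target).

    Hypothetical helper for illustration only, not the controller's code.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods averaging 200m CPU against a 100m target: scale out.
    print(desired_replicas(4, 200, 100))
    # 4 pods averaging 50m CPU against a 100m target: scale in.
    print(desired_replicas(4, 50, 100))
```

The tolerance check is what keeps the replica count from flapping when the metric hovers near the target.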

Horizontal Pod Autoscaling vs Vertical Pod Autoscaling

| Feature/Aspect | Horizontal Pod Autoscaling (HPA) | Vertical Pod Autoscaling (VPA) |
| --- | --- | --- |
| Scaling Method | Adds or removes pods based on workload | Adjusts the resource requests/limits of pods |
| Primary Use Case | Handle varying traffic load by increasing or decreasing the number of pods | Optimize resource allocation (CPU, memory) for long-running or varying workload applications |
| Trigger | CPU, memory, or custom metrics thresholds | Pod’s resource usage (CPU, memory) over time |
| Granularity | Scales the number of replicas | Modifies individual pod resource requests and limits |
| Best For | High traffic and fluctuating demand scenarios | Applications with unpredictable or evolving resource requirements |
| Resource Efficiency | Ensures workload is balanced across multiple pods | Ensures each pod gets the right amount of resources without wasting or starving |
| Impact on Cluster | Can increase the number of pods in the cluster | Can change resource limits within the existing number of pods |
| Supported Metrics | CPU, memory, custom metrics | CPU, memory |
| Downsides | Might lead to over-provisioning if not configured well | May cause pod restarts when resource limits are adjusted |
| Example Use Case | Web servers with fluctuating traffic | Databases or backend services with varying CPU/memory needs |
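
The VPA idea of sizing a pod from its resource usage over time can be illustrated with a toy Python sketch: take a high percentile of observed usage and add a safety margin. This is a simplified illustration, not the actual VPA recommender, which is more involved and works with decaying usage histograms. The function name, the 90th percentile, and the 15% margin are assumed values chosen for the example.

```python
import math


def recommend_request(usage_samples, percentile=90, margin_percent=15):
    """Suggest a resource request from historical usage samples.

    Hypothetical helper: nearest-rank percentile plus a safety margin.
    Units are whatever the samples use (e.g. CPU millicores).
    """
    if not usage_samples:
        raise ValueError("at least one usage sample is required")
    ordered = sorted(usage_samples)
    # Nearest-rank percentile: smallest value covering `percentile` % of samples.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    base = ordered[rank - 1]
    # Add headroom so short bursts do not immediately starve the pod.
    return math.ceil(base * (100 + margin_percent) / 100)


if __name__ == "__main__":
    # CPU usage in millicores, including one outlier spike (400m).
    samples = [100, 120, 110, 400, 130, 125, 115, 105, 135, 140]
    print(recommend_request(samples))
```

Using a percentile rather than the maximum keeps a single spike from inflating the request, which matches the table's point about not wasting resources.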
#roadmap #Informatic #kubernetes #ready #online #scaling