Proactive Auto Scaling Based on Marginal Request Change Analysis for Reducing Tail Latency in Kubernetes Cluster


Research field: Software Development



Conference: International Conference on Innovative Computing


Abstract

Auto scaling dynamically adjusts resource allocation based on application metrics to optimize service performance and resource efficiency. In Kubernetes, a state-of-the-art resource management platform, the horizontal pod autoscaler (HPA) manages resource scaling. However, HPA’s reactive approach is poorly suited to rapidly increasing workloads. We therefore propose a proactive horizontal pod autoscaler (p-HPA) that decreases tail latency by allocating resources ahead of demand according to the number of requests. Experimental results show that, compared with HPA, p-HPA reduces tail latency by 8.2%, 8.8%, and 18.8% on the evaluated workloads.
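
To make the reactive versus proactive contrast concrete, the following is a minimal sketch rather than the authors' implementation. `reactive_replicas` follows the replica formula documented for the Kubernetes HPA, with the tolerance band and stabilization window omitted. `proactive_replicas` is a hypothetical stand-in for p-HPA: the abstract does not describe the actual algorithm, so the linear extrapolation of the latest marginal request change, the function names, and the `target_per_pod` parameter are all assumptions made for illustration.

```python
import math
from typing import Sequence


def reactive_replicas(current_replicas: int, current_metric: float,
                      target_metric: float) -> int:
    """Documented HPA rule: ceil(currentReplicas * currentMetric / desiredMetric).

    Simplified: ignores the HPA tolerance band and stabilization window.
    """
    return max(1, math.ceil(current_replicas * current_metric / target_metric))


def proactive_replicas(request_counts: Sequence[float],
                       target_per_pod: float) -> int:
    """Hypothetical proactive rule (an assumption, NOT the paper's algorithm).

    Extrapolates the next interval's request count from the most recent
    marginal change and sizes the deployment for that predicted load.
    """
    latest = request_counts[-1]
    # Marginal change between the two most recent sampling intervals.
    marginal = latest - request_counts[-2] if len(request_counts) > 1 else 0.0
    # Scale ahead only on growth; on a decline, size for the current load.
    predicted = latest + max(0.0, marginal)
    return max(1, math.ceil(predicted / target_per_pod))
```

Under these assumptions, a request series that doubles from 200 to 400 with a target of 100 requests per pod is sized for a predicted 600 requests, whereas a purely reactive rule would size only for the load already observed.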


Authors
Donggyun Kim, Korea University, Seoul, South Korea
Heonchang Yu, Korea University, Seoul, South Korea

Paper information

Publication year: 2024
Citations: 0
Country of publication: Korea
Site: Springer