LARE-HPA: Co-optimizing Latency and Resource Efficiency for Horizontal Pod Autoscaling in Kubernetes


Research field: Software Development



Conference: International Conference on Service-Oriented Computing


Abstract

Autoscaling is a technology that dynamically adjusts computing resources based on fluctuating demand without user intervention. It plays a crucial role in optimizing resources while maintaining Quality of Service (QoS). Kubernetes, an open-source container orchestration platform, manages resource scaling through a mechanism known as the Horizontal Pod Autoscaler (HPA). However, HPA is a reactive scaler that makes scaling decisions based on static threshold values, which are challenging to set without profiling the application’s characteristics and workload volatility in advance. Incorrect configuration values can lead to over- or under-provisioning, resulting in excessive costs or Service Level Objective (SLO) violations. To address these challenges, we propose LARE-HPA, an adaptive scaling solution that adjusts thresholds and stabilization windows to optimize scaling timing, ensuring QoS while minimizing resource over-provisioning. Our experimental results based on real-world workloads demonstrate that LARE-HPA reduces average latency by 50.34%, 39.52%, and 46.18% while improving SLO satisfaction rates by 3.61%, 4.5%, and 2.21% compared to existing HPA and other state-of-the-art techniques, respectively.
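To make the "static threshold" behaviour concrete, the sketch below reproduces the replica calculation that the Kubernetes documentation describes for the stock HPA: the desired count is the current count scaled by the ratio of the observed metric to the target, rounded up, with no change when the ratio is within a tolerance (0.1 by default). It also shows the documented scale-down stabilization rule, which acts on the highest recommendation seen inside the window. This is not the LARE-HPA algorithm, which the abstract does not specify; the function names are hypothetical and chosen only for illustration.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Stock HPA rule: ceil(current_replicas * current_metric / target_metric)."""
    ratio = current_metric / target_metric
    # When the ratio is close enough to 1.0, the HPA skips scaling entirely.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def stabilized_scale_down(recent_recommendations: list[int]) -> int:
    """Scale-down stabilization: use the highest recommendation in the window."""
    return max(recent_recommendations)


if __name__ == "__main__":
    # 4 pods at 90% average CPU against a fixed 60% target.
    print(desired_replicas(4, 90.0, 60.0))
    # Recommendations collected during the stabilization window.
    print(stabilized_scale_down([6, 5, 3]))
```

Because the target and the window length are fixed ahead of time, a target that is too high delays scale-out until latency has already degraded, while one that is too low keeps extra pods running. Those two settings are the ones the abstract says LARE-HPA adjusts at run time.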


Authors

Donggyun Kim
Department of Computer Science and Engineering, Korea University, Seoul, South Korea
Listed country: Andorra

Hyungjun Kim
Department of Computer Science and Engineering, Korea University, Seoul, South Korea
Listed country: Andorra

Eunyoung Lee
Department of Computer Science, Dongduk Women’s University, Seoul, South Korea
Listed country: Korea

Paper information

Publication year: 2024
Citations: 0
Publication country: Andorra, Korea
Site: Springer

Related papers (168)