Research field: Software Development
Conference: International Conference on Service-Oriented Computing
Autoscaling dynamically adjusts computing resources to fluctuating demand without user intervention. It plays a crucial role in optimizing resource usage while maintaining Quality of Service (QoS). Kubernetes, an open-source container orchestration platform, scales workloads through the Horizontal Pod Autoscaler (HPA). However, HPA is a reactive scaler that makes scaling decisions based on static threshold values, which are difficult to set without profiling the application's characteristics and workload volatility in advance. Incorrect configuration values can lead to over- or under-provisioning, resulting in excessive costs or Service Level Objective (SLO) violations. To address these challenges, we propose LARE-HPA, an adaptive scaling solution that adjusts thresholds and stabilization windows to optimize scaling timing, ensuring QoS while minimizing resource over-provisioning. Our experimental results based on real-world workloads demonstrate that LARE-HPA reduces average latency by 50.34%, 39.52%, and 46.18% while improving SLO satisfaction rates by 3.61%, 4.5%, and 2.21% compared to existing HPA and other state-of-the-art techniques, respectively.
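To illustrate why a static threshold is hard to choose, the sketch below reproduces the replica formula given in the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, with the default 10% tolerance. It is not LARE-HPA and does not model stabilization windows; the function name `desired_replicas` and the sample utilization values are assumptions made only for this example.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Replica count per the documented HPA formula (hypothetical helper).

    No scaling happens while the metric ratio stays within `tolerance`
    of 1.0; 0.1 is the Kubernetes default tolerance.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Same observed load (4 pods at 90% average CPU), three static targets.
# The target values are illustrative, not taken from the paper.
for target in (50, 80, 85):
    print(target, desired_replicas(4, 90, target))
```

With an identical load, the chosen target alone decides whether the deployment doubles, grows by one pod, or does not scale at all, which is the sensitivity to configuration that the abstract describes.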
| Publication year | 2024 |
|---|---|
| Citations | 0 |
| Country of publication | Andorra, Korea |
| Site | Springer |