Research field: Software Development
Conference: 2025 IEEE 18th International Conference on Cloud Computing (CLOUD)
Modern web services must meet important non-functional requirements such as availability, responsiveness, scalability, and reliability, which are formalized in Service Level Agreements (SLAs). These agreements include Service Level Objectives (SLOs), which set performance targets such as uptime, latency, and throughput that are crucial for maintaining consistent service quality. Failing to meet these SLOs can lead to penalties and harm the provider's reputation. At the same time, service providers must avoid over-provisioning, which leads to excessive costs and inefficient use of resources. To address this, autoscaling mechanisms dynamically adjust the number of service replicas to match user demand, while scheduling policies determine where each replica is placed. However, traditional autoscaling solutions typically rely on low-level metrics (e.g., CPU or memory usage), making it difficult for providers to optimize both SLOs and infrastructure costs. Furthermore, scheduling policies do not consider runtime resource contention between replicas, which can affect application response time. This paper proposes an SLO-aware autoscaling methodology, along with load-aware scheduling and descheduling strategies, for containerized workloads in Kubernetes clusters, integrating SLOs into the container orchestration process. This approach overcomes the limitations of conventional orchestration policies by making more efficient decisions that balance service-level requirements with operational costs, offering a comprehensive solution for managing containerized applications and their infrastructure in Kubernetes environments. The results, obtained by evaluating a prototype of our system in a testbed environment, show significant advantages over the vanilla Kubernetes platform.
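
To make the contrast between low-level and SLO-level scaling signals concrete, the following is a minimal sketch of the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler, `ceil(currentReplicas * currentValue / targetValue)`, applied once to CPU utilization and once to a latency target. It is not the methodology proposed in the paper; the function name `desired_replicas`, the replica bounds, and the sample values are illustrative assumptions.

```python
import math


def desired_replicas(current_replicas: int,
                     current_value: float,
                     target_value: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Proportional scaling rule (hypothetical helper, not the paper's algorithm).

    Computes ceil(current_replicas * current_value / target_value) and clamps
    the result to [min_replicas, max_replicas]. The metric can be a low-level
    signal (CPU %) or an SLO-level one (e.g. p95 latency in ms).
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")

    ratio = current_value / target_value

    # Within the tolerance band, keep the current replica count to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    return max(min_replicas, min(max_replicas, desired))


# Low-level signal: 90% average CPU against a 60% target.
cpu_based = desired_replicas(4, current_value=90, target_value=60)

# SLO-level signal: 300 ms observed p95 latency against a 200 ms objective.
latency_based = desired_replicas(4, current_value=300, target_value=200)
```

The same rule scales on either signal; the abstract's point is that a CPU target only approximates the latency objective, so driving the rule directly from an SLO metric ties the replica count to the guarantee the provider is actually accountable for.
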
| Publication year | 2025 |
|---|---|
| Citation count | 10 |
| Publication country | Andorra |
| Site | IEEE |