Observability in Kubernetes Cluster: Automatic Anomalies Detection using Prometheus


Research field: Software Development



Conference: 2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)


Abstract

Kubernetes is a portable, extensible, open-source platform for managing containers. It provides features such as automatic scaling, service discovery, load balancing, and fault tolerance. Because it is a complex system that runs many internal services and can manage many more user services, Kubernetes comes with a monitoring system that provides metrics and logs for every service in the cluster. However, the monitoring system usually relies on human intervention to detect and troubleshoot defects, and that intervention typically comes too late, after a defect has already appeared. We think that detecting anomalies in the metrics provided by the monitoring system will help to prevent defects. In this paper, we analyze current solutions for automatic anomaly detection and alerting, and we propose a new solution that helps system administrators catch and predict, at an earlier stage, anomalies that may lead to defects. Our solution is a technical one, developed around Prometheus, an open-source monitoring system for metrics.


Author Profile
Octavian Mart

Faculty of Automatic Control and Computer Science, University POLITEHNICA of Bucharest, Romania

Andorra

Author Profile
Catalin Negru

Faculty of Automatic Control and Computer Science, University POLITEHNICA of Bucharest, Romania

Andorra

Author Profile
Florin Pop

Faculty of Automatic Control and Computer Science, University POLITEHNICA of Bucharest, Romania

Andorra

📄 Paper information

Publication year: 2020
Citations: 16
Publication country: Italy, Andorra
Site: IEEE

Related papers (19)