Efficient data persistence and data division for distributed computing in cloud data center networks


Research field: Software Development



Venue: The Journal of Supercomputing


Abstract

Container-based Hadoop Distributed File System (HDFS) storage is widely used in cloud data center networks, but traditional HDFS suffers from a single point of failure that can make the entire cluster unavailable. In this paper, we study the storage reliability of a Docker container-based HDFS cluster with a single point of failure. First, we investigate a data volume-based persistence solution for Hadoop under the single point of failure and single-backup strategy of an HDFS cluster. Second, we propose an HDFS-based replica placement algorithm for data storage that considers the performance of both the host and the container nodes. Third, we design the KADC-KNN data segmentation algorithm to store the persistent data of Docker containers effectively. Extensive experimental results show that the proposed approach ensures stable storage and fast migration of cluster data. Compared with the most advanced existing algorithm, the proposed data volume persistence algorithm, DVPS, improves data reliability by 19.8%. The data partitioning algorithm, KADC-KNN, improves partitioning accuracy by 20.2% and has lower time overhead.


Author Profile
Xi Wang

Suzhou Institute of Industrial Technology, Suzhou 215006, People’s Republic of China

China
Author Profile
Xinzhi Hu

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, People’s Republic of China

Andorra
Author Profile
Weibei Fan

College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, People’s Republic of China

Andorra

Paper information

Publication year: 2023
Citations: 3
Publication country: Andorra, China
Site: Springer