High-Performance Mining of COVID-19 Open Research Datasets for Text Classification and Insights in Cloud Computing Environments


Research field: Networking



Conference: 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC)


Abstract

The COVID-19 global pandemic is an unprecedented health crisis. Researchers around the world have produced an extensive body of literature since the outbreak. Analysing this information to extract knowledge and provide meaningful insights in a timely manner requires considerable computational power. Cloud platforms are designed to provide this computational power in an on-demand and elastic manner. Hybrid clouds, composed of private and public data centers, are particularly well suited to deploying computationally intensive workloads in a cost-efficient yet scalable manner. In this paper, we develop a system that uses the Aneka Platform as a Service middleware, with its parallel processing and multi-cloud capabilities, to accelerate the data processing pipeline and the machine-learning-based article categorisation process on a hybrid cloud. The results are then persisted for further referencing, searching and visualising. The performance evaluation shows that the system helps reduce processing time and achieve linear scalability. Beyond COVID-19, the application could be used directly for broader scholarly article indexing and analysis.


Author Profile
Jie Zhao

Cloud Computing and Distributed Systems Laboratory, School of Computing and Information Systems, The University of Melbourne, Australia

Maria A. Rodriguez

Cloud Computing and Distributed Systems Laboratory, School of Computing and Information Systems, The University of Melbourne, Australia

Rajkumar Buyya

Cloud Computing and Distributed Systems Laboratory, School of Computing and Information Systems, The University of Melbourne, Australia


📄 Paper Information

Publication year: 2020
Citations: 3
Publication country: Andorra
Source: IEEE
