Performance Comparison for Data Retrieval from NoSQL and SQL Databases: A Case Study for COVID-19 Genome Sequence Dataset


Research field: Databases



Conference: 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST)


Abstract

NoSQL database management systems were introduced to tackle a range of challenges, including operating on unstructured, semi-structured, and structured data. NoSQL databases gained popularity because of their improved performance over SQL databases. We investigate the performance of two NoSQL systems, MongoDB and Cassandra, and one SQL database, MySQL, on DNA sequence data from the COVID-19 dataset. The study of DNA sequences is essential for medical diagnosis and biotechnology. However, storing genomic data in a traditional RDBMS is challenging because of its unstructured nature. NoSQL is an efficient solution for text-based data such as genomic sequences. We used around 3 GB of human genome data from the COVID-19 dataset provided by NCBI. The original data was in FASTA format, and we converted it to JSON. We also analyzed the query syntax, data load time, and query execution time of each system on the genomic data.
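The FASTA-to-JSON step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' preprocessing code: the field names (`id`, `description`, `sequence`, `length`), the helper names, and the sample records are assumptions chosen for illustration. It relies only on the standard FASTA convention that a record starts with a `>` header line followed by one or more sequence lines.

```python
import io
import json


def make_record(header, chunks):
    """Build one JSON-ready dict from a FASTA header and its sequence lines.

    The field names are illustrative assumptions, not the paper's schema.
    """
    seq_id, _, description = header.partition(" ")
    sequence = "".join(chunks)
    return {
        "id": seq_id,
        "description": description,
        "sequence": sequence,
        "length": len(sequence),
    }


def parse_fasta(handle):
    """Yield one dict per FASTA record read from a text file handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield make_record(header, chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield make_record(header, chunks)


def fasta_to_json_lines(handle):
    """Serialize every record as one JSON document per line."""
    return "\n".join(json.dumps(record) for record in parse_fasta(handle))


if __name__ == "__main__":
    # Hypothetical two-record input, not taken from the NCBI dataset.
    sample = io.StringIO(">seq1 example record\nACGT\nTTGA\n>seq2\nGGCC\n")
    print(fasta_to_json_lines(sample))
```

Writing one JSON document per line keeps each sequence as a separate record, which suits document-oriented stores; the same dictionaries could also be mapped to rows in a relational table for comparison.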


Author Profile
Soarov Chakraborty

Department of Computer Science and Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh

Author Profile
Shourav Paul

Department of Computer Science and Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh

Author Profile
K. M. Azharul Hasan

Department of Computer Science and Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh


Paper information

Publication year: 2021
Citations: 8
Publication country: Andorra
Site: IEEE
