Application and user-specific data prefetching and parallel read algorithms for distributed file systems


Research field: Databases



Venue: Cluster Computing


Abstract

Cloud computing systems are widely used to deploy big data applications because of their high storage and computation capacity. The key storage component in a cloud computing environment is the distributed file system, which stores and processes the data produced by these applications. Users of such applications issue read requests more frequently than write requests, so most cloud-based applications demand optimal performance from the distributed file system, especially for read operations. Numerous caching and prefetching techniques have been proposed in the literature to enhance distributed file system performance. However, these techniques typically adopt a synchronous approach, performing either application data prefetching or user data prefetching when the user application starts executing, which may extend the read access time. Furthermore, data is prefetched based on either access frequency or reuse distance without considering access recency, which may lower the cache hit ratio. In this paper, we propose application-specific and user-specific prefetching algorithms that prefetch data from the distributed file system and store it in the system's multi-level caches, based on a combination of the access frequency and recency ranking of file blocks previously accessed by client application programs. Additionally, we divide the cache into two partitions, a user cache and an application cache, which store the prefetched data according to a popularity value calculated from user-level and application-level accesses. We also introduce a parallel read algorithm that reads data simultaneously from the multiple caches present in the distributed file system environment.
The simulation results demonstrate that the proposed algorithms improve distributed file system performance by 8 to 92 percent in terms of average read access time compared with existing approaches.
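
The abstract does not give the scoring formula or the read procedure, so the following Python sketch is only an illustration of the general idea, not the authors' algorithms. It assumes a popularity score that is a weighted sum of normalized access frequency and normalized recency (the `weight` parameter, the function names `rank_blocks`, `prefetch`, and `parallel_read`, and the dictionary-based caches are all hypothetical), fills a user partition and an application partition from separate access logs, and then reads several blocks concurrently from those partitions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def rank_blocks(access_log, weight=0.5):
    """Order blocks by an ASSUMED score: weighted frequency plus recency."""
    if not access_log:
        return []
    freq = Counter(access_log)
    # Index of the most recent access to each block (larger = more recent).
    last_seen = {block: i for i, block in enumerate(access_log)}
    max_freq = max(freq.values())
    span = max(len(access_log) - 1, 1)
    scores = {
        block: weight * freq[block] / max_freq
        + (1 - weight) * last_seen[block] / span
        for block in freq
    }
    # Highest score first; ties broken by block id for determinism.
    return sorted(scores, key=lambda b: (-scores[b], b))


def prefetch(access_log, capacity, weight=0.5):
    """Pick the top-`capacity` blocks to load into one cache partition."""
    return rank_blocks(access_log, weight)[:capacity]


def parallel_read(block_ids, caches, backing_store):
    """Read blocks concurrently; each comes from the first cache holding it,
    or from the backing store on a miss."""
    def read_one(block_id):
        for cache in caches:
            if block_id in cache:
                return cache[block_id]
        return backing_store[block_id]

    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the order of block_ids.
        return list(pool.map(read_one, block_ids))


# Hypothetical data: a tiny backing store and two access histories.
store = {f"b{i}": f"data{i}" for i in range(5)}
app_log = ["b0", "b1", "b0", "b2", "b0", "b1"]
user_log = ["b3", "b4", "b3"]

app_cache = {b: store[b] for b in prefetch(app_log, capacity=2)}
user_cache = {b: store[b] for b in prefetch(user_log, capacity=1)}
result = parallel_read(["b0", "b3", "b2"], [user_cache, app_cache], store)
```

In this toy run, `b2` is not prefetched into either partition, so its read falls through to the backing store. A real system would replace the dictionaries with the file system's multi-level caches and would need its own eviction and consistency handling, which this sketch leaves out.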


Author Profile
Anusha Nalajala

Department of CSE SRM University-AP Neerukonda Andhra Pradesh 522502 India

India
Author Profile
T. Ragunathan

Faculty of Engineering and Technology Sri Ramachandra Institute of Higher Education and Research Chennai Tamil Nadu India

Andorra
Author Profile
Ranesh Naha

Centre for Smart Analytics Federation University Australia Gippsland Campus Churchill VIC 3841 Australia

Australia

📄 Paper information

Publication year: 2023
Citations: 0
Publication countries: Australia, Andorra, India
Site: Springer
