ITRT (IT Research Trends)

A Learning-Based Scheduler for High Volume Processing in Data Warehouse Using Graph Neural Networks

Research field: Databases

Paper keywords: #heuristic #industrial #inefficient #jobs #job

Conference: International Conference on Parallel and Distributed Computing: Applications and Technologies

Abstract

The process of extracting, transforming, and loading (ETL) a high volume of data has played an essential role in data integration strategies for data warehouse systems in recent years. Almost all distributed ETL systems currently in use in both industrial and academic contexts employ a simple heuristic-based scheduling policy. Such a policy tries to process a stream of jobs in a best-effort fashion; however, it can result in under-utilization of computing resources in most practical scenarios. In turn, such an inefficient resource allocation strategy can cause an unwanted increase in the total completion time of data processing jobs. In this paper, we develop an efficient reinforcement learning technique that uses a Graph Neural Network (GNN) model to combine all submitted task graphs into a single graph, which simplifies the representation of states within the environment and allows the submitted jobs to be processed in parallel efficiently. In addition, to augment the embedding features in each leaf node, we pass messages from leaf to root so that the nodes can collaboratively represent actions within the environment. The performance results show up to a 15% improvement in job completion time compared to the state-of-the-art machine learning scheduler and up to a 20% improvement compared to a tuned heuristic-based scheduler.
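
The two structural ideas in the abstract, merging every submitted task graph into one graph and passing messages from leaf to root, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the (job_id, node) naming scheme, the (parent, child) edge convention, and the toy feature values are all assumptions made for illustration, and a plain sum stands in for the learned transformations a real GNN would apply.

```python
from collections import defaultdict


def merge_job_graphs(jobs):
    """Combine per-job task graphs into a single graph.

    jobs maps job_id -> (node_features, edges), where node_features maps a
    node name to a list of floats and edges is a list of (parent, child)
    pairs. Nodes are renamed to (job_id, node) so that jobs never collide.
    (Hypothetical data layout, chosen only for this sketch.)
    """
    features = {}
    children = defaultdict(list)
    for job_id, (node_features, edges) in jobs.items():
        for node, feat in node_features.items():
            features[(job_id, node)] = list(feat)
        for parent, child in edges:
            children[(job_id, parent)].append((job_id, child))
    return features, dict(children)


def leaf_to_root_embeddings(features, children):
    """One bottom-up pass over the merged graph.

    Each node's embedding is its own feature vector plus the sum of its
    children's embeddings, so information flows from leaves toward roots.
    The unweighted sum is a stand-in for a learned aggregation.
    """
    embeddings = {}

    def embed(node):
        if node not in embeddings:
            total = list(features[node])
            for child in children.get(node, []):
                for i, value in enumerate(embed(child)):
                    total[i] += value
            embeddings[node] = total
        return embeddings[node]

    for node in features:
        embed(node)
    return embeddings


# Two toy jobs with made-up one-dimensional features.
jobs = {
    "a": ({"root": [1.0], "left": [2.0], "right": [3.0]},
          [("root", "left"), ("root", "right")]),
    "b": ({"root": [10.0], "leaf": [5.0]},
          [("root", "leaf")]),
}
features, children = merge_job_graphs(jobs)
embeddings = leaf_to_root_embeddings(features, children)
```

In this sketch a leaf keeps its own features, while each root ends up summarizing everything beneath it, which is the kind of per-node summary a scheduling policy could then score when choosing an action.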

📄 Paper information

Publication year	2022
Citations	0
Country of publication	Andorra
Site	Springer

Vivek Bengre

M. Reza HoseinyFarahabady

Mohammad Pivezhandi

Albert Y. Zomaya

Ali Jannesari

Related papers (27)
