Intelligent Assistance with ML in Data Mapping ETL Processing


Research field: Databases



Conference: 2022 IEEE Information Technologies & Smart Industrial Systems (ITSIS)


Abstract

ETL (extract, transform, load) is a major component in the construction of a data warehouse. It acts as an intermediary between the data sources and the data store: it collects and prepares the data, applying a set of transformations (cleaning, standardization, filtering, aggregation, etc.) before loading them into the data warehouse. It is designed as an engine that extracts data from a variety of heterogeneous sources, homogenizes them so that they can be used together, and ultimately delivers consistent data suitable for analysis and decision-making. The complexity of ETL lies in the large volume of data to be processed and in their syntactic, structural, and semantic heterogeneity. This makes ETL generation the most expensive part of a decision-support project in terms of time and budget, representing about three-quarters of the project. We propose a new approach to ETL generation that combines metadata with machine learning techniques. Our goal is to reduce ETL complexity: metadata is introduced to reduce the resources required, and machine learning techniques are used to make the process more “intelligent” and adaptable to different types of data.
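
To make the transformation steps named in the abstract concrete, the following is a minimal sketch of an extract, transform, load pass in plain Python. It is not the authors' method and involves no metadata or machine learning; the sample rows, the field names `country` and `amount`, and the functions `extract`, `transform`, and `load` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def extract():
    # Hypothetical rows standing in for records pulled from heterogeneous sources.
    return [
        {"country": " tunisia ", "amount": "10.5"},
        {"country": "TUNISIA", "amount": "4.5"},
        {"country": "France", "amount": None},
        {"country": "france", "amount": "-3"},
        {"country": "France", "amount": "7"},
    ]


def transform(rows):
    totals = defaultdict(float)
    for row in rows:
        # Cleaning: skip rows whose amount is missing.
        if row["amount"] is None:
            continue
        # Standardization: trim whitespace and unify the casing of country names.
        country = row["country"].strip().title()
        amount = float(row["amount"])
        # Filtering: discard negative amounts (an assumed business rule).
        if amount < 0:
            continue
        # Aggregation: sum the amounts per country.
        totals[country] += amount
    return dict(totals)


def load(totals, warehouse):
    # Loading: write the aggregated values into the target store (a dict here).
    warehouse.update(totals)
    return warehouse


warehouse = load(transform(extract()), {})
```

In this toy setting the heterogeneity is purely syntactic (inconsistent casing and whitespace). The structural and semantic heterogeneity that the abstract identifies as the main source of ETL complexity would need considerably more than string normalization.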


Authors
Ahlem Ben Younes, Laboratory LaTICE, University of Tunis, ENSIT, Tunis, Tunisia
Leila Ben Ayed, Laboratory LIPSIC, University of Manouba, FST, Tunisia
Marwa Najjar, Laboratory LaTICE, University of Tunis, ENSIT, Tunis, Tunisia

Paper information

Publication year: 2022
Citations: 1
Country of publication: Tunisia
Source: IEEE