A Hybrid ETL Framework for Optimized Data Ingestion and Real-Time Processing in Data Lakehouse Architectures using AI-Driven Orchestration and Contextual Data Partitioning


Research field: Databases



Conference: 2025 7th International Conference on Signal Processing, Computing and Control (ISPCC)


Abstract

Rapidly expanding data environments are increasing the need for effective Extract, Transform, Load (ETL) technologies that can manage growing volumes of both structured and unstructured data in data lakehouse architectures. Existing ETL systems struggle with performance, scalability, and adaptability, which makes it difficult to meet the rising demands of both batch and real-time data processing. To address these challenges, this study proposes a novel Hybrid ETL Framework that combines contextual data partitioning with AI-driven orchestration to optimize data ingestion and real-time analysis in data lakehouses. The system addresses key issues, such as inefficient data handling, slow processing times, and inadequate transformation methods, by dynamically adjusting the ETL pipeline based on user requests, context-aware partitioning, and data characteristics. The AI-driven orchestration schedules jobs efficiently by switching between batch and real-time processing according to data importance, which improves both performance and flexibility. Contextual partitioning reduces processing costs and improves query performance by automatically organizing data according to domain-specific knowledge and query intent. The primary goal of the framework is to improve data transformation and loading performance in data lakehouses, enabling faster and more accurate decision-making. Preliminary results show a 30% reduction in ETL processing time and significant improvements in query efficiency and accuracy, particularly in complex, cross-domain data retrieval environments. The proposed approach offers a scalable, adaptable, and intelligent solution for modern data lakehouse scenarios and outperforms traditional ETL methods.
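
The abstract does not specify how the orchestration or partitioning is implemented. As a rough illustration only, the sketch below shows one way importance-based switching between batch and real-time processing could be combined with a domain-based partition key. Every name here (`Record`, `REALTIME_THRESHOLD`, `choose_mode`, `partition_key`, `route`) is hypothetical, and a fixed threshold stands in for the AI-driven decision described in the paper.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical cutoff: records at or above this importance go to the
# real-time path. The paper's AI-driven orchestrator is NOT modeled here;
# a fixed threshold is used purely as a stand-in.
REALTIME_THRESHOLD = 0.7


@dataclass
class Record:
    domain: str          # assumed domain label, e.g. "sales" or "iot"
    importance: float    # assumed importance score in [0.0, 1.0]
    payload: dict = field(default_factory=dict)


def choose_mode(record: Record, threshold: float = REALTIME_THRESHOLD) -> str:
    """Pick the processing path from the record's importance score."""
    return "realtime" if record.importance >= threshold else "batch"


def partition_key(record: Record) -> str:
    """Build a partition key from the domain and the chosen processing path."""
    return f"{record.domain}/{choose_mode(record)}"


def route(records: list[Record]) -> dict[str, list[Record]]:
    """Group records by partition key so each group can be loaded together."""
    partitions: dict[str, list[Record]] = defaultdict(list)
    for record in records:
        partitions[partition_key(record)].append(record)
    return dict(partitions)


if __name__ == "__main__":
    sample = [
        Record("sales", 0.9),
        Record("sales", 0.2),
        Record("iot", 0.7),
    ]
    for key, group in sorted(route(sample).items()):
        print(key, len(group))
```

In a real system the threshold would presumably be replaced by a learned policy, and the partition key would also reflect query intent, which this sketch leaves out.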


Author Profile
K. Abirami

Department of Computer and Information Science, Annamalai University, Chidambaram

Andorra
Author Profile
S. Punitha

Department of Computer Science, D.G. Govt. Arts College for Women, Mayiladuthurai

No information

Paper information

Publication year: 2025
Citations: 30
Country of publication: Andorra
Site: IEEE
