Cost-Optimized Cloud Scheduling for ETL and Big Data Using AI


Research field: Databases



Conference: 2025 13th International Symposium on Digital Forensics and Security (ISDFS)


Abstract

Cloud-based data pipelines are critical for large-scale ETL and big data analytics, yet inefficient scheduling leads to high costs and resource underutilization. Traditional approaches, such as static provisioning and rule-based auto-scaling, fail to adapt to dynamic workloads and changing cloud pricing. This paper introduces an AI-driven, cost-aware scheduling framework that integrates LSTM-based workload forecasting, Isolation Forest anomaly detection, and real-time cost analytics. Unlike purely reactive or threshold-based methods, this approach adjusts resources proactively to minimize costs and prevent performance bottlenecks. Experiments using the Google Cluster Workload Traces dataset and Azure VM Pricing data demonstrate a reduction of up to 30-40% in operational expenses, along with improved processing efficiency and auto-scaling responsiveness. However, the proposed solution may require further adaptation for extreme-scale scenarios or specialized compliance environments. These findings highlight the novelty of combining predictive analytics, anomaly detection, and cost optimization, offering a robust strategy for improving performance and reducing expenses in cloud-based ETL and big data workflows.
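The three stages named in the abstract (anomaly filtering, workload forecasting, cost-aware sizing) can be illustrated with a minimal sketch. This is not the authors' implementation: a median-absolute-deviation filter stands in for Isolation Forest, a moving average stands in for the LSTM forecaster, and all function names, thresholds, capacities, and prices are hypothetical values chosen for illustration.

```python
import math
import statistics


def drop_anomalies(samples, threshold=3.5):
    """Remove outliers by median absolute deviation.

    Assumption: a simple stand-in for Isolation Forest, not the paper's detector.
    """
    med = statistics.median(samples)
    mad = statistics.median([abs(x - med) for x in samples])
    if mad == 0:
        return list(samples)
    return [x for x in samples if abs(x - med) / mad <= threshold]


def forecast_next(samples, window=3):
    """Predict the next load as the mean of the last `window` points.

    Assumption: a moving average stands in for the LSTM forecaster.
    """
    return statistics.fmean(samples[-window:])


def plan_capacity(history, vm_capacity, hourly_price, headroom=1.2):
    """Return (vm_count, hourly_cost) sized for the forecast load plus headroom."""
    clean = drop_anomalies(history)
    predicted = forecast_next(clean)
    vm_count = max(1, math.ceil(predicted * headroom / vm_capacity))
    return vm_count, vm_count * hourly_price


# Hypothetical load history (jobs per minute) with one spike at 900.
history = [100, 110, 105, 120, 900, 115, 125]
vm_count, cost = plan_capacity(history, vm_capacity=50, hourly_price=0.5)
print(f"Provision {vm_count} VMs at {cost:.2f} per hour")
```

In this sketch, filtering the spike before forecasting keeps a single anomalous sample from inflating the provisioned VM count, which is the kind of interaction between anomaly detection and cost optimization the abstract describes.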


Authors
Chaitanya Krishnama (Dallas and Seattle Chapter, Andorra)
Raghavender Puchhakayala (Dallas and Seattle Chapter, Andorra)
Sudarshan Kotha (Dallas and Seattle Chapter, Andorra)

Paper Information

Publication year: 2025
Citations: 36
Country of publication: Andorra
Site: IEEE
