Research field: Databases
Conference: 2025 13th International Symposium on Digital Forensics and Security (ISDFS)
Cloud-based data pipelines are critical for large-scale ETL and big data analytics, yet inefficient scheduling leads to high costs and resource underutilization. Traditional approaches, such as static provisioning and rule-based auto-scaling, fail to adapt to dynamic workloads and changing cloud pricing. This paper introduces an AI-driven, cost-aware scheduling framework that integrates LSTM-based workload forecasting, Isolation Forest anomaly detection, and real-time cost analytics. Unlike purely reactive or threshold-based methods, this approach adjusts resources proactively to minimize costs and prevent performance bottlenecks. Experiments using the Google Cluster Workload Traces dataset and Azure VM Pricing data demonstrate a reduction in operational expenses of up to 30-40%, along with improved processing efficiency and auto-scaling responsiveness. However, the proposed solution may require further adaptation for extreme-scale scenarios or specialized compliance environments. These findings highlight the novelty of combining predictive analytics, anomaly detection, and cost optimization, offering a robust strategy for improving performance and reducing expenses in cloud-based ETL and big data workflows.
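
The abstract does not give implementation details, so the following is only a minimal sketch of the forecast, filter, and provision loop it describes. It uses deliberately simple stand-ins: a median-absolute-deviation filter in place of the Isolation Forest, and a moving average in place of the LSTM forecaster. All function names, parameters, and sample numbers (`filter_anomalies`, `forecast_next`, `plan_capacity`, the load history, VM capacity, and price) are hypothetical and do not come from the paper.

```python
import math
from statistics import mean, median


def filter_anomalies(load, threshold=3.5):
    """Drop outliers using a modified z-score based on the median absolute
    deviation (MAD). Stand-in for the paper's Isolation Forest step."""
    med = median(load)
    mad = median(abs(x - med) for x in load)
    if mad == 0:
        # No spread around the median: nothing can be flagged.
        return list(load)
    return [x for x in load if 0.6745 * abs(x - med) / mad <= threshold]


def forecast_next(load, window=3):
    """Predict the next load value as the mean of the last `window` points.
    Stand-in for the paper's LSTM forecaster."""
    return mean(load[-window:])


def plan_capacity(forecast, capacity_per_vm, price_per_vm_hour, headroom=0.2):
    """Choose the smallest VM count that covers the forecast plus headroom,
    and return it with the resulting hourly cost."""
    needed = forecast * (1 + headroom) / capacity_per_vm
    vms = max(1, math.ceil(needed))
    return vms, vms * price_per_vm_hour


# Hypothetical load history (requests per second) with one spike.
history = [100, 110, 105, 900, 120, 115, 125]
clean = filter_anomalies(history)
forecast = forecast_next(clean)
vms, hourly_cost = plan_capacity(forecast, capacity_per_vm=50,
                                 price_per_vm_hour=0.10)
print(f"forecast={forecast}, vms={vms}, hourly_cost={hourly_cost:.2f}")
```

Filtering before forecasting keeps a single spike from inflating the provisioning decision. Whether the paper orders the steps this way is an assumption made here for illustration.
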
| Publication year | 2025 |
|---|---|
| Citations | 36 |
| Country of publication | Andorra |
| Site | IEEE |