On IT and OT Cybersecurity Datasets for Machine Learning-Based Intrusion Detection in Industrial Control Systems


Research field: Networking



Conference: International Conference on Smart Grid Inspired Future Technologies


Abstract

Intrusion detection plays a pivotal role in the cybersecurity of industrial control systems (ICS), protecting the safety of individuals, communities, and nations. Recently, machine learning-based intrusion detection models have been adopted to improve the detection of cyberattacks. However, a systematic approach to selecting an appropriate dataset for training these models is lacking. An appropriate dataset should match the required collection environment, i.e., Information Technology (IT) and Operational Technology (OT), and reflect the relevant specifications of the ICS under study, e.g., its deployed protocols. On this basis, this paper classifies existing intrusion detection datasets into IT and OT datasets. The IT datasets are examined in terms of their inclusion of attack and normal traffic, anonymity, number of packets, duration, and kind of traffic. The OT datasets are examined in terms of features such as data protocols, distribution, and data domain. We then discuss the gap between the detection method and the selection of an appropriate dataset in terms of (i) performance indicators, i.e., detection time and imbalanced data distribution, and (ii) use case, i.e., a summary of the communication layers, protocols, and attack types contained in the datasets. Finally, we discuss the essential features of an effective cybersecurity dataset to illustrate how an ideal dataset can be constructed.


Authors
Mohammad Pasha Shabanfar, Concordia University, Montreal, Canada
Yiheng Zhao, Concordia University, Montreal, Canada
Jun Yan, Concordia University, Montreal, Canada

Paper information

Publication year: 2025
Citations: 0
Country of publication: Canada
Site: Springer