Is iterative feature selection technique efficient enough? A comparative performance analysis of RFECV feature selection technique in ransomware classification using SHAP


Research field: Networking



Venue: Discover Internet of Things


Abstract

Early ransomware detection is a central concern in cybersecurity. Feature selection is critical in this context: it enhances detection accuracy, mitigates overfitting, and reduces training time by eliminating irrelevant and redundant data. However, iterative feature selection techniques pick the best-performing subset of features through an iterative process, which leaves a chance that a crucial feature is not selected, and the number of selected features may not be optimal or the most suitable for a given problem. Hence, this study conducts a comparative performance analysis of an iterative feature selection technique, Recursive Feature Elimination with Cross-Validation (RFECV), with six supervised Machine Learning (ML) models to evaluate its efficiency in classifying ransomware using Application Programming Interface (API) call and network traffic features. The study employs an Explainable Artificial Intelligence (XAI) framework, SHapley Additive exPlanations (SHAP), to derive the crucial features when RFECV is not integrated with the ML models. These features are then compared with the features RFECV selects when it is integrated. Results show that the ML models achieve better classification accuracy on two datasets without RFECV. In addition, RFECV falls short of selecting impactful features, leading to more false alarms. Moreover, it cannot rank features by importance, which reduces its overall efficiency in ransomware classification. Thus, this study underscores the importance of integrating explainability techniques to identify critical features, rather than relying solely on iterative feature selection methods, to enhance the resilience of ransomware detection systems.
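The contrast the abstract draws, between a selector that returns a subset and an attribution method that scores every feature, can be illustrated with a small sketch. This is not the paper's pipeline: the feature names, the `GAIN` table, and the `score` function are hypothetical stand-ins for a cross-validated model, the greedy `backward_elimination` loop is a simplified stand-in for RFECV, and `shapley_values` enumerates exact Shapley values on the toy score rather than calling the SHAP library, which approximates them for real models.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names and toy contributions in accuracy percentage
# points; none of these values come from the paper.
FEATURES = ["api_call_count", "dns_queries", "file_writes", "packet_size"]
GAIN = {"api_call_count": 20, "dns_queries": 5, "file_writes": 15, "packet_size": 0}


def score(subset):
    """Toy stand-in for the validation accuracy (in %) of a model on `subset`."""
    chosen = set(subset)
    value = 50 + sum(GAIN[f] for f in FEATURES if f in chosen)
    # Assumed interaction: these two features are worth more together.
    if {"api_call_count", "file_writes"} <= chosen:
        value += 6
    return value


def shapley_values(features, value_fn):
    """Exact Shapley value of each feature with respect to `value_fn`."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                total += weight * (value_fn(coalition + (f,)) - value_fn(coalition))
        phi[f] = total
    return phi


def backward_elimination(features, value_fn):
    """Greedy elimination: drop the least harmful feature, keep the best subset seen."""
    current = list(features)
    best_subset, best_score = list(current), value_fn(current)
    while len(current) > 1:
        # Remove the feature whose absence leaves the highest score.
        drop = max(current, key=lambda f: value_fn([g for g in current if g != f]))
        current.remove(drop)
        s = value_fn(current)
        if s >= best_score:  # prefer the smaller subset on ties
            best_subset, best_score = list(current), s
    return best_subset, best_score


if __name__ == "__main__":
    subset, subset_score = backward_elimination(FEATURES, score)
    print("selected subset:", subset, "score:", subset_score)
    ranking = sorted(shapley_values(FEATURES, score).items(), key=lambda kv: -kv[1])
    for name, value in ranking:
        print(f"{name}: {value:.2f}")
```

On this toy score the elimination loop only reports which features survive, whereas the Shapley values give each feature a magnitude that can be sorted into a ranking. Their sum equals the gap between the full-set score and the empty-set score, which is the additivity property SHAP relies on.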


Author Profile
Rawshan Ara Mowri

Department of Computer Science, North Carolina A&T State University, Greensboro, NC 27411, USA

Madhuri Siddula

Department of Computer Science, North Carolina A&T State University, Greensboro, NC 27411, USA

Kaushik Roy

Department of Computer Science, North Carolina A&T State University, Greensboro, NC 27411, USA


Paper information

Publication year: 2023
Citations: 3
Publication country: New Caledonia
Site: Springer