Explainable Surface-Level Malware Analysis Through Scalable and Accurate Feature Selection


Research field: Safety



Conference: IFIP International Conference on Artificial Intelligence Applications and Innovations


Abstract

Surface-level malware analysis offers significant advantages over deep static and dynamic analysis by avoiding the complex and time-consuming process of reverse engineering obfuscated code and eliminating the risk of malware execution. Recent studies have shown that surface-level features alone can achieve high classification accuracy in distinguishing malware from benign software. However, an inherent challenge remains: surface-level datasets often contain an enormous number of features, hindering explainability and manual investigation. A notable example is the Ember dataset, a widely used public dataset for malware detection, which originally consists of more than ten million features. In malware detection, this issue primarily affects memory consumption and computational efficiency, which can be mitigated using techniques such as feature hashing. In contrast, malware analysis requires explainability involving manual investigation based on domain expertise, which necessitates focusing on a small subset of highly relevant features. While feature selection has been extensively studied in machine learning, existing algorithms struggle to balance scalability and selection accuracy. Recently, the authors proposed a novel feature selection algorithm, BornFS, which significantly improves this trade-off, reducing over ten million features of the Ember dataset to only 155 in under two hours while ensuring a mutual information loss below 5%. This paper presents a surface-level malware analysis method that leverages the scalable and accurate BornFS, demonstrating the effectiveness of feature-selection-based malware analysis through experiments.
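The contrast the abstract draws between feature hashing (suited to detection) and feature selection (suited to explainable analysis) can be illustrated with a minimal sketch. This is not BornFS, whose internals the abstract does not describe; it is a plain per-feature mutual-information ranking on toy data. The feature names, the samples, and the helpers `mutual_information`, `select_top_k`, and `hash_features` are all hypothetical and chosen only for illustration.

```python
import math
import zlib
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def select_top_k(samples, labels, k):
    """Score each named feature by MI with the label and keep the k best.

    Naive univariate ranking (an assumption of this sketch): it ignores
    redundancy between features, which a real selector must handle.
    """
    names = sorted({name for s in samples for name in s})
    scores = {
        name: mutual_information([name in s for s in samples], labels)
        for name in names
    }
    # Stable sort: ties keep alphabetical order
    ranked = sorted(names, key=lambda name: -scores[name])
    return ranked[:k], scores


def hash_features(sample, n_buckets):
    """Feature hashing: fixed-size vector, but bucket indices lose the names."""
    vec = [0] * n_buckets
    for name in sample:
        vec[zlib.crc32(name.encode("utf-8")) % n_buckets] += 1
    return vec


# Toy data (hypothetical): each sample is a set of surface-level feature names.
samples = [
    {"import:VirtualAllocEx", "import:WriteProcessMemory", "section:.text"},
    {"import:VirtualAllocEx", "section:.text", "section:.rsrc"},
    {"import:CreateFileW", "section:.text"},
    {"import:CreateFileW", "section:.text", "section:.rsrc"},
]
labels = [1, 1, 0, 0]  # 1 = malware, 0 = benign

selected, scores = select_top_k(samples, labels, k=2)
hashed = hash_features(samples[0], n_buckets=8)
```

Here `selected` holds human-readable feature names that an analyst can inspect directly, whereas `hashed` is a compact vector whose bucket indices no longer map back to a single feature. That difference is why hashing helps with memory and speed in detection but does not support the manual investigation that analysis requires.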


Author Profiles

Lulu Ito
Keio University, Fujisawa, Kanagawa, Japan
Country: Japan

Naoya Sawada
Gakushuin University, Tokyo, Japan
Country: Japan

Katsuyuki Maeda
Nantoka Co. Ltd., Nagasaki, Japan
Country: Colombia

📄 Paper Information

Publication year: 2025
Citations: 0
Publication countries: Colombia, Japan
Site: Springer
