Automated Feature Selection for Anomaly Detection in Network Traffic Data


Research field: Safety



Venue: ACM Transactions on Management Information Systems (TMIS), Volume 12, Issue 3


Abstract

Variable selection (also known as feature selection) is essential for reducing learning complexity by prioritizing features, particularly for massive, high-dimensional datasets such as network traffic data. In practice, however, performing feature selection effectively is not easy, despite the availability of existing selection techniques. In our initial experiments, we observed that existing selection techniques produce different sets of features even under the same condition (e.g., a fixed size for the resulting set). In addition, individual selection techniques perform inconsistently, sometimes better and sometimes worse than others, so relying on any single one of them is risky when building models from the selected features. More critically, there is strong demand for automating the selection process, since it otherwise requires laborious, intensive analysis by a group of experts. In this article, we explore challenges in automated feature selection with an application to network anomaly detection. We first present an ensemble approach that incorporates existing feature selection techniques; one of the proposed ensemble techniques, based on greedy search, works highly consistently and shows results comparable to the existing techniques. We also address the problem of when to stop the feature elimination process and present a set of methods designed to determine the number of features in the reduced feature set. Our experimental results on two recent network datasets show that the feature sets identified by the presented ensemble and stopping methods consistently yield performance comparable to conventional selection techniques while using a smaller number of features.
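The two ideas in the abstract, combining several selectors and deciding when to stop eliminating features, can be illustrated with a minimal sketch. This is not the article's method: the mean-rank (Borda-style) aggregation, the `max_drop` tolerance, the feature names, and the toy scoring function are all assumptions introduced here for illustration only.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence


def aggregate_rankings(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Combine several feature rankings by mean rank position (Borda-style).

    Assumption: every selector ranks the same set of features, best first.
    """
    features = set(rankings[0])
    assert all(set(r) == features for r in rankings)
    mean_rank: Dict[str, float] = {
        f: mean(r.index(f) for r in rankings) for f in features
    }
    # Sort by mean rank; break ties by name so the result is deterministic.
    return sorted(features, key=lambda f: (mean_rank[f], f))


def eliminate_until_drop(
    ordered: Sequence[str],
    score: Callable[[List[str]], float],
    max_drop: float = 0.01,
) -> List[str]:
    """Drop the lowest-ranked feature repeatedly.

    Stops as soon as removing one more feature would push the score more
    than max_drop below the score of the full feature set.
    """
    kept = list(ordered)
    baseline = score(kept)
    while len(kept) > 1:
        candidate = kept[:-1]
        if baseline - score(candidate) > max_drop:
            break
        kept = candidate
    return kept


# Hypothetical rankings from three different selection techniques.
rankings = [
    ["duration", "bytes", "packets", "port"],
    ["bytes", "duration", "port", "packets"],
    ["duration", "packets", "bytes", "port"],
]

# Hypothetical per-feature usefulness; a real score would come from
# training and validating a detection model on the candidate features.
usefulness = {"duration": 0.6, "bytes": 0.3, "packets": 0.005, "port": 0.0}


def toy_score(features: List[str]) -> float:
    return sum(usefulness[f] for f in features)


consensus = aggregate_rankings(rankings)
reduced = eliminate_until_drop(consensus, toy_score)
print(consensus)
print(reduced)
```

In this toy setting the stopping rule keeps only the features whose removal would noticeably lower the score. Under these assumptions, the choice of `max_drop` directly trades model quality against the size of the reduced feature set.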


Author Profile
Makiya Nakashima

Texas A&M University-Commerce, Commerce, TX

No information
Author Profile
Alexander Sim

Lawrence Berkeley National Laboratory, Berkeley, CA

Canada
Author Profile
Youngsoo Kim

Electronics and Telecommunications Research Institute, Daejeon, Korea

Andorra

📄 Paper Information

Publication year: 2021
Citations: 19
Publication country: Andorra, Canada
Site: ACM

Related Papers (26)