The Impact of Feature Selection Techniques on Software Defect Identification Models


Research field: Verification



Conference: 2021 IEEE 12th International Conference on Software Engineering and Service Science (ICSESS)


Abstract

Defect identification is an important task for ensuring software quality. Recently, researchers have begun to use artificial intelligence techniques to improve the usability of static analysis (SA) tools by automatically identifying true defects among the reported SA alarms. Existing methods mainly use static code features to represent the defective code. However, irrelevant and redundant features threaten the performance of these machine learning methods, and feature selection techniques can be applied to alleviate this problem. Since many feature selection methods have been proposed, this paper conducts a rigorous experimental evaluation of the impact of feature selection techniques on defect identification and explores whether there is a minimum selection ratio at which defect identification models still achieve acceptable performance. Additionally, this paper proposes an effective feature selection approach based on majority voting, which combines the outputs of different feature selection techniques. The experimental results on five open-source projects show that there is a best ratio (20%) for feature selection, which achieves satisfactory performance while using far fewer features for defect identification. This finding can serve as a practical guideline for software defect identification.
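The majority-voting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact voting rule, so the following makes assumptions: each selector is represented by a hypothetical table of feature scores, each selector nominates its top 20% of features, and a feature is kept when more than half of the selectors nominate it. The function names `top_ratio` and `majority_vote` and all scores are illustrative and are not taken from the paper.

```python
from collections import Counter


def top_ratio(scores, ratio=0.2):
    """Return the names of the top `ratio` share of features by score."""
    # At least one feature is always kept; ties are broken by name.
    k = max(1, round(len(scores) * ratio))
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return set(ranked[:k])


def majority_vote(score_tables, ratio=0.2):
    """Keep features nominated by more than half of the selectors.

    Assumption: a strict majority is required; the paper may use a
    different threshold or tie-breaking rule.
    """
    votes = Counter()
    for scores in score_tables:
        votes.update(top_ratio(scores, ratio))
    threshold = len(score_tables) / 2
    return sorted(name for name, count in votes.items() if count > threshold)


# Hypothetical scores from three selectors over ten static code features.
features = [f"f{i}" for i in range(10)]
selector_a = dict(zip(features, [9, 8, 1, 2, 3, 4, 5, 0, 6, 7]))
selector_b = dict(zip(features, [9, 1, 8, 2, 3, 4, 5, 0, 6, 7]))
selector_c = dict(zip(features, [8, 9, 1, 2, 3, 4, 5, 0, 6, 7]))

selected = majority_vote([selector_a, selector_b, selector_c])
print(selected)
```

Under this rule the final subset can be smaller than 20% of the features when the selectors disagree, so a practical variant might lower the vote threshold or rank features by vote count instead.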


Author Profile
Huiquan Gong

School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK

Author Profile
Yuwei Zhang

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China


Paper information

Publication year: 2021
Citations: 127
Country of publication: Andorra
Site: IEEE
