Comparative analysis of text mining and clustering techniques for assessing functional dependency between manual test cases


Research field: Verification



Venue: Software Quality Journal


Abstract

Text mining techniques, particularly those leveraging machine learning for natural language processing, have gained significant attention for qualitative data analysis in software testing. However, their complexity and lack of transparency can pose challenges, especially in safety-critical domains where simpler, interpretable solutions are often preferred unless accuracy is heavily compromised. This study investigates the trade-offs between complexity, effort, accuracy, and utility in text mining and clustering techniques, focusing on their application to detecting functional dependencies among manual integration test cases in safety-critical systems. Using empirical data from an industrial testing project at ALSTOM Sweden, we evaluate various string distance methods, normalized compression distance (NCD) compressors, and machine learning approaches. The results highlight the impact of preprocessing techniques, such as tokenization, and intrinsic factors, such as text length, on algorithm performance. Findings demonstrate how text mining and clustering can be optimized for safety-critical contexts, offering actionable insights for researchers and practitioners aiming to balance simplicity and effectiveness in their testing workflows.
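
As an illustration of the compression-based similarity the abstract refers to, the following is a minimal sketch of normalized compression distance between two test case descriptions. It is not the authors' implementation: the choice of zlib as the compressor, the function names, and the sample test case texts are all assumptions made for this example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    x, y = a.encode("utf-8"), b.encode("utf-8")
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical test case descriptions, invented for illustration only.
tc_brake = "Verify that the brake system activates when the emergency signal is received"
tc_fan = "Check the air conditioning fan speed display on the driver panel after startup"

same = ncd(tc_brake, tc_brake)   # near 0: the second copy compresses almost for free
other = ncd(tc_brake, tc_fan)    # closer to 1: little shared content to exploit
```

Lower values suggest more shared content between two texts. On short inputs the compressor's fixed overhead distorts the score, which is consistent with the abstract's remark that text length affects algorithm performance.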


Authors

Sahar Tahvili
Compute Platforms Engineering Unit, Ericsson AB, Stockholm, Sweden

Leo Hatvani
Department of Industrial AI Systems, Mälardalen University, Västerås, Sweden

Michael Felderer
Department of Industrial AI Systems, Mälardalen University, Västerås, Sweden

Paper information

Publication year: 2025
Citations: 0
Publication countries: Anguilla, Germany, Andorra, Sweden
Site: Springer