Research field: Artificial Intelligence
Venue: Human-Centric Intelligent Systems
Breast cancer is a serious global health challenge, calling for the development of reliable and interpretable machine learning (ML) models for its early detection. This study compares four ML classification models (Logistic Regression, Decision Trees, Random Forest, and CatBoost) trained on three publicly available datasets: Wisconsin Breast Cancer (WBC), Wisconsin Diagnostic Breast Cancer (WDBC), and Coimbra. The models were evaluated on precision, recall, accuracy, F1-score, and area under the receiver operating characteristic curve (AUC-ROC), and their computational efficiency was measured by fit and test times. To promote transparency for clinical adoption, Local Interpretable Model-Agnostic Explanations (LIME) were used to provide feature-level analysis of the models’ predictions. Logistic Regression achieved the highest accuracy, with precision/recall values of 0.97/0.95 (WDBC), 0.95/0.92 (WBC), and 0.85/0.78 (Coimbra), and it also exceeded the other models in efficiency and consistency. LIME identified key factors that matched clinical expectations, including BMI (Coimbra), clump thickness (WBC), and radius mean (WDBC). The present research builds on previous work by combining multiple datasets and interpretable methodologies to address ML black-box challenges in medical diagnostics. Future research should investigate larger, multi-layered medical datasets and deep learning models to enhance classification accuracy.
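The evaluation metrics named in the abstract can be illustrated with a minimal sketch, assuming binary classification and a standard confusion matrix. The function name `classification_metrics` and the counts below are hypothetical and are not taken from the study; they only show how accuracy, precision, recall, and F1-score relate to one another.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative,
    and true-negative counts for the positive (e.g. malignant) class.
    """
    total = tp + fp + fn + tn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(tp=40, fp=2, fn=3, tn=69)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 combines precision and recall, a model with a large gap between the two scores lower on F1 than the arithmetic mean of the pair would suggest.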
| Publication year | 2025 |
|---|---|
| Citations | 0 |
| Country of publication | Nigeria |
| Site | Springer |
| Likes | 0 |