Research field: Databases
Venue: SN Computer Science
The World Health Organization (WHO) reported in 2018 that 422 million people worldwide were living with diabetes, making it one of the most widespread chronic, life-threatening conditions. Because diabetes has a comparatively long asymptomatic period, early diagnosis is important for obtaining clinically relevant findings. An estimated 50% of people with diabetes go undiagnosed because symptoms take so long to appear. Early detection therefore depends on properly evaluating both common and less common signs and symptoms, which may be present at various times between the onset of the illness and diagnosis. Researchers have relied heavily on data mining-based classification algorithms to build illness risk prediction models. Estimating a person’s risk of developing diabetes requires data on people who have recently developed diabetes or who are at high risk of developing it. We used a dataset of 768 instances, obtained from Kaggle and originally created by the National Institute of Diabetes and Digestive and Kidney Diseases. These instances were selected from a larger database using a variety of criteria. All patients in the dataset are women at least 21 years old and of indigenous Pima heritage. We analysed the dataset using the Naïve Bayes, Logistic Regression, and Random Forest algorithms. Random Forest provided the best accuracy on this dataset under both ten-fold cross-validation and the percentage split method. The goal is to predict whether a patient has diabetes based on their diagnostic results.
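The two evaluation protocols named in the abstract, ten-fold cross-validation and the percentage split method, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data is synthetic (only the 768-row size comes from the abstract), the feature count, the 66% training fraction, and the seeds are arbitrary choices the abstract does not specify, and `majority_fit` is a hypothetical placeholder standing in for a Naïve Bayes, Logistic Regression, or Random Forest model.

```python
import random
from collections import Counter


def percentage_split(n, train_fraction=0.66, seed=0):
    """Shuffle indices 0..n-1 and cut them into (train, test) lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_fraction)
    return idx[:cut], idx[cut:]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; every index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def majority_fit(X, y):
    """Hypothetical placeholder model: always predict the most common label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda row: label


def accuracy(fit, X, y, train, test):
    """Fit on the train indices and return the hit rate on the test indices."""
    model = fit([X[i] for i in train], [y[i] for i in train])
    hits = sum(model(X[i]) == y[i] for i in test)
    return hits / len(test)


def cross_val_accuracy(fit, X, y, k=10, seed=0):
    """Mean accuracy over the k folds."""
    scores = [accuracy(fit, X, y, tr, te)
              for tr, te in k_fold_indices(len(X), k, seed)]
    return sum(scores) / len(scores)


# Synthetic stand-in data: 768 rows as in the abstract; the 8 random
# features and the 35% positive rate are arbitrary assumptions.
rng = random.Random(42)
X = [[rng.random() for _ in range(8)] for _ in range(768)]
y = [1 if rng.random() < 0.35 else 0 for _ in range(768)]

train_idx, test_idx = percentage_split(len(X))
split_acc = accuracy(majority_fit, X, y, train_idx, test_idx)
cv_acc = cross_val_accuracy(majority_fit, X, y, k=10)
print(f"percentage split accuracy: {split_acc:.3f}")
print(f"ten-fold CV accuracy:      {cv_acc:.3f}")
```

Any model with the same `fit(X, y)` shape that returns a per-row prediction function can be passed in place of `majority_fit`, so several classifiers can be compared on identical splits.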
| Publication year | 2023 |
|---|---|
| Citations | 0 |
| Country of publication | Bahrain |
| Site | Springer |
| Likes | 0 |