TY - GEN
T1 - A Multiclass Approach to Predicting Diabetes Stage Using Machine Learning
AU - Mbuya, Emmanuel
AU - Mokheleli, Tsholofelo Diphoko
AU - Bokaba, Tebogo
AU - Ndayizigamiye, Patrick
N1 - Publisher Copyright:
Copyright © 2023 [Emmanuel Mbuya, Tsholofelo Diphoko Mokheleli, Tebogo Bokaba, Patrick Ndayizigamiye].
PY - 2023
Y1 - 2023
AB - The global prevalence of diabetes mellitus poses a significant public health challenge. This study aims to use dimensionality reduction methods with machine learning (ML) algorithms to predict the diabetes stage and assess the performance of the developed predictive model. Unlike many studies on predicting diabetes, this study makes use of both medical indicators and social determinants of health to predict the risk of diabetes. Utilizing a large dataset obtained from the Centers for Disease Control and Prevention, comprising 253,680 instances and 23 features, this study employs various ML algorithms and dimensionality reduction techniques. In addition, the study applies several evaluation metrics, namely accuracy, precision, recall, F1 score, Receiver Operating Characteristic, Area Under the Curve, and balanced accuracy. The study finds that Logistic Regression and XGBoost models outperform other classifiers, achieving an accuracy of 85%. The study suggests that future work could benefit from incorporating deep learning techniques.
KW - Cross-validation
KW - Diabetes Mellitus
KW - Dimensionality Reduction
KW - Machine Learning
UR - http://www.scopus.com/inward/record.url?scp=85192510008&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85192510008
T3 - International Conference on Information Systems, ICIS 2023: "Rising like a Phoenix: Emerging from the Pandemic and Reshaping Human Endeavors with Digital Technologies"
BT - International Conference on Information Systems, ICIS 2023
PB - Association for Information Systems
T2 - 44th International Conference on Information Systems: Rising like a Phoenix: Emerging from the Pandemic and Reshaping Human Endeavors with Digital Technologies, ICIS 2023
Y2 - 10 December 2023 through 13 December 2023
ER -