TY - GEN
T1 - Implementation of ensemble machine learning classifiers to predict diarrhoea with SMOTEENN, SMOTE, and SMOTETomek class imbalance approaches
AU - Mbunge, Elliot
AU - Sibiya, Maureen Nokuthula
AU - Takavarasha, Sam
AU - Millham, Richard C.
AU - Chemhaka, Garikayi
AU - Muchemwa, Benhildah
AU - Dzinamarira, Tafadzwa
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Diarrhoea continues to be a major public health burden and cause of death among children under 5 years in many developing countries. Rotavirus vaccination, hygiene practices, clean water, and health promotion are among the preventive measures implemented to improve child health. Nevertheless, tackling diarrhoea also requires the integration of ensemble machine learning (ML) into health systems. However, the integration of ensemble classifiers into health systems in many developing countries is still nascent. Therefore, this study applied the SMOTE, SMOTEENN and SMOTETomek class imbalance approaches and ensemble ML classifiers to predict diarrhoea. Ensemble methods significantly improve the performance of conventional ML classifiers. The study revealed that the ExtraTrees classifier achieved a high recall of 96.3%, accuracy of 94.3%, precision of 93.8%, and F1-score of 95% when predicting diarrhoea with SMOTEENN, compared with SMOTE and SMOTETomek. The performance of the HistGradientBoosting classifier also improved, achieving a high recall of 95.2%, accuracy of 91.5%, precision of 90.4%, and F1-score of 92.7%. The paper also shows that ensemble methods are increasingly becoming state-of-the-art solutions for multiple challenges encountered with ML algorithms, such as overfitting, underfitting, computational intensity, and representation. There is a need to develop data-driven applications that incorporate ensemble methods to model and predict diarrhoea, to help policymakers craft interventions aimed at improving child health.
AB - Diarrhoea continues to be a major public health burden and cause of death among children under 5 years in many developing countries. Rotavirus vaccination, hygiene practices, clean water, and health promotion are among the preventive measures implemented to improve child health. Nevertheless, tackling diarrhoea also requires the integration of ensemble machine learning (ML) into health systems. However, the integration of ensemble classifiers into health systems in many developing countries is still nascent. Therefore, this study applied the SMOTE, SMOTEENN and SMOTETomek class imbalance approaches and ensemble ML classifiers to predict diarrhoea. Ensemble methods significantly improve the performance of conventional ML classifiers. The study revealed that the ExtraTrees classifier achieved a high recall of 96.3%, accuracy of 94.3%, precision of 93.8%, and F1-score of 95% when predicting diarrhoea with SMOTEENN, compared with SMOTE and SMOTETomek. The performance of the HistGradientBoosting classifier also improved, achieving a high recall of 95.2%, accuracy of 91.5%, precision of 90.4%, and F1-score of 92.7%. The paper also shows that ensemble methods are increasingly becoming state-of-the-art solutions for multiple challenges encountered with ML algorithms, such as overfitting, underfitting, computational intensity, and representation. There is a need to develop data-driven applications that incorporate ensemble methods to model and predict diarrhoea, to help policymakers craft interventions aimed at improving child health.
KW - Children
KW - class imbalance
KW - Diarrhoea
KW - Ensemble methods
KW - machine learning
KW - Prediction
KW - Zimbabwe
UR - http://www.scopus.com/inward/record.url?scp=85153221989&partnerID=8YFLogxK
U2 - 10.1109/ICTAS56421.2023.10082744
DO - 10.1109/ICTAS56421.2023.10082744
M3 - Conference contribution
AN - SCOPUS:85153221989
T3 - 2023 Conference on Information Communications Technology and Society, ICTAS 2023 - Proceedings
BT - 2023 Conference on Information Communications Technology and Society, ICTAS 2023 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th Conference on Information Communications Technology and Society, ICTAS 2023
Y2 - 8 March 2023 through 9 March 2023
ER -