Implementation of ensemble machine learning classifiers to predict diarrhoea with SMOTEENN, SMOTE, and SMOTETomek class imbalance approaches

Elliot Mbunge, Maureen Nokuthula Sibiya, Sam Takavarasha, Richard C. Millham, Garikayi Chemhaka, Benhildah Muchemwa, Tafadzwa Dzinamarira

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Diarrhoea continues to be a major public health burden and cause of death among children under 5 years in many developing countries. Rotavirus vaccination, hygiene practices, clean water, and health promotion are among the preventive measures implemented to improve child health. Nevertheless, tackling diarrhoea also requires integrating ensemble machine learning (ML) into health systems, and this integration is still nascent in many developing countries. Therefore, this study applied the SMOTE, SMOTEENN, and SMOTETomek class imbalance approaches together with ensemble ML classifiers to predict diarrhoea. Ensemble methods significantly improve the performance of conventional ML classifiers. The study revealed that, with SMOTEENN, the ExtraTrees classifier achieved a recall of 96.3%, accuracy of 94.3%, precision of 93.8%, and F1-score of 95%, outperforming its results with SMOTE and SMOTETomek. The performance of the HistGradientBoosting classifier also improved, reaching a recall of 95.2%, accuracy of 91.5%, precision of 90.4%, and F1-score of 92.7%. The paper also shows that ensemble methods are increasingly becoming state-of-the-art solutions for multiple challenges encountered with ML algorithms, such as overfitting, underfitting, computational cost, and representation. There is a need to develop data-driven applications that incorporate ensemble methods to model and predict diarrhoea, and so help policymakers craft interventions aimed at improving child health.
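
To make the class imbalance step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library only. It is not the authors' implementation: the function name `smote_oversample`, the parameters `k` and `seed`, and the toy data are assumptions made for illustration, and the abstract does not state which software the study used.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, minority_label, k=5, seed=0):
    """Top up the minority class to the majority count (simplified SMOTE).

    Each synthetic row is a random point on the line segment between a
    minority row and one of its k nearest minority neighbours.
    Assumes at least two minority rows and numeric features only.
    """
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == minority_label]
    counts = Counter(y)
    n_needed = max(counts.values()) - counts[minority_label]

    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])

    return X + synthetic, y + [minority_label] * n_needed


# Hypothetical toy data: six majority rows (label 0), three minority rows (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.1],
     [1.0, 1.0], [1.1, 0.9], [0.9, 1.2]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_res, y_res = smote_oversample(X, y, minority_label=1, k=2)
print(Counter(y_res))  # both classes now have the same number of rows
```

SMOTEENN and SMOTETomek follow the same oversampling step with a cleaning pass (Edited Nearest Neighbours and Tomek-link removal, respectively), which this sketch omits. In practice, a maintained implementation such as the `SMOTE`, `SMOTEENN`, and `SMOTETomek` classes in the imbalanced-learn package would normally be preferred over hand-written code.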

Original language: English
Title of host publication: 2023 Conference on Information Communications Technology and Society, ICTAS 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665489300
Publication status: Published - 2023
Externally published: Yes
Event: 7th Conference on Information Communications Technology and Society, ICTAS 2023 - Durban, South Africa
Duration: 8 Mar 2023 to 9 Mar 2023

Publication series

Name: 2023 Conference on Information Communications Technology and Society, ICTAS 2023 - Proceedings

Conference

Conference: 7th Conference on Information Communications Technology and Society, ICTAS 2023
Country/Territory: South Africa
City: Durban
Period: 8/03/23 to 9/03/23

Keywords

  • Children
  • class imbalance
  • Diarrhoea
  • Ensemble methods
  • machine learning
  • Prediction
  • Zimbabwe

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Health (social science)
  • Urban Studies
