TY - GEN
T1 - An In-Depth Comparative Analysis of Machine Learning Techniques for Addressing Class Imbalance in Mental Health Prediction
AU - Mokheleli, Tsholofelo
AU - Bokaba, Tebogo
AU - Museba, Tinofirei
N1 - Publisher Copyright: Copyright © 2023 [Tsholofelo Mokheleli, Tebogo Bokaba, Tinofirei Museba].
PY - 2023
Y1 - 2023
N2 - The application of machine learning (ML) in predicting mental healthcare faces a challenge due to imbalanced datasets. ML techniques analyse extensive datasets to make predictions; however, the unequal distribution of samples, with the majority belonging to diagnosed mental disorders, can lead to biased model training and limited generalisation. To mitigate the issue of class imbalance in mental health datasets, this study employed diverse ML techniques, namely, resampling, ensemble, and algorithm-specific approaches and metrics such as accuracy, precision, recall and F1 score. The dataset used was collected from the Open Sourcing Mental Illness website, spanning 2016 to 2021. The findings indicate that ensemble techniques, particularly Random Forest, excelled in managing class imbalance compared to other methods. Beyond conventional performance metrics, the study introduced Kappa, balanced accuracy, and geometric mean to evaluate model effectiveness. These findings provide valuable insights for improving mental health predictions, enabling early diagnosis and personalised treatment strategies.
KW - Class imbalance
KW - Cross-validation
KW - Machine learning
KW - Mental health prediction
KW - Resampling methods
UR - http://www.scopus.com/inward/record.url?scp=85192538194&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85192538194
T3 - International Conference on Information Systems, ICIS 2023: "Rising like a Phoenix: Emerging from the Pandemic and Reshaping Human Endeavors with Digital Technologies"
BT - International Conference on Information Systems, ICIS 2023
PB - Association for Information Systems
T2 - 44th International Conference on Information Systems: Rising like a Phoenix: Emerging from the Pandemic and Reshaping Human Endeavors with Digital Technologies, ICIS 2023
Y2 - 10 December 2023 through 13 December 2023
ER -