A comparative analysis of ensemble learning models for predicting lapses in investment policies

  • Lerato Matlala
  • Tebogo Bokaba
  • Patrick Ndayizigamiye
  • Siyabonga Mhlongo
  • Eustace Dogo

Research output: Contribution to journal › Article › peer-review

Abstract

This study explores the application of machine learning (ML) algorithms to predict lapses in investment policies, addressing a significant challenge for insurance and financial services companies. The study compares three ensemble techniques, random forest (RF), gradient boosting (GB), and extreme gradient boosting (XGBoost), to identify the most effective model for predicting policy lapses and to determine the key factors influencing these predictions. The dataset used for this analysis is sourced from an anonymous insurance and financial services company on Kaggle and includes data on 51,685 policies spanning 2017 to 2020. Thorough data pre-processing, including handling missing values, outlier treatment, and feature scaling, is performed before the models are trained and evaluated. The results reveal that features such as tenure, number of missed payments, and total sum assured play a major role in predicting lapses. Random forest is identified as the top-performing model. Furthermore, local interpretable model-agnostic explanations (LIME) are used to improve interpretability, offering detailed insights into feature contributions. These findings suggest that ML models, particularly random forest, are highly effective in predicting lapses in investment policies, offering valuable insights for insurance and financial services companies to manage and reduce policy lapses.
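
The three pre-processing steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say which specific methods the authors used, so everything here is an assumption made for illustration: median imputation for missing values, clipping to the 1.5 × IQR fences for outliers, and z-score standardization for scaling. The function names and the sample column are hypothetical and do not come from the study's dataset.

```python
from statistics import mean, median, pstdev, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values (assumed method)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_iqr(values, k=1.5):
    """Clip values to the fences [Q1 - k*IQR, Q3 + k*IQR] (assumed method)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def standardize(values):
    """Scale to zero mean and unit population standard deviation (assumed method)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def preprocess(values):
    """Apply imputation, outlier clipping, then scaling, in that order."""
    return standardize(clip_iqr(impute_median(values)))


# Hypothetical "number of missed payments" column with one gap and one outlier.
missed_payments = [0, 1, None, 2, 1, 0, 40, 1]
clipped = clip_iqr(impute_median(missed_payments))
scaled = preprocess(missed_payments)
```

Imputing before clipping keeps the missing entry from distorting the quartiles, and clipping before scaling keeps the single extreme value from dominating the mean and standard deviation. A real pipeline would fit these statistics on the training split only and reuse them on the test split.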

Original language: English
Journal: Journal of Management Analytics
Publication status: Accepted/In press - 2025

Keywords

  • LIME
  • bagging
  • boosting
  • ensemble models
  • investment policy lapses
  • machine learning

ASJC Scopus subject areas

  • Statistics and Probability
  • Business, Management and Accounting (miscellaneous)
  • Statistics, Probability and Uncertainty
