A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method

Ileberi Emmanuel, Yanxia Sun, Zenghui Wang

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Credit risk prediction is a crucial task for financial institutions. Technological advancements in machine learning, coupled with the availability of data and computing power, have given rise to more credit risk prediction models in financial institutions. In this paper, we propose a stacked classifier approach coupled with a filter-based feature selection (FS) technique to achieve efficient credit risk prediction using multiple datasets. The proposed stacked model includes the following base estimators: Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB). Furthermore, the estimators in the stacked architecture were linked sequentially to extract the best performance. The filter-based FS method used in this research is based on information gain (IG) theory. The proposed algorithm was evaluated using accuracy, F1-score, and the Area Under the Curve (AUC). Furthermore, the stacked algorithm was compared to the following methods: Artificial Neural Network (ANN), Decision Tree (DT), and k-Nearest Neighbour (KNN). The experimental results show that the stacked model obtained AUCs of 0.934, 0.944, and 0.870 on the Australian, German, and Taiwan datasets, respectively. These results, in conjunction with the accuracy and F1-score metrics, demonstrate that the proposed stacked classifier outperforms the individual estimators and other existing methods.

Original language: English
Article number: 23
Journal: Journal of Big Data
Volume: 11
Issue number: 1
Publication status: Published - Dec 2024

Keywords

  • Credit risk
  • Feature selection
  • Machine learning

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture
  • Computer Networks and Communications
  • Information Systems and Management
