TY - GEN
T1 - Incremental Machine Learning-Based Approach for Credit Scoring in the Age of Big Data
AU - Museba, Tinofirei
N1 - Publisher Copyright: © 2024, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2024
Y1 - 2024
N2 - Financial institutions quantify the financial credibility of a loan applicant using a credit score. Sources of credit, such as banks and other financial institutions, play a crucial role in sustaining economies and keeping cash flowing in the market. Financial institutions address the lack of data in credit scoring by extracting customer information from data sources such as social networks, which store data in large quantities. Traditional data mining techniques fail to accurately distinguish between creditworthy and non-creditworthy applicants using big data. The problem of big data has necessitated machine learning algorithms capable of sifting through large volumes of credit data sourced from social networks. Machine learning approaches have recently been widely used to automate, streamline and digitise business processes such as credit scoring. However, designing and deploying effective and robust credit scoring models takes a lot of time, and if customer behaviour changes or customer variables drift over time, the credit scoring model becomes obsolete. As a result, credit scoring should be treated as an ephemeral scenario under big data, as variables tend to drift over time. Incremental and adaptive credit scoring models can help to reduce the time lost re-creating credit models because of drifting variables, big data challenges and changes in customer behaviour. This necessitates the design of robust and effective credit scoring models that learn incrementally, adapt and detect changes. This paper proposes the Incremental Adaptive and Heterogeneous ensemble (IAHE) credit scoring model, which can learn incrementally, adapt to drifting variables, detect changes in customer behaviour and learn from big data in a streaming fashion. Empirical experiments indicate that, compared with nine credit scoring models on four public datasets, IAHE has the strongest ability to recognise default samples and the best generalisation ability, while maintaining strong interpretability of the results. The superior generalisation performance of IAHE is statistically significant, and the model demonstrated excellent robustness and adaptation to drifting variables.
AB - Financial institutions quantify the financial credibility of a loan applicant using a credit score. Sources of credit, such as banks and other financial institutions, play a crucial role in sustaining economies and keeping cash flowing in the market. Financial institutions address the lack of data in credit scoring by extracting customer information from data sources such as social networks, which store data in large quantities. Traditional data mining techniques fail to accurately distinguish between creditworthy and non-creditworthy applicants using big data. The problem of big data has necessitated machine learning algorithms capable of sifting through large volumes of credit data sourced from social networks. Machine learning approaches have recently been widely used to automate, streamline and digitise business processes such as credit scoring. However, designing and deploying effective and robust credit scoring models takes a lot of time, and if customer behaviour changes or customer variables drift over time, the credit scoring model becomes obsolete. As a result, credit scoring should be treated as an ephemeral scenario under big data, as variables tend to drift over time. Incremental and adaptive credit scoring models can help to reduce the time lost re-creating credit models because of drifting variables, big data challenges and changes in customer behaviour. This necessitates the design of robust and effective credit scoring models that learn incrementally, adapt and detect changes. This paper proposes the Incremental Adaptive and Heterogeneous ensemble (IAHE) credit scoring model, which can learn incrementally, adapt to drifting variables, detect changes in customer behaviour and learn from big data in a streaming fashion. Empirical experiments indicate that, compared with nine credit scoring models on four public datasets, IAHE has the strongest ability to recognise default samples and the best generalisation ability, while maintaining strong interpretability of the results. The superior generalisation performance of IAHE is statistically significant, and the model demonstrated excellent robustness and adaptation to drifting variables.
KW - Credit scoring
KW - Ensemble selection
KW - Incremental learning
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85185770680&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-46177-4_29
DO - 10.1007/978-3-031-46177-4_29
M3 - Conference contribution
AN - SCOPUS:85185770680
SN - 9783031461767
T3 - Springer Proceedings in Business and Economics
SP - 547
EP - 565
BT - Towards Digitally Transforming Accounting and Business Processes - Proceedings of the International Conference of Accounting and Business iCAB, Johannesburg 2023
A2 - Moloi, Tankiso
A2 - George, Babu
PB - Springer Nature
T2 - International Conference of Accounting and Business, iCAB 2023
Y2 - 29 June 2023 through 30 June 2023
ER -