Abstract

Credit scoring of loan applicants is an essential measure for reducing the risk of personal credit loans. Because non-performing loans make up only a small percentage of all loans, credit scoring is typically treated as an imbalanced classification problem, which is difficult to address with a single classifier. To handle the imbalanced samples in credit scoring systems, we propose an ensemble learning classification model named AdaBoost-DT. The model uses adaptive boosting (AdaBoost) to cascade multiple decision trees (DT). The weights of the base classifiers are adjusted automatically by strengthening the learning of misclassified samples. To verify the model's effectiveness empirically, we use data from the Kaggle platform. Ten-fold cross-validation is carried out to evaluate and compare the performance of the AdaBoost-DT model, DT, and Random Forest. The empirical results show that the AdaBoost-DT model achieves higher accuracy than DT and Random Forest. The model can help banks and other financial institutions evaluate customers’ credit efficiently.
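
The reweighting idea behind boosting decision trees can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes depth-1 trees (stumps) as the base classifiers, labels coded as +1/-1, and a made-up one-feature toy dataset rather than the Kaggle credit data. All function names (`fit_stump`, `adaboost_fit`, `adaboost_predict`) are hypothetical and chosen for illustration only.

```python
import math


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best depth-1 tree."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                # Rule: predict `sign` when x[j] <= thr, otherwise `-sign`.
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if (sign if row[j] <= thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign


def adaboost_fit(X, y, n_rounds):
    """Discrete AdaBoost over stumps; returns a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        ensemble.append((alpha, stump))
        # Misclassified samples (y * prediction = -1) gain weight,
        # correctly classified samples lose weight; then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Made-up toy data (an assumption for illustration): no single threshold separates it.
X = [[float(i)] for i in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = adaboost_fit(X, y, n_rounds=1)
boosted = adaboost_fit(X, y, n_rounds=3)
single_errors = sum(adaboost_predict(single, r) != t for r, t in zip(X, y))
boosted_errors = sum(adaboost_predict(boosted, r) != t for r, t in zip(X, y))
print(single_errors, boosted_errors)
```

On this toy data a single stump misclassifies three of the ten training points, while three boosting rounds classify all ten correctly. The sketch covers only the reweighting step. It does not reproduce the class imbalance, the ten-fold cross-validation, or the comparison with Random Forest described above.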
