Corresponding Author

Jiayu Yu

Document Type

Article

Abstract

For a long time, the credit business has been the main business of banks and financial institutions. With the rapid growth of business scale, how to use models to detect fraud risk quickly and automatically is a hot research direction. Logistic regression has become the most widely used risk assessment model in the industry due to its good robustness and strong interpretability, but it relies on differentiated features and feature combinations. XGBoost is a powerful and convenient algorithm for feature transformation. Therefore, in this paper, XGBOOST can be used to effectively perform the advantages of feature combination, and a XGBoost-LR hybrid model is constructed. Firstly, use the data to train a XGBoost model, then give the samples in the training data to the XGBoost model to get the leaves nodes of the sample, and then use the leaves nodes after one-hot encoding as a feature to train an LR model. Using the German credit data set published by UCI to verify this model and compare AUC values with other models. The results show that this hybrid model can effectively improve the accuracy of model prediction and has a good application value.

Share

COinS