Abstract

The growing interest in Machine Learning (ML) based services and the need for more intelligent and automated processes in the finance industry bring new challenges, requiring practitioners and academics to design, develop, and maintain new ML approaches for financial services companies. The main objective of this paper is to provide a standardized procedure for handling cases that suffer from imbalanced datasets. To this end, we propose design recommendations on how to test and combine multiple oversampling techniques, such as SMOTE, SMOTE-ENN, and SMOTE-Tomek, on such datasets with multiple ML models and an attribute-based structure to reach higher accuracy. Moreover, this paper seeks an appropriate structure for maintaining systems that work with periodically changing datasets, so that incoming datasets can be analyzed regularly via this procedure.
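
For readers unfamiliar with the oversampling techniques named above, the following is a minimal sketch of the interpolation step at the core of SMOTE, written with the Python standard library only. It is not the procedure proposed in the paper: the function name `smote_sketch`, its parameters, and the toy data are hypothetical and chosen purely for illustration. In practice one would more likely use an established implementation such as the `SMOTE`, `SMOTEENN`, and `SMOTETomek` classes in the imbalanced-learn library.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between a minority sample
    and one of its k nearest minority-class neighbours (the core SMOTE idea).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up data: 4 minority points to be balanced against 10 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 10
new_points = smote_sketch(minority, majority_count - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority))  # prints 10
```

Because every synthetic point lies on a segment between two existing minority samples, the minority class is enlarged without duplicating rows verbatim. SMOTE-ENN and SMOTE-Tomek add a cleaning step after this oversampling (Edited Nearest Neighbours and Tomek-link removal, respectively), which this sketch omits.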
