Predicting Financial Risk Using Non-Financial Data: Design and Evaluation of a Predictive Analytics Framework

Chunxiao Li, Shanghai Jiao Tong University
Hongchang Wang, Georgia Institute of Technology
Wei Min, CreditX Technology
Zhengyang Tang, CreditX Technology
Bin Gu, Arizona State University

Abstract

Predicting financial risk is a long-standing challenge in the consumer finance industry. Existing predictive models rely mainly on structured credit data and are not designed to work with unstructured non-financial data. The few studies that do use non-financial data usually focus on a single data source. We follow a design science approach and propose a framework to predict financial risk from multiple non-financial data sources (i.e., within-app browsing behavior, short messages, and customer social networks). Based on the kernel theory of predictive analytics, we detail a design framework that first develops individual predictive models within each data domain and then ensembles them for multifaceted risk profiling. We conduct multiple experiments to evaluate the performance of this framework and find empirical support for it. This paper contributes to the financial risk literature by proposing potential causal connections and to the design science literature by demonstrating how to gain predictive power from various non-financial data sources.
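
The two-stage structure described above (one model per data domain, then an ensemble) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the feature names, the coefficients, the stand-in scoring functions, and the use of a weighted average as the combination rule. The abstract does not state which base learners or which ensembling method the framework uses.

```python
import math
from typing import Callable, Dict, Mapping, Optional

Features = Mapping[str, float]
Scorer = Callable[[Features], float]


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical stand-ins for trained per-domain models. The feature names
# and coefficients are invented; they are not taken from the paper.
def browsing_model(x: Features) -> float:
    return sigmoid(0.8 * x["night_sessions"] - 1.0)


def sms_model(x: Features) -> float:
    return sigmoid(1.2 * x["overdue_reminders"] - 1.5)


def network_model(x: Features) -> float:
    return sigmoid(2.0 * x["risky_contact_share"] - 1.0)


DOMAIN_MODELS: Dict[str, Scorer] = {
    "browsing": browsing_model,
    "sms": sms_model,
    "network": network_model,
}


def ensemble_risk(
    customer: Mapping[str, Optional[Features]],
    weights: Mapping[str, float],
) -> float:
    """Combine per-domain risk scores with a weighted average.

    Domains with no data for this customer are skipped and the remaining
    weights are renormalized (an assumed design choice).
    """
    total = 0.0
    weight_sum = 0.0
    for domain, model in DOMAIN_MODELS.items():
        features = customer.get(domain)
        if features is None:
            continue  # no data from this source for this customer
        w = weights[domain]
        total += w * model(features)
        weight_sum += w
    if weight_sum == 0.0:
        raise ValueError("customer has no usable domain data")
    return total / weight_sum


if __name__ == "__main__":
    customer = {
        "browsing": {"night_sessions": 2.0},
        "sms": {"overdue_reminders": 0.0},
        "network": {"risky_contact_share": 0.5},
    }
    equal_weights = {"browsing": 1.0, "sms": 1.0, "network": 1.0}
    print(f"ensemble risk: {ensemble_risk(customer, equal_weights):.3f}")
```

A weighted average is only one possible combination rule. A learned meta-model trained on the per-domain scores would fit the same two-stage structure.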

 
