Management Information Systems Quarterly
Abstract
Unstructured multimedia data (text and audio) provides unprecedented opportunities to derive actionable decision-making in the financial industry, in areas such as portfolio and risk management. However, due to formidable methodological challenges, the promise of business value from unstructured multimedia data has not materialized. In this study, we use a design science approach to develop DeepVoice, a novel nonverbal predictive analysis system for financial risk prediction, in the setting of quarterly earnings conference calls. DeepVoice forecasts financial risk by leveraging not only what managers say (verbal linguistic cues) but also how managers say it (vocal cues) during the earnings conference calls. The design of DeepVoice addresses several challenges associated with the analysis of nonverbal communication. We also propose a two-stage deep learning model to effectively integrate managers’ sequential vocal and verbal cues. Using a unique dataset of 6,047 earnings call samples (audio recordings and textual transcripts) of S&P 500 firms across four years, we show that DeepVoice yields remarkably lower risk forecast errors than that achieved by previous efforts. The improvement can also translate into nontrivial economic gains in options trading. The theoretical and practical implications of analyzing vocal cues are discussed.