Abstract
Topic modeling has become increasingly prominent in business research as a tool for the empirical analysis of text data. Studies typically follow a two-step procedure: a topic model first identifies latent themes in the text, and these themes are then combined with observed variables as explanatory factors in a statistical model. A key limitation of this procedure is that topics are extracted independently of the response and observed variables, which can weaken their usefulness for downstream analysis. To address this limitation, we introduce a unified topic model grounded in the latent Dirichlet allocation (LDA) framework. The model incorporates documents, observed variables, and the response variable simultaneously within a regression analysis. Empirical validation with real-world data confirms the model’s advantages.
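For readers unfamiliar with the conventional two-step procedure that the abstract contrasts against, the following is a minimal sketch of that baseline. It is not the unified model proposed in the paper. It assumes a small collapsed Gibbs sampler for LDA (the abstract does not say which inference method is used) and ordinary least squares for the second step. All function names, hyperparameters, and the toy data are hypothetical and serve only as illustration.

```python
import numpy as np


def fit_lda_gibbs(docs, n_topics, vocab_size, n_iter=200,
                  alpha=0.1, beta=0.01, seed=0):
    """Step 1: collapsed Gibbs sampling for LDA on documents given as lists
    of word ids. Returns the document-topic proportions (rows sum to 1)."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # topic counts per document
    n_kw = np.zeros((n_topics, vocab_size))  # word counts per topic
    n_k = np.zeros(n_topics)                 # total tokens per topic
    z = []                                   # topic assignment of every token

    # Random initialisation of the topic assignments.
    for d, doc in enumerate(docs):
        z_d = rng.integers(n_topics, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1
                # Resample its topic from the full conditional distribution.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1

    theta = n_dk + alpha
    return theta / theta.sum(axis=1, keepdims=True)


def two_step_regression(docs, x_obs, y, n_topics, vocab_size):
    """Step 2: regress the response on observed variables plus topic
    proportions. The topics were fitted without looking at x_obs or y."""
    theta = fit_lda_gibbs(docs, n_topics, vocab_size)
    # Topic proportions sum to 1, so the last topic is dropped to avoid
    # perfect collinearity with the intercept column.
    design = np.column_stack([np.ones(len(docs)), x_obs, theta[:, :-1]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return theta, coef


# Toy data (invented for illustration): six documents over a six-word vocabulary.
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [0, 0, 1, 2, 1],
        [3, 4, 5, 3, 4], [4, 5, 3, 5], [3, 3, 4, 5, 4]]
x_obs = np.array([1.0, 2.0, 1.5, 0.5, 1.0, 2.5])  # one observed covariate
y = np.array([3.1, 4.0, 3.4, 1.2, 1.9, 3.0])      # response variable

theta, coef = two_step_regression(docs, x_obs, y, n_topics=2, vocab_size=6)
```

In this sketch, `fit_lda_gibbs` never sees `x_obs` or `y`, which is exactly the independence between topic extraction and the regression that the abstract identifies as the limitation of the two-step framework.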
Recommended Citation
Liu, Xiaoping and Li, Xiaobai, "Topic Modeling for Empirical Analysis" (2025). NEAIS 2025 Proceedings. 8.
https://aisel.aisnet.org/neais2025/8
Abstract Only