Human Computer Interaction, Artificial Intelligence and Intelligent Augmentation
Paper Type
short
Paper Number
2401
Description
Machine learning and artificial intelligence techniques are increasingly employed in business research to discover or extract new, simple features from large, unstructured data. These machine-learned features (MLFs) are then used as independent or explanatory variables in the main econometric models for empirical research. Despite this growing trend, there has been little research on how using MLFs affects statistical inference in empirical research. In this paper, we address feature identification and parameter estimation issues related to the use of topics/features extracted by Latent Dirichlet Allocation, a popular machine learning technique for text mining. We propose a novel method to extract features that result in minimum-variance estimation of the regression model parameters. This enables better use of unstructured text data for econometric modeling in empirical research. The effectiveness of the proposed method is validated with an experimental evaluation study on real-world text data.
Recommended Citation
Liu, Xiaoping and Li, Xiaobai, "Designing Topic Models for Better Econometric Modeling" (2020). ICIS 2020 Proceedings. 19.
https://aisel.aisnet.org/icis2020/hci_artintel/hci_artintel/19