Communications of the Association for Information Systems

Text Mining For Information Systems Researchers: An Annotated Topic Modeling Tutorial

Stefan Debortoli, University of Liechtenstein
Oliver Müller, IT University of Copenhagen
Iris Junglas, Florida State University
Jan vom Brocke, University of Liechtenstein

Abstract

Analysts have estimated that more than 80 percent of today’s data is stored in unstructured form (e.g., text, audio, image, video), much of it expressed in rich and ambiguous natural language. Traditionally, researchers have analyzed natural language with qualitative data-analysis approaches such as manual coding. Yet the size of text data sets obtained from the Internet makes manual analysis virtually impossible. In this tutorial, we discuss the challenges encountered when applying automated text-mining techniques in information systems research. In particular, we showcase how to combine probabilistic topic modeling via latent Dirichlet allocation, an unsupervised text-mining technique, with a LASSO multinomial logistic regression to explain user satisfaction with an IT artifact by automatically analyzing more than 12,000 online customer reviews. For fellow information systems researchers, this tutorial provides guidance for conducting text-mining studies on their own and for evaluating the quality of others’ studies.

DOI

10.17705/1CAIS.03907

Recommended Citation

Debortoli, S., Müller, O., Junglas, I., & vom Brocke, J. (2016). Text Mining For Information Systems Researchers: An Annotated Topic Modeling Tutorial. Communications of the Association for Information Systems, 39, pp-pp. https://doi.org/10.17705/1CAIS.03907

