Communications of the Association for Information Systems

A Guide to Text Analysis with Latent Semantic Analysis in R with Annotated Code: Studying Online Reviews and the Stack Exchange Community

David Gefen, Drexel University
James E. Endicott, University of Colorado Boulder
Jorge E. Fresneda, Drexel University
Jacob Miller, Drexel University
Kai R. Larsen, University of Colorado Boulder

Abstract

In this guide, we introduce researchers in the behavioral sciences in general, and in MIS in particular, to text analysis with latent semantic analysis (LSA). The guide contains hands-on annotated code samples in R that walk the reader through a typical process: acquiring relevant texts, creating a semantic space out of them, and then projecting words, phrases, or documents onto that semantic space to calculate their lexical similarities. R is a popular open-source programming language with extensive statistical libraries. We introduce LSA as a concept, discuss the process of preparing the data, and note its potential and limitations. We demonstrate this process through a sequence of annotated code examples. We start with a study of online reviews that extracts lexical insight about trust; that R code applies singular value decomposition (SVD). The guide next demonstrates a realistically large data analysis of Stack Exchange, a popular Q&A site for programmers; that R code applies an alternative sparse SVD method. All the code and data are available on github.com.
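To make the process summarized above concrete, the following is a minimal sketch in base R, not the guide's own code. It builds a term-by-document matrix, creates a two-dimensional semantic space with `svd()`, projects a new phrase onto that space, and compares it with each document by cosine similarity. The toy corpus, the tiny stop-word list, the choice of `k = 2`, and the helper names `count_terms`, `project`, and `cosine` are all assumptions made for illustration.

```r
# Toy corpus: hypothetical sentences, not the guide's data.
docs <- c(
  d1 = "trust the seller and trust the review",
  d2 = "the seller shipped the product fast",
  d3 = "reliable seller honest review",
  d4 = "the code throws an error in the loop",
  d5 = "fix the loop error in the code"
)
stop_words <- c("the", "and", "in", "an")  # tiny illustrative stop list

# Prepare the data: lowercase, split on whitespace, drop stop words.
tokens <- lapply(strsplit(tolower(docs), "\\s+"),
                 function(tk) tk[!tk %in% stop_words])
vocab <- sort(unique(unlist(tokens)))

# Count how often each vocabulary term occurs in a token vector.
# Tokens outside the vocabulary are silently ignored.
count_terms <- function(tk) as.numeric(table(factor(tk, levels = vocab)))

# Term-by-document matrix: rows are terms, columns are documents.
tdm <- vapply(tokens, count_terms, numeric(length(vocab)))
dimnames(tdm) <- list(vocab, names(docs))

# Create the semantic space: keep the k largest singular dimensions.
k <- 2  # assumed for this toy example
s <- svd(tdm)
U_k <- s$u[, 1:k, drop = FALSE]
doc_coords <- s$v[, 1:k, drop = FALSE] %*% diag(s$d[1:k], nrow = k)
rownames(doc_coords) <- colnames(tdm)

# Project new text onto the space. For a term-count vector q, t(q) %*% U_k
# lands in the same coordinates as the rows of doc_coords.
project <- function(text) {
  tk <- strsplit(tolower(text), "\\s+")[[1]]
  drop(count_terms(tk) %*% U_k)
}

cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

q <- project("honest seller trust")
sims <- apply(doc_coords, 1, cosine, b = q)
print(round(sims, 3))
```

Because the two topics in this toy corpus share no terms after stop-word removal, each retained dimension captures one topic, so the query should score close to 1 against the three review documents and close to 0 against the two programming documents. Real corpora are not this clean, and a dense `svd()` call becomes impractical for large matrices, which is why the guide turns to a sparse SVD method for the Stack Exchange data.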

DOI

10.17705/1CAIS.04121

Recommended Citation

Gefen, D., Endicott, J. E., Fresneda, J. E., Miller, J., & Larsen, K. R. (2017). A Guide to Text Analysis with Latent Semantic Analysis in R with Annotated Code: Studying Online Reviews and the Stack Exchange Community. Communications of the Association for Information Systems, 41, pp-pp. https://doi.org/10.17705/1CAIS.04121
