Start Date
11-8-2016
Description
In this research-in-progress, we explore LSA (Latent Semantic Analysis), a technique used in Text Mining. With the growing volume of unstructured data, especially text, our main objective is to uncover and systematize a process that helps researchers and practitioners analyze and make sense of data in a Big Data environment. To that end, we chose to work with the titles of all articles by researchers formally registered in Business Management PhD programs in Brazil, which resulted in 42,079 article titles from 1,600 different researchers, published between 1979 and 2016. As a result, we generate and describe a stepwise script, based on Text Mining good practices from the literature, that can be used for further investigations of unstructured data, such as sets of comments, news, interviews, and others. In addition, we present some of the main topics and keywords from the investigated database.
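The abstract does not list the individual steps of the script, so the following is only a minimal sketch of a common LSA pipeline, not the authors' own procedure. It assumes the usual sequence of tokenization, stopword removal, TF-IDF weighting of a term-by-document matrix, and a truncated singular value decomposition. The titles, the stopword list, and all function names are hypothetical and chosen only for illustration.

```python
import re
import numpy as np

# Hypothetical toy titles; the study's real corpus is not reproduced here.
TITLES = [
    "Strategic management in family firms",
    "Family firms and strategic planning",
    "Consumer behavior in online retail",
    "Online retail and consumer trust",
]
# Assumed, deliberately tiny stopword list.
STOPWORDS = {"in", "and", "the", "of", "a"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_tfidf(docs):
    """Return the vocabulary and a term-by-document TF-IDF matrix."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({w for t in tokens for w in t})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc_tokens in enumerate(tokens):
        for w in doc_tokens:
            counts[index[w], j] += 1
    doc_freq = (counts > 0).sum(axis=1)          # documents containing each term
    idf = np.log(len(docs) / doc_freq) + 1.0     # one common IDF variant
    return vocab, counts * idf[:, None]


def lsa(matrix, k):
    """Truncated SVD: keep the k strongest latent dimensions."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


vocab, tfidf = build_tfidf(TITLES)
term_topics, strengths, doc_topics = lsa(tfidf, k=2)

# List the terms with the largest absolute loading on each latent dimension.
for t in range(term_topics.shape[1]):
    top = np.argsort(-np.abs(term_topics[:, t]))[:3]
    print(f"topic {t}:", [vocab[i] for i in top])
```

On a corpus of tens of thousands of titles, a sparse matrix and a truncated solver would likely be preferable to the dense `np.linalg.svd` call used here, and the number of dimensions `k` would need to be chosen by inspection or by a criterion the researcher trusts.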
Recommended Citation
Marcolin, Carla and Becker, João, "Exploring Latent Semantic Analysis in a Big Data(base)" (2016). AMCIS 2016 Proceedings. 17.
https://aisel.aisnet.org/amcis2016/Decision/Presentations/17
Exploring Latent Semantic Analysis in a Big Data(base)