Abstract

Social media data and machine learning techniques together offer researchers an unprecedented opportunity for building theories. Machine learning using social media data can be considered a specific mixed method that combines qualitative methods and unstructured data with quantitative techniques. The extant approaches to social computing and machine learning in the Information Systems literature, however, are criticized for producing predictions without causal inferences and for focusing largely on social media as a context of interest; their utility in building generalizable theories has therefore yet to be recognized. In this study, we attempt to address these limitations by combining two text analysis techniques applied to social media data in the context of a natural experiment. First, we propose a novel framework that combines unsupervised (topic modeling) and supervised (sentiment analysis) machine learning, applied abductively to longitudinal Twitter data. In turn, the approach facilitates decontextualizing social media text so that it can be used to theorize at a higher level of abstraction. Second, we exploit a natural experiment to integrate machine learning techniques with causal inference. Together, by integrating topic modeling and sentiment analysis and leveraging the empirical setting of a natural experiment, this study demonstrates a novel framework for theory building using social media data.

Social Media Data, Machine Learning and Causal Inference
