Description
Crowdsourcing is an innovative approach that allows companies to engage a diverse network of people over the internet and use their collective creativity, expertise, or workforce to complete tasks previously performed by dedicated employees or contractors. However, reviewing and filtering the large number of solutions, ideas, or feedback submitted by a crowd is a latent challenge. Identifying valuable inputs and separating them from low-quality contributions that companies cannot use is time-consuming and cost-intensive. In this study, we build upon the principles of text mining and machine learning to partially automate this process. Our results show that it is possible to explain and predict the quality of crowdsourced contributions based on a set of textual features. We use these textual features to train and evaluate a classification algorithm capable of automatically filtering textual contributions in crowdsourcing.
A Machine Learning Approach for Classifying Textual Data in Crowdsourcing