Abstract

Social media provides a platform for dissatisfied and frustrated customers to discuss matters of common concern and share experiences about products and services. While listening to and learning from customers has long been recognized as an important marketing charge, identifying customer complaints on social media is a nontrivial task. Customer complaint messages are highly distributed on social media, while non-complaint messages are unspecific and topically diverse. It is costly and time-consuming to manually label a large number of customer complaint messages (positive examples) and non-complaint messages (negative examples) for training classification systems. Nevertheless, it is relatively easy to obtain large volumes of unlabeled content on social media. In this paper, we propose a partially supervised learning approach to automatically extract high-quality positive and negative examples from an unlabeled dataset. The empirical evaluation suggests that the proposed approach generally outperforms the benchmark techniques and exhibits more stable performance.


Environmental Scanning for Customer Complaint Identification in Social Media
