Abstract

With the community of online reviewers growing rapidly, it is increasingly difficult for users to digest all the available information within a limited time. This need raises an interesting problem that has not yet been well studied: how to discover high-quality product reviews. We believe a good solution would provide at least two kinds of benefit. 1) Ranking reviews by quality. This could improve the user experience by letting users learn more from a few detailed, high-quality reviews instead of wading through irrelevant content and spam. 2) Automatically summarizing user opinions. Researchers have studied this problem for years, aiming to help users grasp the main information about a product more efficiently. In this respect, low-quality content degrades the accuracy of any algorithm applied to the task. For quality prediction, previous research has thoroughly examined various content-based properties of product reviews. Although some promising results have been obtained, we believe there is still room for improvement. We explore review quality from two aspects. 1) Filtering out noisy data. Here we leverage classification techniques to differentiate real product reviews from other types of reviews and spam. Indeed, many articles labeled “product reviews” actually fall into three groups: product reviews, feedback for retailers, and commercial spam. The empirical results show that this approach could be put into practice given sufficient training data. 2) Assessing review quality with an additional information source: the behavior of the review author in an e-commerce community. Our requirement is that, after the noise-filtering step, all product reviews be ranked according to their quality. Common methods for this type of task are usually based solely on analysis of the review text. 
By contrast, we performed a high-level analysis of two kinds of data: product reviews and deal transactions. An interesting finding is that review quality is related not only to review content but can also be inferred from the behavior of the review author. Therefore, to examine review quality from the perspectives of human credibility and expertise, we consider three features: the author’s personal reputation, the “seller degree,” which reflects whether the author is also a seller, and the “expertise degree.” Our experiments show that adding these features improves the performance of review quality ranking. Furthermore, based on these observations, we propose an evolving model. The model is able to generate the basic characteristics of the review community, especially when the three features above are taken into consideration. In addition, the model could help us make more reasonable predictions about the evolution of the review community.
