Abstract

Mental health apps hold great promise for helping users self-manage their mental well-being. With more than 10,000 mental health apps available, users depend on app reviews to make informed decisions, yet fake reviews often mislead their choices. This problem is further exacerbated by the proliferation of generative AI (Gen AI) techniques and large language models (LLMs). Detecting AI-generated content (AIGC) is challenging given the limitations of current detection methods, making human judgment the first line of defense. It is therefore imperative to understand how humans process AI-generated and human-generated reviews differently. Our study combines the Stimulus-Organism-Response (SOR) and Elaboration Likelihood Model (ELM) theories into a comprehensive framework for comparing and contrasting the processes by which humans assess the credibility of AI-generated and human-generated reviews of mental health apps. We extracted linguistic cues (e.g., complexity, social engagement, emotion, immediacy, and uncertainty) from human-generated reviews on the Google Play Store and from AI-generated reviews created by participants. We then surveyed users' perceptions of cognitive effort, social presence, persuasive motives, and credibility for both review types. Our preliminary findings show that perceived review credibility (response) is not determined solely by review content (stimulus); it is also shaped by the internal states of the review readers (organism). Humans engage in more cognitive processing when writing reviews than machines do. We also found that readers rely on both central and peripheral routes when processing human-generated reviews, but only on the peripheral route when evaluating AI-generated reviews. These findings offer insight into the mechanisms that shape credibility judgments of both review types and have practical implications for online platforms, reviewers, and consumers.
This study has important implications for consumers of app reviews and app store vendors: it helps identify AI-generated reviews and shows how readers process them differently. It also informs recommendations for reviewers seeking to optimize the credibility of their reviews. The findings can be leveraged to develop next-generation detection tools that integrate semantic, cognitive, and affective features. Finally, review readers who intend to make purchases should not depend exclusively on the review platform to flag AI-generated reviews but should also consult other information sources.
