Paper Number

ECIS2026-2849

Paper Type

CRP

Abstract

Conversational interfaces and the underlying large language models (LLMs) increasingly shape how users access and process information. Cognitive biases can affect how users formulate prompts, and recent research suggests that such biases may also be reflected in LLM outputs. This raises important questions about the vulnerability of these systems to bias in everyday use. This study evaluates whether cognitive biases embedded in user prompts affect model accuracy and consistency, using 790 binary-choice questions centered on common misconceptions. Our findings show that models differ in answer consistency across multiple runs, with larger models producing more stable output. Models are also susceptible to biased prompts, with semantic cues producing the strongest effects. Authority signals, references to recent information, and user-stated beliefs yield the largest shifts in accuracy, while structural cues such as answer order have smaller and inconsistent effects. These results highlight the need for bias-aware prompt engineering and greater model robustness, especially in contexts where factual accuracy is critical.
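The abstract does not state how answer consistency or accuracy shifts were computed. As a rough illustration only, the Python sketch below assumes that consistency is the share of repeated runs agreeing with the most frequent answer to a question, and that the accuracy shift is the accuracy under a biased prompt minus the accuracy under a neutral prompt. All function names, question IDs, and answers here are hypothetical toy values, not the study's data or method.

```python
from collections import Counter


def consistency(answers):
    """Share of runs that agree with the most frequent answer (assumed definition)."""
    counts = Counter(answers)
    return counts.most_common(1)[0][1] / len(answers)


def accuracy(runs_by_question, gold):
    """Fraction of all individual runs whose answer matches the gold label."""
    correct = 0
    total = 0
    for qid, answers in runs_by_question.items():
        correct += sum(a == gold[qid] for a in answers)
        total += len(answers)
    return correct / total


def accuracy_shift(neutral_runs, biased_runs, gold):
    """Accuracy under the biased prompt minus accuracy under the neutral prompt."""
    return accuracy(biased_runs, gold) - accuracy(neutral_runs, gold)


# Hypothetical toy data: two binary-choice questions, four runs per condition.
gold = {"q1": "A", "q2": "B"}
neutral = {"q1": ["A", "A", "A", "A"], "q2": ["B", "B", "A", "B"]}
biased = {"q1": ["A", "B", "B", "A"], "q2": ["B", "A", "A", "B"]}

if __name__ == "__main__":
    print("consistency, q2, neutral:", consistency(neutral["q2"]))
    print("accuracy shift:", accuracy_shift(neutral, biased, gold))
```

A negative shift under this definition would indicate that the biased prompt variant lowered accuracy relative to the neutral one.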

Jun 14th, 12:00 AM

User Influence and Model Vulnerability: Human-Like Cognitive Bias In Conversational AI Systems

