Paper Type
Complete
Abstract
Understanding how large language models perform relative to humans in socially interactive, deduction-based tasks is vital for advancing AI applications. This study compares the performance of human players and GPT-4o in Guess vs. AI, a custom strategic deduction game. Across 85 completed games, the AI opponent achieved a significantly higher win rate than human players (63.5%, p = 0.009) and required fewer questions to identify the target (humans: 17; AI opponent: 9). These findings highlight GPT-4o’s strengths in systematic reasoning, pattern recognition, and efficient decision-making. While they showcase the potential of large language models in structured deduction scenarios, they also underscore the need for further research into AI adaptability in more socially complex tasks. Future directions include expanding demographic diversity, exploring additional game formats and other large language models, and investigating human-AI collaboration rather than strictly competitive settings.
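For readers who want to sanity-check a headline figure of this kind, the sketch below runs an exact binomial test of the reported win rate against a 50% null. It is illustrative only: the abstract does not say which test produced p = 0.009, and the win count of 54 is an assumption inferred from 63.5% of 85 games. The helper name `binomial_tail_at_least` is hypothetical.

```python
from math import comb


def binomial_tail_at_least(wins: int, games: int, p_null: float = 0.5) -> float:
    """Exact P(X >= wins) for X ~ Binomial(games, p_null)."""
    return sum(
        comb(games, k) * p_null**k * (1 - p_null) ** (games - k)
        for k in range(wins, games + 1)
    )


GAMES = 85    # number of completed games, from the abstract
AI_WINS = 54  # ASSUMPTION: 54/85 rounds to the reported 63.5%

win_rate = AI_WINS / GAMES

# One-sided: probability of at least this many wins if both sides were equally likely to win.
p_one_sided = binomial_tail_at_least(AI_WINS, GAMES)

# Two-sided: doubling the tail is valid here because the null (p = 0.5) is symmetric.
p_two_sided = min(1.0, 2 * p_one_sided)

print(f"win rate    = {win_rate:.1%}")
print(f"p one-sided = {p_one_sided:.4f}")
print(f"p two-sided = {p_two_sided:.4f}")
```

Comparing both p-values with the reported one may hint at whether a one-sided or two-sided test was used, but the full paper remains the authority on the method.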
Paper Number
1316
Recommended Citation
Tinke, Pjotr and von Mentlen, Thomas, "Human vs LLM: a Comparative Performance Analysis in a Social Deduction Game" (2025). AMCIS 2025 Proceedings. 12.
https://aisel.aisnet.org/amcis2025/sig_odis/sig_odis/12
Human vs LLM: a Comparative Performance Analysis in a Social Deduction Game
Comments
SIGODIS