Paper Number
ICIS2025-2779
Paper Type
Complete
Abstract
Large language models (LLMs) show impressive linguistic abilities but often lack the domain-specific reasoning needed for reliable decision-making. We introduce Socratic Iterative Reasoning (SIR), a human-guided approach that improves LLM reasoning through structured dialogue and iterative refinement. Using the Beer Game supply chain simulation, we compare SIR-enhanced agents with baseline LLMs, retrieval-augmented generation (RAG), and chain-of-thought (CoT) across 36-week multi-agent experiments. Results show that SIR reduces order variability and total costs, especially for upstream echelons (Distributor, Manufacturer), and improves decision consistency relative to baseline GPT-4o. Unlike RAG, which struggles with knowledge integration, and CoT, which offers limited linear guidance, SIR discovers prompt structures that activate more reliable reasoning. Our study contributes to theory by showing how structured human guidance can systematically enhance AI capabilities and to practice by providing design principles for human–AI collaboration in complex decision environments.
Recommended Citation
Boussioux, Leonard; Chen, Andrew; Fan, Ming; and Jain, Apurva, "Socratic Iterative Reasoning: Enhancing LLM Decision-Making in the Beer Game Supply Chain" (2025). ICIS 2025 Proceedings. 41.
https://aisel.aisnet.org/icis2025/gen_ai/gen_ai/41
Socratic Iterative Reasoning: Enhancing LLM Decision-Making in the Beer Game Supply Chain
12-GenAI