Paper Type

Complete

Paper Number

1390

Description

The integration of Large Language Models (LLMs) into Conversational Agents (CAs) marks a significant advancement in the agents’ ability to understand and respond to user queries in a more human-like manner. Despite the widespread adoption of LLMs in these agents, there is a noticeable lack of research on standardized evaluation methods. Addressing this research gap, our study proposes a comprehensive evaluation framework tailored explicitly to LLM-based conversational agents. In a Design Science Research (DSR) project, we construct an evaluation framework that incorporates four essential components: the agents’ pre-defined objectives, the corresponding tasks, and the selection of appropriate datasets and metrics. The framework outlines how these elements relate to each other and thereby enables a structured, more systematic evaluation process. It can serve as a guiding tool for researchers and developers working with LLM-based conversational agents.
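To make the relationship between the four components concrete, the following is a minimal sketch of how such a framework could be represented in code. It is an illustrative assumption only: the class and field names (EvaluationFramework, EvaluationObjective, EvaluationTask) are hypothetical and are not taken from the paper's artifact; the sketch simply mirrors the objectives-to-tasks-to-datasets/metrics structure described in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical illustration of the four-component structure
# (objectives, tasks, datasets, metrics); not the authors' artifact.

@dataclass
class EvaluationTask:
    name: str
    dataset: str                                  # dataset selected for this task
    metrics: List[str] = field(default_factory=list)

@dataclass
class EvaluationObjective:
    description: str
    tasks: List[EvaluationTask] = field(default_factory=list)

@dataclass
class EvaluationFramework:
    """Relates pre-defined objectives to tasks and their datasets and metrics."""
    agent_name: str
    objectives: List[EvaluationObjective] = field(default_factory=list)

    def plan(self) -> Dict[str, List[str]]:
        # Build a simple evaluation plan: which metric is computed on which dataset,
        # grouped by objective and task.
        return {
            f"{obj.description} / {task.name}": [f"{m} on {task.dataset}" for m in task.metrics]
            for obj in self.objectives
            for task in obj.tasks
        }

if __name__ == "__main__":
    framework = EvaluationFramework(
        agent_name="customer-support-agent",
        objectives=[
            EvaluationObjective(
                description="Answer product questions accurately",
                tasks=[EvaluationTask("open-domain QA",
                                      dataset="internal FAQ set",
                                      metrics=["answer accuracy", "faithfulness"])],
            )
        ],
    )
    for step, checks in framework.plan().items():
        print(step, "->", checks)
```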

Jul 2nd, 12:00 AM

Evaluation Framework for Large Language Model-based Conversational Agents
