Paper Number

1091

Paper Type

Complete Research Paper

Abstract

Numerical performance metrics of machine learning algorithms may not always correspond to human assessment. Most previous research has focused either on statistical assessments of established metrics or on the behavioral effects of varying algorithmic performance based on these metrics. This paper experimentally explores the impact of a single "bad" prediction (negative outlier) on advice-taking by participants with varying statistical literacy, while holding the average algorithmic performance constant. Surprisingly, we observe a positive impact of a negative outlier in prediction performance on advice-taking. Statistical literacy has a U-shaped relation to advice-taking: users with low and high literacy take more advice, especially for algorithms with a negative prediction outlier. Our findings therefore suggest optimizing machine learning not only on averaged metrics but also with regard to performance distributions and users' awareness of such statistical properties. These insights inform developers, machine learning evaluations, and future research, emphasizing the need to understand behavioral consequences alongside numerical metrics.
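
The idea of holding average performance constant while introducing a single negative outlier can be illustrated with a minimal sketch. The error values below are hypothetical and chosen only for illustration; they are not the study's data, and the helper names `mae` and `max_error` are assumptions made for this example.

```python
# Hypothetical absolute prediction errors for two algorithms.
# Illustrative values only; not taken from the paper.
errors_uniform = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]     # consistently mediocre
errors_outlier = [3, 3, 3, 3, 3, 3, 3, 3, 3, 23]    # mostly better, one "bad" prediction


def mae(abs_errors):
    """Mean absolute error, given a list of absolute errors."""
    return sum(abs_errors) / len(abs_errors)


def max_error(abs_errors):
    """Largest single absolute error, a simple indicator of a negative outlier."""
    return max(abs_errors)


mae_uniform = mae(errors_uniform)
mae_outlier = mae(errors_outlier)

# Both algorithms share the same MAE, yet their error distributions differ.
print(mae_uniform, mae_outlier)
print(max_error(errors_uniform), max_error(errors_outlier))
```

An averaging metric such as MAE cannot distinguish these two algorithms, which is why a distribution-aware summary (here, simply the maximum error) is needed to expose the outlier.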

Jun 14th, 12:00 AM

Algorithmic Advice-Taking Beyond MAE: The Role of Negative Prediction Outliers and Statistical Literacy in Algorithmic Advice-Taking
