Abstract

Traditional approaches to evaluating data quality typically require pre-defined user requirements or data models of the source datasets. However, analysts often face unfamiliar, repurposed data about which they have little or no prior knowledge. Process analysts in particular rely on event log data but encounter challenges in evaluating its quality. The situation is complicated by metadata that is typically inadequate, despite various standardisation initiatives: process analysts are frequently confronted with event logs from a variety of sources and domains, and these logs often lack adequate metadata. In this paper, we aim to gain an empirical understanding of the role of metadata in evaluating the quality of repurposed event logs. Through semi-structured interviews with 41 experienced process mining analysts, we extract, identify, and describe the challenges, current practices and approaches, and preferences for the use of metadata in the context of event logs.
