Abstract

The rise of data re-purposing has created an unprecedented opportunity to derive new value from existing data assets. Whereas earlier approaches could rely on well-understood data quality requirements, this open-ended setting demands a discovery-oriented approach. Despite a growing market of tools that assist with data exploration and profiling, data quality discovery continues to drain the time of data workers because it is cognitively demanding and context-dependent. Accordingly, we study data quality discovery through purpose-built experimental platforms that mimic typical data exploration platforms. Using eye-tracking and log data, we explore the implicit sensemaking behaviours of data workers when they are required either to write code or to use available functions. Our analysis of visual scanning and attention patterns reveals various sensemaking loops amongst data workers with different performance levels in different settings. Our findings provide insights for researchers and tool designers on how to support different data workers in data quality discovery.
