BLED 2019 Proceedings

From dirty data to multiple versions of truth: How different choices in data cleaning lead to different learning analytics outcomes

Justian Knobbout, HU University of Applied Sciences Utrecht
Huub Everaert, HU University of Applied Sciences Utrecht, The Netherlands
Esther van der Stappen, HU University of Applied Sciences Utrecht, The Netherlands

Abstract

Learning analytics is the analysis of student data with the purpose of improving learning. However, the process of data cleaning remains underexposed within learning analytics literature. In this paper, we elaborate on choices made in the cleaning process of student data and their consequences. We illustrate this with a case where data was gathered during six courses taught via Moodle. In this data set, only 21% of the logged activities were linked to a specific course. We illustrate possible choices in dealing with missing data by applying the cleaning process twelve times, each time with different choices, on copies of the raw data. Consequently, the analysis of the data shows varying outcomes. As the purpose of learning analytics is to intervene based on analysis and visualizations, it is of utmost importance to be aware of choices made during data cleaning. This paper's main goal is to make stakeholders of (learning) analytics activities aware that choices made during data cleaning have consequences for the outcomes. We believe that there should be transparency towards the users of these outcomes and that they should be given a detailed report of the decisions made.
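The effect described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy log, the function names, and the two cleaning rules (dropping unlinked records versus carrying forward a student's last known course) are hypothetical and are not the twelve variants applied in the paper.

```python
from collections import Counter

# Hypothetical toy log: (student, course or None, activity).
# Illustrative only; this is not the paper's Moodle data.
RAW_LOG = [
    ("s1", "A", "view"),
    ("s1", None, "quiz"),
    ("s1", None, "view"),
    ("s2", "B", "view"),
    ("s2", None, "forum"),
    ("s1", "B", "view"),
    ("s2", None, "quiz"),
]


def clean_drop(log):
    """Choice 1 (assumed): discard every record without a course link."""
    return [row for row in log if row[1] is not None]


def clean_carry_forward(log):
    """Choice 2 (assumed): assign an unlinked record to the student's last known course."""
    last_course = {}
    cleaned = []
    for student, course, activity in log:
        if course is None:
            course = last_course.get(student)  # may still be None
        else:
            last_course[student] = course
        if course is not None:
            cleaned.append((student, course, activity))
    return cleaned


def activities_per_course(log):
    """Count logged activities per course in a cleaned log."""
    return Counter(course for _, course, _ in log)


# Each cleaning choice is applied to its own copy of the raw data.
counts_drop = dict(activities_per_course(clean_drop(list(RAW_LOG))))
counts_carry = dict(activities_per_course(clean_carry_forward(list(RAW_LOG))))

print(counts_drop)   # {'A': 1, 'B': 2}
print(counts_carry)  # {'A': 3, 'B': 4}
```

Both results come from the same raw log, yet the activity count per course differs with the cleaning rule. Reporting which rule was applied is the kind of transparency the abstract argues for.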

Recommended Citation

Knobbout, Justian; Everaert, Huub; and van der Stappen, Esther, "From dirty data to multiple versions of truth: How different choices in data cleaning lead to different learning analytics outcomes" (2019). BLED 2019 Proceedings. 57.
https://aisel.aisnet.org/bled2019/57
