Abstract

Accuracy reflects the extent to which data are correct. It is often evaluated by comparing recorded values to a baseline perceived as correct. Even when data values are accurate at the time of recording, their accuracy may degrade over time: certain properties of real-world entities change, while the data values that reflect them are not updated. This study uses a Markov-chain model to develop an analytical framework that describes accuracy degradation over time by assessing the likelihood that certain data attributes transition between states within a given time period. Evaluation of the framework with real-world data shows its potential contribution to key data-quality management tasks, such as predicting accuracy degradation and developing data auditing and maintenance policies.
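
To make the idea concrete, the following is a minimal sketch of how a Markov chain can describe accuracy degradation. It is not the study's actual framework: the attribute (a three-state marital status), the one-year transition probabilities in `P`, the initial distribution `initial_dist`, and the function names are all hypothetical and chosen only for illustration. The sketch assumes a record is accurate at time t if the real-world entity is currently in the same state that was recorded at time 0, so the expected accuracy is the sum over states i of the initial share of state i times the i-th diagonal entry of the t-step transition matrix.

```python
# Illustrative sketch only: states, probabilities, and names are hypothetical.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, t):
    """Return the t-step transition matrix p^t (identity when t == 0)."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = mat_mul(result, p)
    return result


def expected_accuracy(p, initial_dist, t):
    """Expected share of records still accurate after t periods.

    A record stored in state i is counted as accurate at time t if the
    entity is in state i at time t (including the case where it left
    state i and later returned to it).
    """
    pt = mat_pow(p, t)
    return sum(initial_dist[i] * pt[i][i] for i in range(len(p)))


# Hypothetical yearly transition probabilities between three states:
# 0 = single, 1 = married, 2 = divorced. Each row sums to 1.
P = [
    [0.90, 0.10, 0.00],
    [0.00, 0.95, 0.05],
    [0.00, 0.08, 0.92],
]

# Hypothetical share of records in each state at recording time.
initial_dist = [0.5, 0.4, 0.1]

if __name__ == "__main__":
    for years in (0, 1, 5, 10):
        acc = expected_accuracy(P, initial_dist, years)
        print(f"after {years:2d} years: expected accuracy = {acc:.3f}")
```

Under these assumptions, a maintenance policy could be sketched by choosing the largest audit interval t for which `expected_accuracy(P, initial_dist, t)` stays above a target threshold.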

Using a Markov-Chain Model for Assessing Accuracy Degradation and Developing Data Maintenance Policies
