Management Information Systems Quarterly
Abstract
Despite its many benefits, widespread access to individuals’ personal data raises serious privacy concerns for consumers, companies, and policymakers. This study proposes a novel framework that adapts Shapley value-based feature attribution to the data privacy domain by capturing its two crucial dimensions: disclosure risk and data utility. The framework takes a holistic view of data masking through fair attribution of both dimensions to individual features. Unlike the existing literature, which mostly addresses the risk-utility trade-off at the dataset level, the proposed framework addresses it at the feature level. Furthermore, the framework is agnostic to the data masking method, the statistical or machine learning method, and the metrics used to evaluate data utility and disclosure risk. Experimental results show that the proposed method can effectively reduce disclosure risk while preserving data utility.
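To make the idea of feature-level attribution concrete, the following is a minimal sketch of exact Shapley value computation applied separately to a disclosure-risk function and a data-utility function. It is an illustration under stated assumptions, not the framework proposed in the study: the feature names (`zip`, `age`, `income`), the two toy value functions, and all numeric scores are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values of `features` for a set function `value`.

    `value` takes a frozenset of feature names and returns a number.
    The cost is exponential in the number of features, so this is
    only practical for small feature sets.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical value functions (assumptions, not taken from the paper).
def disclosure_risk(unmasked):
    """Toy risk score for the set of features left unmasked."""
    risk = 0.0
    if "zip" in unmasked:
        risk += 0.3
    if "age" in unmasked:
        risk += 0.1
    # Assumed interaction: zip and age together identify more people.
    if "zip" in unmasked and "age" in unmasked:
        risk += 0.2
    return risk


def data_utility(unmasked):
    """Toy additive utility score for the set of features left unmasked."""
    score = {"zip": 0.1, "age": 0.3, "income": 0.4}
    return sum(score[f] for f in unmasked)


features = ["zip", "age", "income"]
risk_phi = shapley_values(features, disclosure_risk)
utility_phi = shapley_values(features, data_utility)

for f in features:
    print(f, round(risk_phi[f], 3), round(utility_phi[f], 3))
```

In this toy setting the interaction term between `zip` and `age` is split equally between the two features, and the attributions for each function sum to the value of the full feature set minus the value of the empty set (the efficiency property of Shapley values). A feature with high attributed risk and low attributed utility would be a natural candidate for masking, although how the study combines the two attributions is not specified in the abstract.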