Abstract

This study examines privacy disclosure risks that arise when multiple records in a dataset are associated with the same individual. Existing data privacy approaches typically assume that each individual in a dataset corresponds to a single record, and therefore tend to underestimate disclosure risks in the multiple-record setting. We propose a novel privacy approach that uses a measure called g-balance to assess identity disclosure risk and another measure called h-affiliation to assess sensitive value disclosure risk in the multiple-record scenario. We develop an efficient algorithm based on the proposed measures to protect against privacy disclosure due to multiple-record linkage. An experimental study was conducted using real-world healthcare data with multiple records per person. The results demonstrate that the proposed approach is more effective than traditional techniques in protecting privacy and preserving data quality.
