Paper ID

1671

Paper Type

short

Description

Firms have been sharing transactional data with business partners for decades. The benefits of sharing not withstanding, a significant number of firms are still reluctant to share, for fear of sensitive information getting into the wrong hands. Consequently, effective ways to hide sensitive information before sharing data has become an important consideration. As most versions of the underlying problem are NP-hard, and as transactional databases involve millions of transactions, research on this topic has also been ongoing for many years. In this paper, we explore hiding sensitive itemsets while minimizing the number of items removed from the database to achieve it. We present an integer programming formulation, and show that while the general problem is NP-hard, there is underlying structure in the problem that allows it to be decomposed and solved more efficiently. We illustrate various aspects of the problem using examples, and present a solution procedure based on propositions developed in the paper. This is research in progress, and we intend to implement and illustrate the effectiveness of the proposed procedure on large, real and synthetic databases involving millions of transactions.

Share

COinS
 

Hiding Sensitive Itemsets in Shared Transactional Databases: Minimizing the Number of Items Removed

Firms have been sharing transactional data with business partners for decades. The benefits of sharing not withstanding, a significant number of firms are still reluctant to share, for fear of sensitive information getting into the wrong hands. Consequently, effective ways to hide sensitive information before sharing data has become an important consideration. As most versions of the underlying problem are NP-hard, and as transactional databases involve millions of transactions, research on this topic has also been ongoing for many years. In this paper, we explore hiding sensitive itemsets while minimizing the number of items removed from the database to achieve it. We present an integer programming formulation, and show that while the general problem is NP-hard, there is underlying structure in the problem that allows it to be decomposed and solved more efficiently. We illustrate various aspects of the problem using examples, and present a solution procedure based on propositions developed in the paper. This is research in progress, and we intend to implement and illustrate the effectiveness of the proposed procedure on large, real and synthetic databases involving millions of transactions.