Stochastic protection of confidential information in databases: A hybrid of data perturbation and query restriction
Publication type:
Article
Authors:
Nunez, Manuel A.; Garfinkel, Robert S.; Gopal, Ram D.
Affiliation:
University of Connecticut
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.1070.0407
Publication date:
2007
Pages:
890-908
Keywords:
Abstract:
Data perturbation and query restriction are two methods developed to protect confidential data in statistical databases. In the former, the data is systematically changed to yield answers to queries that are statistically similar to those that would have resulted from the original data. The latter provides exact answers to queries as long as the risk of exact disclosure of confidential data does not become too great. We present a new methodology to combine these techniques so that the advantages of both are captured. The model is appropriate and computationally viable for large databases whether the queries are linear or nonlinear. The query restriction phase consists of finding an optimal subset of queries to answer exactly without compromising the database. This is an NP-hard problem with a matroid intersection structure that lends itself to an efficient greedy heuristic. Then, given the queries that are answered exactly, we implement a data perturbation phase that provides stochastic protection and consistency. We present computational results on a large database with both linear and nonlinear queries. The results indicate that many queries can be answered exactly and that the proposed perturbation approach provides more accurate answers than the standard perturbation method.