ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2016, Vol. 53 ›› Issue (10): 2343-2353. doi: 10.7544/issn1000-1239.2016.20160465

Special topic: 2016 Research Progress in Cyberspace Sharing Security

• Information Security •



  • Publication date: 2016-10-01
  • Funding: 
    This work was supported by the National High Technology Research and Development Program of China (863 Program) (2013AA013204) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030200).

Privacy Preserving Data Publishing via Weighted Bayesian Networks

Wang Liang1,2, Wang Weiping1, Meng Dan1   

  1. 1(Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093); 2(University of Chinese Academy of Sciences, Beijing 100049)
  • Online: 2016-10-01

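Abstract (translated from the Chinese): Privacy protection in data publishing is currently an active research topic in information security, and effectively preventing the leakage of sensitive private information has become an important problem in the field. Differential privacy is a recently developed privacy protection technique whose greatest advantage is that it makes no specific assumptions about the attacker's background knowledge. It provides strong protection for private data publishing and has been widely applied in practice. Existing differential privacy techniques cannot fully and effectively handle the publishing of high-dimensional private data. Although the Bayesian-network-based private data publishing method (PrivBayes) effectively handles the problem of converting a high-dimensional dataset into low-dimensional ones for publishing, it still has certain defects and shortcomings. Based on an analysis and improvement of the Bayesian-network-based method, this paper establishes a weighted Bayesian network private data publishing method (weighted PrivBayes). Theoretical analysis and experimental evaluation show that the method both guarantees the privacy of the published dataset and substantially improves its data accuracy.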


Abstract: Privacy preservation in data publishing is an active research topic in information security. Preventing the disclosure of sensitive information has become a major issue in enabling public access to published datasets that contain personal information. Differential privacy, a recently developed notion of privacy, provides strong protection because it makes no specific assumptions about the attacker's background knowledge, and it has been extensively studied. Existing differential privacy approaches cannot fully and effectively solve the problem of releasing high-dimensional data. Although PrivBayes can transform high-dimensional data into low-dimensional data, it cannot prevent attribute disclosure under certain conditions and has other limitations. To address these problems, this paper proposes an improved data publishing algorithm called weighted PrivBayes. Both theoretical analysis and experimental evaluation indicate that the new algorithm not only guarantees the privacy of the published dataset but also significantly improves data accuracy and practical value compared with PrivBayes.

Key words: data privacy, Bayesian network, privacy preserving, data publishing, differential privacy