ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (10): 2221-2231. doi: 10.7544/issn1000-1239.2020.20200444

Topic collection: 2020 Special Topic on Cryptography and Data Privacy Protection Research

• Information Security •

Privacy-Preserving Statistics Protocol for Set-Based Computation

Song Xiangfu¹, Gai Min², Zhao Shengnan¹, Jiang Han²

  1. School of Computer Science and Technology, Shandong University, Jinan 250101
  2. School of Software, Shandong University, Jinan 250101
  (bintasong@gmail.com)
  • Publication date: 2020-10-01
  • Online: 2020-10-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61632020, 61572294).

Abstract: Mining the valuable information contained in large volumes of data to guide production and social management has become a real need in the era of big data. In practice, however, data is usually distributed among different entities. Traditional data collection and sharing hand the data over unreservedly to a single party for processing, which cannot protect users' privacy, and centralized processing is also an easy target for outside attackers, leading to serious threats such as data breaches. As laws and regulations on data security and privacy are introduced, the requirements on how data is stored, processed, and shared have become stricter. How to use privacy-preserving techniques to exploit data effectively while protecting privacy has therefore become a widely discussed topic. Among such techniques, private set intersection (PSI) is attracting growing attention from academia and industry because of its many application scenarios. Most existing PSI protocols only compute the intersection itself; in many scenarios, however, the participants would rather compute some function of the intersection without revealing it, such as the intersection cardinality, the sum of the weights over the intersection, or even more general functions. To address this problem, this paper designs a set of protocol components based on oblivious transfer. With these components, statistics such as the intersection cardinality, the intersection weight sum, and the variance of the intersection weights can be computed relatively efficiently without leaking the elements of the intersection. Notably, the constructions do not rely on homomorphic encryption or generic circuit-based techniques; oblivious transfer alone suffices. Because oblivious transfer extension greatly reduces the number of public-key operations, the protocols achieve good computational efficiency. In addition, their communication cost is optimized using existing hashing techniques. The protocols are formally proven secure against semi-honest adversaries using view-based simulation, and a complexity analysis and comparison are provided.

Key words: private set intersection, privacy-preserving, statistics, oblivious transfer, secure computation, cardinality of intersection, intersection sum
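
To make the target statistics concrete, the following Python sketch computes, in the clear, the three outputs named in the abstract: intersection cardinality, intersection weight sum, and variance of the intersection weights. It is only a plaintext reference for what the two parties should learn, not the oblivious-transfer protocol itself. The function name `intersection_statistics`, the choice of which party holds the weights, and the use of population variance (computed from the sum and the sum of squares) are assumptions made for illustration.

```python
from fractions import Fraction
from typing import Dict, Hashable, NamedTuple, Optional, Set


class IntersectionStats(NamedTuple):
    cardinality: int
    weight_sum: int
    weight_variance: Optional[Fraction]  # None when the intersection is empty


def intersection_statistics(
    set_a: Set[Hashable], weighted_b: Dict[Hashable, int]
) -> IntersectionStats:
    """Plaintext reference (hypothetical helper, NOT privacy-preserving).

    set_a      -- one party's set of identifiers
    weighted_b -- the other party's identifiers mapped to integer weights
    """
    # A secure protocol would never materialise this set at either party;
    # it appears here only to define the expected outputs.
    common = set_a.intersection(weighted_b)
    n = len(common)
    s1 = sum(weighted_b[x] for x in common)       # sum of weights
    s2 = sum(weighted_b[x] ** 2 for x in common)  # sum of squared weights
    # Assumed definition: population variance, Var = s2/n - (s1/n)^2.
    variance = None if n == 0 else Fraction(s2, n) - Fraction(s1, n) ** 2
    return IntersectionStats(n, s1, variance)


if __name__ == "__main__":
    stats = intersection_statistics(
        {"alice", "bob", "carol"},
        {"bob": 10, "carol": 20, "dave": 40},
    )
    print(stats)
```

Only the three aggregate values are returned; the intersection elements themselves are never part of the output, which mirrors the leakage goal stated in the abstract. Exact rational arithmetic via `Fraction` is used so the variance is not affected by floating-point rounding.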

CLC Number: