ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2022, Vol. 59 ›› Issue (1): 144-171. doi: 10.7544/issn1000-1239.20201042

• Privacy Protection •




ESA: A Novel Privacy Preserving Framework

Wang Leixia, Meng Xiaofeng

  1. (School of Information, Renmin University of China, Beijing 100872)
  • Online: 2022-01-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61941121, 91846204, 62172423).


Abstract: With the rapid development of intelligent technologies driven by big data, large-scale data collection has become a main battleground for data governance and privacy protection. Local differential privacy, the mainstream technique in this setting, is widely used by companies such as Google, Apple, and Microsoft. However, because it perturbs data locally on each user's side, it introduces a large amount of noise, and the resulting data utility is poor. To balance utility and privacy, the ESA (encode-shuffle-analyze) framework was proposed. With the help of a shuffler, it perturbs the data as little as possible while still protecting user privacy, so that no user's private information can be uniquely identified by the data analyst from the collected data. Because differential privacy offers a mathematically elegant and rigorous definition of privacy, the framework is currently implemented mainly on top of differential privacy; this implementation is called shuffle differential privacy (SDP). For the same privacy loss ε, the utility of shuffle differential privacy is O(n^{1/2}) times higher than that of local differential privacy, approaching that of central differential privacy without relying on a trusted third party. This paper surveys this novel privacy-preserving framework. It first analyzes the framework; then, based on mainstream shuffle differential privacy techniques, it summarizes the theoretical and technical foundations and compares the privacy-preserving mechanisms for different statistical problems both theoretically and experimentally. Finally, it presents the open challenges of the ESA framework and discusses prospects for implementing non-differential-privacy methods within it.

Key words: privacy preserving, ESA framework, local differential privacy, central differential privacy, shuffle differential privacy, data collection
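
The encode-shuffle-analyze pipeline named in the abstract can be illustrated with a minimal sketch. It assumes binary user data and uses binary randomized response as the local encoder; the function names `encode`, `shuffle`, and `analyze`, the parameter `epsilon0` (the local privacy budget), and the example data are all hypothetical and are not taken from the paper. The sketch shows only the data flow. It does not compute the privacy guarantee obtained after shuffling.

```python
import math
import random


def encode(bit: int, epsilon0: float, rng: random.Random) -> int:
    """Encoder (assumed: binary randomized response).

    Keeps the true bit with probability e^eps0 / (e^eps0 + 1), else flips it.
    """
    p_keep = math.exp(epsilon0) / (math.exp(epsilon0) + 1.0)
    return bit if rng.random() < p_keep else 1 - bit


def shuffle(reports: list[int], rng: random.Random) -> list[int]:
    """Shuffler: return the reports in a uniformly random order,
    so the analyzer cannot link a report to its sender by position."""
    shuffled = list(reports)
    rng.shuffle(shuffled)
    return shuffled


def analyze(reports: list[int], epsilon0: float) -> float:
    """Analyzer: debias the observed mean to estimate the true fraction of 1s."""
    p_keep = math.exp(epsilon0) / (math.exp(epsilon0) + 1.0)
    observed = sum(reports) / len(reports)
    # E[observed] = (1 - p_keep) + f * (2 * p_keep - 1), solved for f.
    return (observed - (1.0 - p_keep)) / (2.0 * p_keep - 1.0)


# Hypothetical data: 10,000 users, 30% of whom hold the value 1.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reports = shuffle([encode(b, 1.0, rng) for b in true_bits], rng)
estimate = analyze(reports, 1.0)
```

Here `estimate` should land near the true fraction 0.3, with an error that shrinks as the number of users grows. In a shuffle-model deployment the local budget `epsilon0` could be chosen larger than the target privacy loss ε, which is where the utility gain over local differential privacy comes from; how much larger depends on the amplification bound used, which this sketch leaves out.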