Xu Jingnan, Wang Leixia, Meng Xiaofeng. Research on Privacy Auditing in Data Governance[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202540530

Research on Privacy Auditing in Data Governance

Funds: This work was supported by the National Natural Science Foundation of China (62172423).
  • Author Bio:

    Xu Jingnan: born in 1997. PhD candidate. Her main research interests include differential privacy and privacy auditing

    Wang Leixia: born in 1994. PhD candidate. Her main research interests include secure data collection, differential privacy, and its applications

    Meng Xiaofeng: born in 1964. Professor and PhD supervisor. Fellow of CCF. His main research interests include cloud data management, web data management, and privacy preservation (xfmeng@ruc.edu.cn)

  • Received Date: June 16, 2024
  • Revised Date: February 10, 2025
  • Accepted Date: March 02, 2025
  • Available Online: March 02, 2025
  • Privacy auditing is a crucial issue in data governance, aiming to detect whether data privacy has been protected effectively. Typically, personal data are protected by perturbing them or adding noise so that the released results satisfy differential privacy guarantees. In machine learning scenarios especially, a growing number of differential privacy algorithms have emerged, each claiming a relatively stringent level of privacy protection. Although rigorous mathematical privacy proofs are given before an algorithm's release, its actual privacy in practice is hard to assure: because the theory of differential privacy is complex, the correctness of these proofs may not be thoroughly examined, and imperceptible errors may creep in during implementation. Either can weaken the protection below the claimed level, leaking additional privacy. To tackle this issue, privacy auditing for differential privacy algorithms has emerged. It measures the degree of privacy protection a differential privacy algorithm actually provides, facilitating the discovery of mistakes and the improvement of existing algorithms. This paper surveys the scenarios and methods of privacy auditing, summarizes the methods from three aspects: data construction, data measurement, and result quantification, and evaluates them through experiments. Finally, this work presents the challenges of privacy auditing and its future directions.
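The auditing loop the abstract outlines (construct adjacent inputs, measure the mechanism's outputs, quantify the result as an empirical privacy bound) can be sketched for a correctly implemented Laplace mechanism. This is a minimal illustration, not any specific method from the survey: all function names are hypothetical, and the single threshold test stands in for the optimized output sets used in the surveyed auditing methods.

```python
import math
import random

def sample_laplace(mu, b):
    """Draw one Laplace(mu, b) sample via inverse-CDF sampling."""
    u = random.random() - 0.5
    return mu - b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_count(true_count, epsilon):
    """A counting query (sensitivity 1) released under epsilon-DP."""
    return sample_laplace(true_count, 1.0 / epsilon)

def audit_epsilon(mechanism, d, d_prime, threshold, trials=200_000):
    """Empirical lower bound on epsilon: distinguish the mechanism's
    outputs on two adjacent inputs using the output set
    S = {y : y >= threshold} and return ln(Pr[M(d) in S] / Pr[M(d') in S])."""
    hits_d = sum(mechanism(d) >= threshold for _ in range(trials))
    hits_dp = sum(mechanism(d_prime) >= threshold for _ in range(trials))
    p = max(hits_d, 1) / trials   # clamp to avoid log(0) in rare cases
    q = max(hits_dp, 1) / trials
    return math.log(p / q)

random.seed(0)
claimed_epsilon = 1.0
# Adjacent datasets: the true counts differ by exactly one record.
est = audit_epsilon(lambda c: noisy_count(c, claimed_epsilon),
                    d=10, d_prime=9, threshold=9.5)
print(f"claimed epsilon = {claimed_epsilon:.2f}, audited lower bound ~ {est:.2f}")
```

For a correct mechanism the audited bound should stay below the claimed ε (up to sampling error); an estimate exceeding the claim would flag a privacy violation of the kind privacy auditing is designed to catch.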
