• Premium Chinese sci-tech journal (中国精品科技期刊)
  • CCF-recommended Class A Chinese journal
  • T1-class high-quality sci-tech journal in the computing field

Survey on Privacy-Preserving Machine Learning (机器学习的隐私保护研究综述)

刘俊旭, 孟小峰

刘俊旭, 孟小峰. 机器学习的隐私保护研究综述[J]. 计算机研究与发展, 2020, 57(2): 346-362. DOI: 10.7544/issn1000-1239.2020.20190455
Liu Junxu, Meng Xiaofeng. Survey on Privacy-Preserving Machine Learning[J]. Journal of Computer Research and Development, 2020, 57(2): 346-362. DOI: 10.7544/issn1000-1239.2020.20190455
刘俊旭, 孟小峰. 机器学习的隐私保护研究综述[J]. 计算机研究与发展, 2020, 57(2): 346-362. CSTR: 32373.14.issn1000-1239.2020.20190455
Liu Junxu, Meng Xiaofeng. Survey on Privacy-Preserving Machine Learning[J]. Journal of Computer Research and Development, 2020, 57(2): 346-362. CSTR: 32373.14.issn1000-1239.2020.20190455


  • CLC number: TP391

Survey on Privacy-Preserving Machine Learning

Funds: This work was supported by the National Natural Science Foundation of China (91646203, 61532010, 91846204, 61532016, 61762082) and the National Key Research and Development Program of China (2016YFB1000602, 2016YFB1000603).
  • Abstract: Large-scale data collection has vastly improved the performance of machine learning algorithms and achieved a win-win for both economic and social benefits, but it also exposes personal privacy to greater risks and challenges. Model training falls into two main settings: centralized learning and federated learning. The former collects all parties' data before training; although easy to deploy, it carries serious data privacy and security risks. The latter lets many devices collaboratively train a global model while keeping each party's data local, but it is still at an early research stage and faces many open problems in both technique and deployment. Existing privacy-preserving research follows two main lines: encryption methods, represented by homomorphic encryption and secure multi-party computation, and perturbation methods, represented by differential privacy, each with its own advantages and disadvantages. To survey the privacy issues in machine learning and organize existing privacy-preserving work, this paper first discusses the design of differentially private algorithms under centralized learning, for traditional machine learning and for deep learning respectively; it then outlines the privacy problems and protection methods in federated learning; finally, it summarizes the main challenges in privacy preservation and highlights the connections between privacy protection, model interpretability, and data transparency.
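The perturbation line of work mentioned in the abstract centers on differential privacy. As an illustrative sketch only (not taken from the paper): the classic Laplace mechanism adds noise scaled to a query's sensitivity divided by the privacy budget ε. All function names and parameters below are our own, chosen for illustration.

```python
import math
import random


def laplace_noise(scale: float) -> float:
    """Draw one sample from Laplace(0, scale) via inverse-CDF sampling."""
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


def dp_count(values, predicate, epsilon: float) -> float:
    """ε-differentially private count query.

    A counting query has L1 sensitivity 1 (adding or removing one
    record changes the count by at most 1), so Laplace noise with
    scale 1/ε suffices for ε-DP.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)


# Example: privately count records above a threshold.
data = [12, 7, 3, 25, 18, 9, 30]
noisy = dp_count(data, lambda v: v > 10, epsilon=0.5)
```

A smaller ε means a larger noise scale and stronger privacy at the cost of accuracy; differentially private model training (e.g., gradient perturbation) builds on the same sensitivity-calibrated idea.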
  • Cited by journal articles (3)

    1. 白婷, 刘轩宁, 吴斌, 张梓滨, 徐志远, 林康熠. A click-through rate prediction model based on multi-granularity feature cross pruning. 计算机研究与发展 (Journal of Computer Research and Development). 2024(05): 1290-1298.
    2. 李莎莎, 崔铁军. Research on correction methods for the occurrence probability of fault events during system fault evolution. 安全与环境学报 (Journal of Safety and Environment). 2024(06): 2068-2074.
    3. 苗忠琦, 童向荣. A doubly robust debiased learning model that reduces both bias and variance. 小型微型计算机系统 (Journal of Chinese Computer Systems). 2024(11): 2663-2672.

    Other citation types (1)

Metrics
  • Article views: 6276
  • HTML full-text views: 26
  • PDF downloads: 5948
  • Citations: 4
Publication history
  • Published online: 2020-01-31
