ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (10): 2066-2085. doi: 10.7544/issn1000-1239.2020.20200426

Special topic: 2020 Research Topic on Cryptography and Data Privacy Protection

• Information Security •



  • Published: 2020-10-01

Security Issues and Privacy Preserving in Machine Learning

Wei Lifei, Chen Congcong, Zhang Lei, Li Mengsi, Chen Yujiao, Wang Qin   

  1. (College of Information Technology, Shanghai Ocean University, Shanghai 201306)
  • Online: 2020-10-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61972241, 61802248, 61672339), the Natural Science Foundation of Shanghai (18ZR1417300), and the Luo Zhaorao Science and Technology Innovation Fund of Shanghai Ocean University (A1-2004-20-201312).

Abstract: Machine learning has developed rapidly in recent years and is now widely used in work and daily life, bringing convenience but also serious security risks. Security and privacy issues have become a stumbling block to its development. Both training and inference rely on large amounts of data, which may contain sensitive or private information. As data breaches grow more frequent and larger in scale year after year, ensuring the security and privacy of data has attracted wide attention from academia and industry. This paper first introduces the concept of the adversary model in privacy-preserving machine learning. It then summarizes common security and privacy threats in the training and inference phases, such as privacy leakage of training data, poisoning attacks, adversarial attacks, and privacy attacks. Next, it presents common security defenses and privacy-preserving methods, focusing on homomorphic encryption, secure multi-party computation, and differential privacy, and compares typical schemes and the scenarios to which each of the three techniques applies. Finally, it discusses future trends and research directions for privacy-preserving machine learning.

Key words: machine learning, privacy preserving, security threat, secure multi-party computation, homomorphic encryption, differential privacy