ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2021, Vol. 58 ›› Issue (5): 1106-1117. doi: 10.7544/issn1000-1239.2021.20200903

Special topic: 2021 Special Topic on Artificial Intelligence Security and Privacy Protection Technologies

• Information Security •

An Evasion Algorithm against Fingerprint Detection for Deep Neural Network Models

Qian Yaguan1,2, He Niannian1,2, Guo Yankai1,2, Wang Bin2, Li Hui3, Gu Zhaoquan4, Zhang Xuhong5, Wu Chunming6

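  1(School of Big-data Science, Zhejiang University of Science and Technology, Hangzhou 310023); 2(Edge Intelligence Security Joint Laboratory, Hikvision & Zhejiang University of Science and Technology, Hangzhou 310023); 3(School of Cyber Engineering, Xidian University, Xi’an 710071); 4(Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou 510006); 5(College of Control Science and Engineering, Zhejiang University, Hangzhou 310058); 6(College of Computer Science and Technology, Zhejiang University, Hangzhou 310058) (qianyg@yeah.net)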
  • Published: 2021-05-01
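  • Funding: 
    National Key Research and Development Program of China (2018YFB2100400, 2018YFB1800601); National Natural Science Foundation of China (61902082); Key Research and Development Program of Zhejiang Province (2020C01077, 2021C01036, 2020C01021); Major Scientific Project of Zhejiang Lab (2018FD0ZX01)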

An Evasion Algorithm to Fool Fingerprint Detector for Deep Neural Networks

Qian Yaguan1,2, He Niannian1,2, Guo Yankai1,2, Wang Bin2, Li Hui3, Gu Zhaoquan4, Zhang Xuhong5, Wu Chunming6   

  1(School of Big-data Science, Zhejiang University of Science and Technology, Hangzhou 310023); 2(Edge Intelligence Security Joint Laboratory, Hikvision & Zhejiang University of Science and Technology, Hangzhou 310023); 3(School of Cyber Engineering, Xidian University, Xi’an 710071); 4(Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou 510006); 5(College of Control Science and Engineering, Zhejiang University, Hangzhou 310058); 6(College of Computer Science and Technology, Zhejiang University, Hangzhou 310058)
  • Online: 2021-05-01
  • Supported by: 
    This work was supported by the National Key Research and Development Program of China (2018YFB2100400, 2018YFB1800601), the National Natural Science Foundation of China (61902082), the Key Research and Development Program of Zhejiang Province (2020C01077, 2021C01036, 2020C01021), and the Major Scientific Project of Zhejiang Lab (2018FD0ZX01).

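Abstract: With the successful application of deep neural networks in many fields, protecting the intellectual property of models has become a widely discussed problem. Because training a deep neural network requires large amounts of computing resources, labor, and time, an attacker who steals the parameters of a target model can build a local substitute model at low cost. To protect the intellectual property of model owners, a recently proposed model fingerprint matching method uses fingerprint examples near the model's decision boundary, together with their fingerprints, to check whether a model has been stolen, and it has the advantage of not affecting the model's own performance. Against this class of fingerprint-based protection strategies, we propose an evasion algorithm that successfully bypasses them, revealing the fragility of model fingerprint protection. The core of the evasion algorithm is a fingerprint-example detector, Fingerprint-GAN. Using the principle of generative adversarial networks (GANs), it learns the feature representation of normal examples in a latent space and the distribution of that representation. Based on the difference between the latent feature representations of fingerprint examples and normal examples, it detects fingerprint examples and returns to the target model owner labels that differ from the predictions, so that the owner's fingerprint matching method fails. Finally, the performance of the evasion algorithm is evaluated on the CIFAR-10 and CIFAR-100 datasets. The experimental results show that the algorithm's detection rates for fingerprint examples reach 95% and 94%, respectively, while the model owner's fingerprint matching success rate is at most 19%, demonstrating the unreliability of the model fingerprint matching protection method.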

Key words: intellectual property protection, model stealing, model fingerprint, generative adversarial network, evasion algorithm

Abstract: With the successful application of deep neural networks in various fields, protecting the intellectual property of models has become increasingly important. Because training a deep neural network requires substantial computing resources, labor, and time, attackers attempt to build a local substitute model at low cost by stealing the target model's parameters. To protect the intellectual property of model owners, a model fingerprint matching method was recently proposed. It uses fingerprint examples near the model's decision boundary, together with their fingerprints, to check whether a model has been stolen, and it has the advantage of not affecting the performance of the model itself. However, this protection strategy is vulnerable, and we propose an evasion algorithm that successfully bypasses it. The key component of our evasion algorithm is a fingerprint-example detector termed Fingerprint-GAN. Based on the principle of generative adversarial networks (GANs), Fingerprint-GAN first learns the feature representation and distribution of normal examples in a latent space. It then detects fingerprint examples from the difference between their latent feature representations and those of normal examples. Finally, for the detected fingerprint examples it returns labels that differ from the model's predictions, which defeats the fingerprint matching method of the target model owner. Experiments are conducted on CIFAR-10 and CIFAR-100. The results show that the detection rate of the algorithm for fingerprint examples reaches 95% and 94%, respectively, while the model owner's fingerprint matching success rate is at most 19%, which demonstrates the unreliability of the model fingerprint matching protection method.

Key words: intellectual property protection, model stealing, model fingerprints, generative adversarial network, evasion algorithms
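
The evasion step described in the abstract (flag a query as a fingerprint example, then answer with a label that differs from the prediction) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `anomaly_score` is a simple distance-to-mean stand-in for the detector, not the paper's Fingerprint-GAN, and the function names, the threshold, and the choice of the second most likely class as the returned label are hypothetical.

```python
import math

def anomaly_score(latent, normal_mean):
    """Stand-in detector score (assumption, NOT Fingerprint-GAN):
    Euclidean distance between a latent vector and the mean latent
    vector of normal examples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(latent, normal_mean)))

def evasive_label(probs, score, threshold):
    """Return the top-1 class for inputs judged normal. For inputs flagged
    as fingerprint examples (score above threshold), return the second
    most likely class, so the answer differs from the true prediction."""
    # Class indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if score > threshold:
        return order[1]
    return order[0]

# Toy usage with made-up numbers (hypothetical latent vectors and outputs).
normal_mean = [0.0, 0.0, 0.0, 0.0]
probs = [0.1, 0.6, 0.3]  # the stolen model's class probabilities

normal_latent = [0.1, -0.2, 0.0, 0.1]    # close to the normal examples
suspect_latent = [2.0, 2.0, -2.0, 2.0]   # far from the normal examples

label_normal = evasive_label(
    probs, anomaly_score(normal_latent, normal_mean), threshold=1.0)
label_suspect = evasive_label(
    probs, anomaly_score(suspect_latent, normal_mean), threshold=1.0)
```

In this sketch, normal queries still receive the model's real prediction, so ordinary accuracy is unaffected, while flagged queries receive a different label and therefore no longer match the owner's recorded fingerprints.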

CLC Number: