Pan Xudong, Zhang Mi, Yang Min. Fishing Leakage of Deep Learning Training Data via Neuron Activation Pattern Manipulation[J]. Journal of Computer Research and Development, 2022, 59(10): 2323-2337. DOI: 10.7544/issn1000-1239.20220498
(School of Computer Science, Fudan University, Shanghai 200438)
Funds: This work was supported by the National Key Research and Development Program (2021YFB3101200), the National Natural Science Foundation of China (61972099, U1736208, U1836210, U1836213, 62172104, 62172105, 61902374, 62102093, 62102091), and the Natural Science Foundation of Shanghai (19ZR1404800).
The rise of distributed deep learning over open networks brings potential risks of data leakage. As one of the core information media in the construction of distributed learning systems, the training gradient is jointly determined by the model and the local client's training data, and therefore carries the private information of the corresponding user. Research in recent years has consequently uncovered a number of new attack surfaces, among which data reconstruction attacks arguably pose the most severe threat to user privacy: from only the average gradient of a deep neural network (DNN) over a training batch, an attacker can reconstruct every individual sample in the batch with almost no distortion. However, existing data reconstruction attacks mostly stay at a demonstrative, experimental level, and little is known about their underlying mechanism. Although a very recent work proves that a training batch satisfying a certain neuron activation exclusivity condition can be reconstructed within a provable upper bound on the reconstruction error, our empirical results show that realistic batches satisfy the proposed exclusivity condition only with low probability, which makes the theory-oriented attack impractical in the wild. To enhance the effectiveness and coverage of the theory-oriented attack, we propose a novel neuron activation manipulation algorithm based on linear programming, which automatically generates small perturbations to each sample in the target batch so that the batch satisfies the exclusivity condition. The perturbed batch can then be provably reconstructed by the theory-oriented attack, leading to a privacy breach. In practice, by deploying our proposed algorithm at the local client, an honest-but-curious distributed learning server can fish deep data leakage from the average gradients submitted by the clients during training.
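To make the exclusivity condition concrete, the following is a minimal sketch (not the paper's implementation) of how one might test, for a single fully connected ReLU layer, whether every sample in a batch exclusively activates at least one neuron. The layer shapes and the Gaussian data are illustrative assumptions.

```python
import numpy as np

def has_exclusive_activation(batch, W, b):
    """Check whether every sample in the batch activates at least one
    ReLU neuron that no other sample in the batch activates."""
    pre = batch @ W.T + b               # pre-activations, shape (B, num_neurons)
    active = pre > 0                    # ReLU activation pattern
    counts = active.sum(axis=0)         # how many samples activate each neuron
    exclusive = active & (counts == 1)  # neurons activated by exactly one sample
    return bool(exclusive.any(axis=1).all())

# Illustrative layer and batch (random data, for demonstration only).
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 64))          # a wide layer makes exclusivity likely
b = rng.normal(size=256)
batch = rng.normal(size=(4, 64))        # a small random batch
```

Note that a batch containing two identical (or near-duplicate) samples can never satisfy the condition with respect to those samples, since they share the same activation pattern, which illustrates why realistic batches often fail it.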
Extensive experiments on 5 datasets spanning face recognition and intelligent diagnosis applications show that our approach increases the size of provably reconstructable training batches from 8 to practical batch sizes, and accelerates the attack by a factor of 10. Meanwhile, the quality of the reconstructed results is competitive with that of existing data reconstruction attacks.
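Under simplifying assumptions, the kind of activation manipulation described in the abstract can be sketched as a linear program: assign each sample a distinct neuron, then solve for perturbations of minimal L-infinity norm that force each assigned neuron to be activated exclusively by its sample. The greedy neuron assignment, the margin `eps`, and all shapes below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import linprog

def perturb_for_exclusivity(batch, W, b, eps=0.1):
    """Sketch: compute minimal (L-inf) perturbations so that every sample
    exclusively activates one assigned ReLU neuron."""
    B, d = batch.shape
    pre = batch @ W.T + b                      # pre-activations (B, num_neurons)

    # Greedily assign each sample the unused neuron with the largest
    # pre-activation margin over all other samples in the batch.
    assigned, used = [], set()
    for i in range(B):
        margin = pre[i] - np.delete(pre, i, axis=0).max(axis=0)
        k = next(int(k) for k in np.argsort(-margin) if int(k) not in used)
        assigned.append(k); used.add(k)

    n = B * d                                  # one variable per perturbed feature
    c = np.zeros(n + 1); c[-1] = 1.0           # objective: minimize t = ||delta||_inf
    A, ub = [], []
    for v in range(n):                         # elementwise |delta_v| <= t
        for sign in (1.0, -1.0):
            row = np.zeros(n + 1); row[v] = sign; row[-1] = -1.0
            A.append(row); ub.append(0.0)
    for i, k in enumerate(assigned):           # exclusivity of neuron k for sample i
        for j in range(B):
            row = np.zeros(n + 1)
            if j == i:   # require w_k . (x_i + delta_i) + b_k >= eps
                row[j*d:(j+1)*d] = -W[k]
                A.append(row); ub.append(W[k] @ batch[i] + b[k] - eps)
            else:        # require w_k . (x_j + delta_j) + b_k <= -eps
                row[j*d:(j+1)*d] = W[k]
                A.append(row); ub.append(-(W[k] @ batch[j]) - b[k] - eps)

    res = linprog(c, A_ub=np.array(A), b_ub=np.array(ub),
                  bounds=[(None, None)] * n + [(0, None)])
    assert res.success, "LP was infeasible for this neuron assignment"
    return batch + res.x[:n].reshape(B, d)

# Illustrative use on a tiny random layer and batch.
rng = np.random.default_rng(1)
W = rng.normal(size=(16, 8))
b = rng.normal(size=16)
batch = rng.normal(size=(2, 8))
perturbed = perturb_for_exclusivity(batch, W, b)
```

In the abstract's threat model, such perturbations would be introduced at the local client so that the gradients the honest-but-curious server receives come from a batch that provably satisfies the exclusivity condition; the paper's actual algorithm scales this idea to practical batch sizes and architectures.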