Pan Xudong, Zhang Mi, Yang Min. Fishing Leakage of Deep Learning Training Data via Neuron Activation Pattern Manipulation[J]. Journal of Computer Research and Development, 2022, 59(10): 2323-2337. DOI: 10.7544/issn1000-1239.20220498
(School of Computer Science, Fudan University, Shanghai 200438)
Funds: This work was supported by the National Key Research and Development Program (2021YFB3101200), the National Natural Science Foundation of China (61972099, U1736208, U1836210, U1836213, 62172104, 62172105, 61902374, 62102093, 62102091), and the Natural Science Foundation of Shanghai (19ZR1404800).
The rise of distributed deep learning over open networks brings potential risks of data leakage. As one of the core information media in the construction of distributed learning systems, the training gradient is jointly determined by the model and the local client's training data, and therefore carries the private information of the corresponding user. Research in recent years has consequently uncovered a number of new attack surfaces, among which data reconstruction attacks arguably pose the most severe threat to user privacy: from only the average gradient of a deep neural network (DNN) over a training batch, an attacker can reconstruct every individual sample in the batch with almost no distortion. However, existing data reconstruction attacks mostly stay at a demonstrative, experimental level, and little is known about their underlying mechanism. Although a very recent work proves that a training batch satisfying a certain neuron activation exclusivity condition can be reconstructed within a provable upper bound on the reconstruction error, our empirical results show that realistic batches satisfy the proposed exclusivity condition only with low probability, which makes the theory-oriented attack impractical in the wild. To enhance the effectiveness and coverage of the theory-oriented attack, we propose a novel neuron activation manipulation algorithm based on linear programming, which automatically generates small perturbations to each sample in the target batch so that the batch satisfies the exclusivity condition. The perturbed batch can then be provably reconstructed by the theory-oriented attack, leading to a privacy breach. In practice, by deploying our proposed algorithm at the local client, an honest-but-curious distributed learning server can fish deep data leakage from the average gradients submitted by the clients during training.
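To make the exclusivity condition concrete, the following is a minimal sketch (not the paper's implementation) of how one might test, for a single fully connected ReLU layer, whether every sample in a batch exclusively activates at least one neuron. The layer shapes and the Gaussian data are illustrative assumptions.

```python
import numpy as np

def has_exclusive_activation(batch, W, b):
    """Check whether every sample in the batch activates at least one
    ReLU neuron that no other sample in the batch activates."""
    pre = batch @ W.T + b               # pre-activations, shape (B, num_neurons)
    active = pre > 0                    # ReLU activation pattern
    counts = active.sum(axis=0)         # how many samples activate each neuron
    exclusive = active & (counts == 1)  # neurons activated by exactly one sample
    return bool(exclusive.any(axis=1).all())

# Illustrative layer and batch (random data, for demonstration only).
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 64))          # a wide layer makes exclusivity likely
b = rng.normal(size=256)
batch = rng.normal(size=(4, 64))        # a small random batch
```

Note that a batch containing two identical (or near-duplicate) samples can never satisfy the condition with respect to those samples, since they share the same activation pattern, which illustrates why realistic batches often fail it.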
Extensive experiments on 5 datasets spanning face recognition and intelligent diagnosis applications show that our approach increases the size of provably reconstructable training batches from 8 to practical batch sizes, and accelerates the attack by a factor of 10. Meanwhile, the quality of the reconstructed results is competitive with that of existing data reconstruction attacks.
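Under simplifying assumptions, the kind of activation manipulation described in the abstract can be sketched as a linear program: assign each sample a distinct neuron, then solve for perturbations of minimal L-infinity norm that force each assigned neuron to be activated exclusively by its sample. The greedy neuron assignment, the margin `eps`, and all shapes below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import linprog

def perturb_for_exclusivity(batch, W, b, eps=0.1):
    """Sketch: compute minimal (L-inf) perturbations so that every sample
    exclusively activates one assigned ReLU neuron."""
    B, d = batch.shape
    pre = batch @ W.T + b                      # pre-activations (B, num_neurons)

    # Greedily assign each sample the unused neuron with the largest
    # pre-activation margin over all other samples in the batch.
    assigned, used = [], set()
    for i in range(B):
        margin = pre[i] - np.delete(pre, i, axis=0).max(axis=0)
        k = next(int(k) for k in np.argsort(-margin) if int(k) not in used)
        assigned.append(k); used.add(k)

    n = B * d                                  # one variable per perturbed feature
    c = np.zeros(n + 1); c[-1] = 1.0           # objective: minimize t = ||delta||_inf
    A, ub = [], []
    for v in range(n):                         # elementwise |delta_v| <= t
        for sign in (1.0, -1.0):
            row = np.zeros(n + 1); row[v] = sign; row[-1] = -1.0
            A.append(row); ub.append(0.0)
    for i, k in enumerate(assigned):           # exclusivity of neuron k for sample i
        for j in range(B):
            row = np.zeros(n + 1)
            if j == i:   # require w_k . (x_i + delta_i) + b_k >= eps
                row[j*d:(j+1)*d] = -W[k]
                A.append(row); ub.append(W[k] @ batch[i] + b[k] - eps)
            else:        # require w_k . (x_j + delta_j) + b_k <= -eps
                row[j*d:(j+1)*d] = W[k]
                A.append(row); ub.append(-(W[k] @ batch[j]) - b[k] - eps)

    res = linprog(c, A_ub=np.array(A), b_ub=np.array(ub),
                  bounds=[(None, None)] * n + [(0, None)])
    assert res.success, "LP was infeasible for this neuron assignment"
    return batch + res.x[:n].reshape(B, d)

# Illustrative use on a tiny random layer and batch.
rng = np.random.default_rng(1)
W = rng.normal(size=(16, 8))
b = rng.normal(size=16)
batch = rng.normal(size=(2, 8))
perturbed = perturb_for_exclusivity(batch, W, b)
```

In the abstract's threat model, such perturbations would be introduced at the local client so that the gradients the honest-but-curious server receives come from a batch that provably satisfies the exclusivity condition; the paper's actual algorithm scales this idea to practical batch sizes and architectures.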