
    Fishing Leakage of Deep Learning Training Data via Neuron Activation Pattern Manipulation

    Abstract: The rise of distributed deep learning over open networks brings potential risks of data leakage. As a core information medium in distributed model construction, the training gradient is the joint product of the model and the local clients' training data, and therefore carries private information about the participating users. Consequently, recent research has uncovered a series of new attacks on training gradients, among which data reconstruction attacks pose arguably the most severe threat to user privacy: from nothing but the average gradient of a deep neural network (DNN) on a training batch, an attacker can reconstruct every individual sample in the batch with almost no distortion. However, existing data reconstruction attacks mostly stop at attack design and experimental validation, and little is known about the mechanisms behind the key experimental observations. A recent study shows that a training batch of arbitrary size satisfying a certain neuron activation exclusivity condition can be reconstructed at the pixel level from the training gradient, with a provable upper bound on the reconstruction error; yet our empirical results show that the proportion of realistic training batches satisfying this condition is small, so the attack poses little practical threat in the wild. To enhance the effectiveness and coverage of this theory-oriented attack, we propose a neuron activation pattern manipulation algorithm based on linear programming, which automatically generates small perturbations for every sample in a target batch so that the perturbed batch satisfies the exclusivity condition and can thus be provably reconstructed, leading to privacy breach. In practice, by deploying the proposed algorithm at the local clients, an honest-but-curious distributed learning server can fish deep data leakage, with a theoretical guarantee of reconstructability, from the average gradients submitted by the clients during training. Extensive experiments on 5 datasets spanning face recognition and intelligent diagnosis show that the proposed approach increases the size of reconstructable training batches from 8 to practical batch sizes and accelerates the attack by more than 10 times, while the reconstruction quality remains on par with that of existing data reconstruction attacks.
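    Because the pre-activation of a first fully connected layer is affine in the input, constraining which samples do or do not activate a given ReLU neuron yields constraints that are linear in the input perturbations, which is why the perturbation search can be cast as a linear program. The sketch below illustrates this reduction only; it is not the authors' released implementation. The one-pre-assigned-neuron-per-sample encoding of exclusivity (`neuron_of`), the shared L-infinity budget objective, and the `margin` parameter are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's exact formulation): enforce that each
# sample in a batch exclusively activates one pre-assigned first-layer ReLU neuron,
# using the smallest possible shared L_inf perturbation budget, via linear programming.
import numpy as np
from scipy.optimize import linprog

def exclusivity_perturbation(X, W, b, neuron_of, margin=1e-3):
    """Return per-sample perturbations D (B x d) minimizing a shared bound t such that,
    for every sample i with assigned neuron j = neuron_of[i]:
        W[j] @ (X[i] + D[i]) + b[j] >=  margin   (neuron j fires on sample i)
        W[j] @ (X[k] + D[k]) + b[j] <= -margin   (and stays silent on every k != i)
    X: batch (B x d); W, b: first-layer weights (m x d) and bias (m,)."""
    B, d = X.shape
    n_var = B * d + 1                      # flattened perturbations + scalar bound t
    c = np.zeros(n_var)
    c[-1] = 1.0                            # objective: minimize t

    A_rows, b_rows = [], []

    def delta_row(i, coeff):
        """A constraint row acting as coeff @ D[i] (zeros elsewhere)."""
        row = np.zeros(n_var)
        row[i * d:(i + 1) * d] = coeff
        return row

    # L_inf budget:  D[i, p] <= t  and  -D[i, p] <= t  for every entry.
    for i in range(B):
        for sign in (+1.0, -1.0):
            for p in range(d):
                row = np.zeros(n_var)
                row[i * d + p] = sign
                row[-1] = -1.0
                A_rows.append(row)
                b_rows.append(0.0)

    # Exclusivity constraints: linear in D because the pre-activation W x + b is affine.
    for i in range(B):
        j = neuron_of[i]
        # Activation on sample i:  -W[j] @ D[i] <= W[j] @ X[i] + b[j] - margin
        A_rows.append(delta_row(i, -W[j]))
        b_rows.append(W[j] @ X[i] + b[j] - margin)
        # Silence on every other sample k:  W[j] @ D[k] <= -margin - W[j] @ X[k] - b[j]
        for k in range(B):
            if k != i:
                A_rows.append(delta_row(k, W[j]))
                b_rows.append(-margin - W[j] @ X[k] - b[j])

    bounds = [(None, None)] * (B * d) + [(0.0, None)]   # D unbounded, t >= 0
    res = linprog(c, A_ub=np.array(A_rows), b_ub=np.array(b_rows),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError("LP infeasible for this neuron assignment: " + res.message)
    return res.x[:B * d].reshape(B, d)
```

    In a realistic setting the neuron assignment itself must be chosen (or searched over) so that the program is feasible; if `linprog` reports infeasibility, a different assignment or a wider over-parameterized layer can be tried, and the resulting perturbations remain small by construction because their L_inf norm is exactly the minimized objective.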

       
