Abstract:
In the field of distributed artificial intelligence, federated learning models, like centrally trained models, are vulnerable to adversarial examples at inference time. Federated adversarial training remains underexplored and faces two major challenges: 1) a trade-off between accuracy on clean samples and adversarial robustness makes it difficult to improve both simultaneously; 2) non-independent and identically distributed (Non-IID) data limit the performance gains of federated adversarial training. To address these challenges, we propose BTFAT, a framework that breaks the trade-off between robustness and accuracy in federated adversarial training. The framework includes: 1) a decision-space tightening algorithm that performs initial intra-class localization using labels while shrinking intra-class sample distances and enlarging inter-class sample distances, thereby improving both robustness and accuracy; and 2) a weight-penalty optimization algorithm that treats the global model weights as the unified optimization target and penalizes local adversarial training that deviates excessively from them, helping the decision-space algorithm counter the impact of Non-IID data distributions. We theoretically analyze the key factors limiting the robustness and accuracy gains of adversarial training, as well as the convergence of BTFAT. We further demonstrate experimentally that BTFAT outperforms state-of-the-art baselines in overall performance, convergence, time cost, and handling of Non-IID data, providing a new perspective for research on federated adversarial training. Our code is available at: https://anonymous.4open.science/r/BTFAT-11265.