
FedDW: Distilling Weights through Consistency Optimization in Heterogeneous Federated Learning

Abstract: Federated learning is a classic distributed machine learning paradigm that enables collaborative model training without centralizing data, which gives it clear advantages for preserving data privacy. However, pronounced data heterogeneity across clients and the growing scale of federations pose serious challenges to training efficiency and model performance. Previous studies have shown that, under independent and identically distributed (IID) data, a model's parameter structure typically satisfies certain consistency relationships, and that these relationships are often preserved in the intermediate results of neural network training. If such consistency relationships can be identified and regularized under non-IID data, the parameter distribution can be aligned toward the IID case, mitigating the impact of data heterogeneity. Building on this idea, this paper first introduces the concept of deep learning encrypted data and, based on it, proposes a consistency optimization paradigm; it then exploits the consistency relationship between soft labels and the classification-layer weight matrix to construct a new federated learning framework, FedDW. Experiments are conducted on four public datasets and multiple neural network architectures, including ResNet and ViT. The results show that, under highly heterogeneous data settings, the proposed method improves average accuracy by about 3% over ten state-of-the-art federated learning methods. In addition, the paper proves theoretically that the method is more training-efficient, with negligible additional back-propagation overhead.
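The abstract does not spell out the regularizer itself, so the following is only a minimal PyTorch sketch of one plausible way a consistency term between per-class mean soft labels and the classification-layer weight matrix could be attached to a client's local loss. The function name `consistency_penalty`, the C x C comparison against a row-normalized weight-similarity matrix, the temperature, and the use of `model.fc.weight` are illustrative assumptions, not the FedDW formulation.

```python
# Hypothetical sketch (not the paper's exact method): regularize agreement between
# per-class mean soft labels and the similarity structure of the classifier weights.
import torch
import torch.nn.functional as F


def consistency_penalty(logits: torch.Tensor,
                        labels: torch.Tensor,
                        classifier_weight: torch.Tensor,
                        num_classes: int,
                        temperature: float = 2.0) -> torch.Tensor:
    """Penalize disagreement between class-wise mean soft labels and classifier rows."""
    soft = F.softmax(logits / temperature, dim=1)                 # (B, C) soft labels
    onehot = F.one_hot(labels, num_classes).float()               # (B, C)
    counts = onehot.sum(dim=0).clamp(min=1.0)                     # (C,)
    # Mean soft label per class observed in this mini-batch.
    class_mean_soft = (onehot.t() @ soft) / counts.unsqueeze(1)   # (C, C)

    # Row-normalized similarity of classification-layer weights, mapped to the
    # same C x C scale via a softmax, so the two matrices are comparable.
    w = F.normalize(classifier_weight, dim=1)                     # (C, d)
    weight_sim = F.softmax(w @ w.t() / temperature, dim=1)        # (C, C)

    # Only penalize classes actually present in this client's mini-batch.
    present = (onehot.sum(dim=0) > 0).float().unsqueeze(1)        # (C, 1)
    return ((class_mean_soft - weight_sim).pow(2) * present).sum() / present.sum().clamp(min=1.0)


# Illustrative use inside a client's local training step (model.fc assumed to be
# the classification layer; lam is a tunable weight for the consistency term):
#   loss = F.cross_entropy(logits, labels) + lam * consistency_penalty(
#       logits, labels, model.fc.weight, num_classes=10)
```

Because the penalty only adds a small C x C comparison on top of quantities already computed in the forward pass, its extra back-propagation cost is minor, which is consistent with the efficiency claim in the abstract.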

       
