Abstract:
Federated learning is a classic distributed machine learning paradigm that enables collaborative model training without centralizing data, offering strong guarantees for data privacy. However, it faces substantial challenges in training efficiency and model performance due to pronounced data heterogeneity across clients and the growing scale of federated networks. Existing studies have shown that, in independent and identically distributed (IID) settings, the model's parameter structure typically adheres to specific consistency relations, which are often preserved in the intermediate results produced during neural network training. If these consistency relations can be identified and enforced through regularization under non-independent and identically distributed (non-IID) data, the parameter distribution can be aligned with that of the IID case, thereby mitigating the impact of data heterogeneity. Based on this idea, we introduce the concept of deep learning on encrypted data and propose a consistency optimization paradigm. We further investigate the consistency relationship between data soft labels and the weight matrix of the model's classification layer, and from it construct a new heterogeneous federated learning framework, FedDW. Experiments are conducted on four public datasets with several neural network models, including ResNet and ViT. The results show that, under highly heterogeneous client data settings, FedDW outperforms 10 state-of-the-art federated learning methods, with an average accuracy improvement of approximately 3%. Moreover, we theoretically prove that FedDW offers higher training efficiency, with negligible additional backpropagation overhead.
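To make the core idea more concrete, below is a minimal PyTorch-style sketch of the kind of consistency regularization the abstract describes: a penalty added to each client's local objective that encourages the class-similarity structure of the classification layer's weight matrix to agree with per-class average soft labels. The function names (consistency_regularizer, local_step), the soft_label_prototypes bookkeeping, the assumption that the model exposes its classifier as model.fc, the penalty's exact form, and the weight lam are all illustrative assumptions, not the actual FedDW objective.

    import torch.nn.functional as F

    def consistency_regularizer(classifier_weight, soft_label_prototypes):
        # classifier_weight: (C, D) weight matrix of the final classification layer.
        # soft_label_prototypes: (C, C) per-class average soft labels collected during
        # local training (hypothetical bookkeeping; how FedDW obtains them is not
        # specified in the abstract).
        # Penalize disagreement between the class-similarity structure of the weight
        # matrix and the structure implied by the soft labels.
        w = F.normalize(classifier_weight, dim=1)
        weight_similarity = F.softmax(w @ w.t(), dim=1)  # (C, C) row-normalized similarities
        return F.mse_loss(weight_similarity, soft_label_prototypes)

    def local_step(model, batch, soft_label_prototypes, lam=0.1):
        # Standard local cross-entropy loss plus the consistency penalty.
        x, y = batch
        logits = model(x)
        loss = F.cross_entropy(logits, y)
        loss = loss + lam * consistency_regularizer(model.fc.weight, soft_label_prototypes)
        return loss

A federated variant would apply this penalty during each client's local update rounds; how the server aggregates models and any shared soft-label statistics goes beyond what the abstract specifies and is omitted here.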