Synergistic Optimization Method for Adaptive Hierarchical Federated Learning in Heterogeneous Edge Environments

冯奕铭; 钱珍; 李光辉; 代成龙

doi:10.7544/issn1000-1239.202550146


Abstract

Traditional federated learning faces device heterogeneity, data heterogeneity, and communication resource constraints in real-world deployments. Heterogeneous end devices make collaborative training inefficient, while data heterogeneity, which covers differences in both data volume and feature distribution, degrades the accuracy of the global model and weakens its generalization. To make effective use of the computation, communication, and data resources of end devices in heterogeneous edge networks, we propose an adaptively optimized hierarchical federated learning (HFL) method. Under hardware resource constraints, communication resource constraints, and non-independent and identically distributed (Non-IID) data, the method combines model partitioning with client selection to accelerate federated training and to improve model accuracy and adaptability across heterogeneous environments. To capture how consistently each client's data influences the global model, a data contribution metric is introduced to measure the effect of each local model on the global model. Using deep reinforcement learning (DRL), an agent learns before each training round, from the system's resource distribution (e.g., bandwidth, device status) and the local data contribution scores, which set of clients to select and how to partition the model between edge and end devices, thereby accelerating local training and global model convergence. Simulation results show that the proposed method significantly outperforms baseline methods in both model accuracy and training efficiency, and that it remains robust and adaptable across different heterogeneous environment configurations.
