ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2022, Vol. 59 ›› Issue (2): 478-487. doi: 10.7544/issn1000-1239.20200668

• Network Technology •

Multi-Source Heterogeneous Data Fusion Based on Federated Learning

Mo Huiling, Zheng Haifeng, Gao Min, Feng Xinxin

  1. (College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108) (1298948502@qq.com)
  • Published: 2022-02-01
  • Online: 2022-02-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61971139).

Abstract: With the rapid development of technology, the number of network edge devices with computing and storage capabilities keeps increasing, and the volume of data they generate is growing exponentially. This makes it difficult for a centralized processing model built around cloud computing to process the data generated by edge devices efficiently: network delay increases, data may be leaked on the upload link, and data security cannot be guaranteed. In addition, owing to the diversity of edge devices and the continuous enrichment of data representation methods, multi-modal data is widespread, and processing the multi-source heterogeneous data collected by different edge devices has become an urgent problem in big data research. To make full use of the heterogeneous data on edge devices and to solve the “data communication barrier” problem caused by data privacy in edge computing, we propose a novel fusion algorithm for multi-source heterogeneous data based on Tucker decomposition in federated learning. To fuse heterogeneous data without interaction in federated learning, the proposed algorithm introduces tensor Tucker decomposition theory and constructs a high-order tensor with heterogeneous spatial dimensions to capture the high-dimensional features of the heterogeneous data, thereby fusing multi-source heterogeneous data in federated learning. Finally, the effectiveness of the algorithm is verified on the MOSI dataset.

Key words: edge computing, federated learning, deep learning, tensor theory, heterogeneous data fusion
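
The abstract names Tucker decomposition of a high-order tensor as the core tool but does not spell out the construction. As a rough illustration only, the following NumPy sketch computes a Tucker decomposition with truncated higher-order SVD (HOSVD), one standard method. The three feature vectors, their dimensions, the outer-product construction of the tensor, and all function names are assumptions made for this example; they are not taken from the paper and should not be read as the authors' algorithm.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Mode-n product: apply `matrix` (J x I_n) along axis `mode` of `tensor`."""
    moved = np.moveaxis(tensor, mode, 0)               # shape (I_n, ...)
    result = np.tensordot(matrix, moved, axes=(1, 0))  # shape (J, ...)
    return np.moveaxis(result, 0, mode)

def tucker_hosvd(tensor, ranks):
    """Truncated HOSVD: returns a core tensor and one factor matrix per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])                    # leading left singular vectors
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)           # project onto each subspace
    return core, factors

def reconstruct(core, factors):
    """Rebuild the full tensor from the core and the factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Hypothetical per-modality feature vectors with different dimensions
# (names and sizes are assumptions for illustration).
rng = np.random.default_rng(0)
text_feat = rng.standard_normal(8)
audio_feat = rng.standard_normal(5)
video_feat = rng.standard_normal(6)

# Assumed fusion step: the outer product gives an order-3 tensor whose
# modes keep the differing sizes of the three modalities.
fused = np.einsum("i,j,k->ijk", text_feat, audio_feat, video_feat)

core, factors = tucker_hosvd(fused, ranks=(2, 2, 2))
approx = reconstruct(core, factors)
rel_error = np.linalg.norm(fused - approx) / np.linalg.norm(fused)
print(core.shape, [f.shape for f in factors], rel_error)
```

Because the example tensor is a single outer product, it has multilinear rank (1, 1, 1), so the rank-(2, 2, 2) approximation reproduces it up to floating-point error. Real multi-modal features would generally need larger ranks and would lose some information under truncation.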

CLC Number: