面向边缘智能的联邦学习综述

张雪晴; 刘延伟; 刘金霞; 韩言妮

doi:10.7544/issn1000-1239.202111100

面向边缘智能的联邦学习综述

张雪晴^{1, 2,},
刘延伟^1, ,,
刘金霞³,
韩言妮¹

1.
中国科学院信息工程研究所　北京　100093
2.
中国科学院大学网络空间安全学院　北京　100049
3.
浙江万里学院　浙江宁波　315100

基金项目: 国家自然科学基金项目（61771469）；重庆市属本科高校与中国科学院所属院所合作项目（HZ2021015）

详细信息

作者简介:
张雪晴: 1994年生. 硕士. 主要研究方向为机器学习

刘延伟: 1976年生. 博士，副研究员. CCF会员. 主要研究方向为无线通信、智能多媒体信息处理和网络安全

刘金霞: 1969年生. 硕士，教授. 主要研究方向为无线通信和边缘智能

韩言妮: 1981年生. 博士，副研究员. 主要研究方向为无线通信和智能数据分析

通讯作者:
刘延伟（liuyanwei@iie.ac.cn）

中图分类号: TP3
计量
- 文章访问数: 1164
- HTML全文浏览量: 179
- PDF下载量: 595
出版历程
- 收稿日期: 2021-11-07
- 修回日期: 2022-09-15
- 网络出版日期: 2023-03-16
- 刊出日期: 2023-05-31

An Overview of Federated Learning in Edge Intelligence

1.
Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093
2.
School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049
3.
Zhejiang Wanli University, Ningbo, Zhejiang 315100

Funds: This work was supported by the National Natural Science Foundation of China (61771469) and the Cooperation Project Between Chongqing Municipal Undergraduate Universities and Institutes Affiliated to CAS (HZ2021015).

More Information

Author Bio:
Zhang Xueqing: born in 1994. Master. Her main research interest includes machine learning

Liu Yanwei: born in 1976. PhD, associate professor. Member of CCF. His main research interests include wireless communication, intelligent multimedia processing, and cyber security

Liu Jinxia: born in 1969. Master, professor. Her main research interests include wireless communication and edge intelligence

Han Yanni: born in 1981. PhD, associate professor. Her main research interests include wireless communication and intelligent data analysis

摘要

摘要:
随着边缘智能需求的快速增长，联邦学习（federated learning，FL）技术在产业界受到了极大的关注. 与传统基于云计算的集中式机器学习相比，边缘网络环境下联邦学习借助移动边缘设备共同训练机器学习模型，不需要把大量本地数据发送到云端进行处理，缩短了数据处理计算节点与用户之间的距离，在满足用户低时延需求的同时,用户数据可以在本地训练进而实现数据隐私保护. 在边缘网络环境下，由于通信资源和计算资源受限，联邦学习的性能依赖于无线网络状态、终端设备资源以及数据质量的综合限制. 因此，面向边缘智能应用，首先分析了边缘智能环境下高效联邦学习面临的挑战，然后综述联邦学习在客户端选择、模型训练与模型更新等关键技术方面的研究进展，最后对边缘智能联邦学习的发展趋势进行了展望.
- 联邦学习 /
- 边缘计算 /
- 边缘智能 /
- 模型聚合 /
- 资源受限
Abstract:
With the increasing demand of edge intelligence, federated learning (FL) has been now of great concern to the industry. Compared with the traditionally centralized machine learning that is mostly based on cloud computing, FL collaboratively trains the neural network model over a large number of edge devices in a distributed way, without sending a large amount of local data to the cloud for processing, which makes the compute-extensive learning tasks sunk to the edge of the network closed to the user. Consequently, the users’ data can be trained locally to meet the needs of low latency and privacy protection. In mobile edge networks, due to the limited communication resources and computing resources, the performance of FL is subject to the integrated constraint of the available computation and communication resources during wireless networking, and also data quality in mobile device. Aiming for the applications of edge intelligence, the tough challenges for seeking high efficiency FL are analyzed here. Next, the research progresses in client selection, model training and model updating in FL are summarized. Specifically, the typical work in data unloading, model segmentation, model compression, model aggregation, gradient descent algorithm optimization and wireless resource optimization are comprehensively analyzed. Finally, the future research trends of FL in edge intelligence are prospected.
- federated learning /
- edge computing /
- edge intelligence /
- model aggregation /
- resource constraints

HTML全文

嵌入式实时系统广泛存在于生产和生活当中，例如航空航天、轨道交通、无人驾驶、互联通信等.在实时系统中，其正确性不仅仅依赖于逻辑结果的正确性，同时还依赖于结果产生的时间^[1].实时系统中的任务执行时间具有严格的约束，如果错过任务截止时间将会导致灾难性后果.因此，实时系统在任务调度设计和可调度性分析时都必须保证任务的执行在一个安全的时间上界之内，此上界的计算方法即任务的最坏情况执行时间（worst-case execution time，WCET）分析.当前主流的WCET分析方法分为静态分析和动态分析，动态分析也称为基于测试的分析方法，通过大量的测试用例来获取精确的WCET分析结果，由于无法穷尽所有的测试用例，动态分析无法保证分析结果的安全性，工业界通常会在分析结果的基础上增加一定的裕量（例如20%）^[2]作为最终的WCET分析结果.静态分析方法则是通过对处理器硬件结构和程序流信息进行分析估算程序的WCET值，静态分析方法能够保证分析结果的安全性，并且不需要运行程序便可获得WCET结果，所以大多数WCET分析工具都采用静态分析方法.单核处理器WCET分析方法经过多年的研究已经取得较高的分析精度，然而多核处理器中加速部件的使用和共享资源干扰的存在，导致原有的单核处理器WCET分析方法无法直接适用于多核处理器，这对于多核处理器的WCET分析提出了新的挑战．

在硬件结构方面，多核处理器的WCET分析主要考虑Cache、内存、核间互联和流水线等部件的影响^[3]，Rapita Systems公司的实验表明，由于L3 Cache的争用导致YOLO算法的帧率降低90%左右.当前对于Cache的研究主要集中在指令Cache带来的干扰分析^[4-7]，并已经获得了较高的分析精度，而对于数据Cache的关注则不足，其原因有2方面：1）处理器寻址方式的不同导致数据Cache的访问地址分析更加困难；2）数据Cache在循环的不同轮次中访问的数据不同，正是由于数据Cache和指令Cache在时空特性上的差异，造成数据Cache的分析更加复杂.如果简单将指令Cache的分析技术套用到数据Cache上，会导致分析结果过于悲观^[8].对于数据Cache的分析除了以上问题之外，另一个需要特别关注的影响因素就是数据一致性协议.有研究表明，由于Cache一致性协议的影响，并行程序执行比串行程序慢74%^[9].但是相关的调研表明^[10]，极少数研究人员关注Cache一致性造成的干扰．

针对现有多核处理器WCET分析方法对数据Cache一致性协议考虑不足的问题，本文主要研究一种基于多级一致性协议的多核处理器WCET分析方法.该方法通过建立多级一致性域，抽象出一致性协议嵌套的共享数据管理机制，分别从一致性域内部、跨一致性域2个层面来分析内存访问延迟，从而实现对多核处理器中数据Cache的精确分析．

1. 相关工作

Ferdinand等人^[11]提出基于抽象解释理论^[12]的Cache行为分析方法，由于其具有较高的精确度和效率，在基于Cache的WCET分析中占据了统治地位^[8].如图1所示，基于抽象解释的Cache分析方法对Cache的物理状态和替换策略进行抽象，得到抽象缓存状态（abstract cache state）和函数Join、函数Update，通过May分析、Must分析和Persistence分析3种方法得出Cache的“Cache命中/失效分类”（cache hit miss classification, CHMC），从而确定Cache的访问延迟.Huynh等人^[13]经过分析证明了原有的Persistence分析方法存在不安全性，其原因在于更新函数 ${l_h} \mapsto \hat s({l_{h - 1}}) \cup (\hat s({l_h}) - \{ m\} )$ 未将可能被替换出去的内存块的年龄进行增加，导致分析结果不正确.为解决此问题，该小组提出Younger Set的概念，并将Younger Set引入到函数Join和函数Update中，保证了Persistence分析的安全性.在过去的20多年，研究人员对Persistence分析进行了广泛的研究，仍然不能保证其发现所有的Persistent Cache块，直到Stock^[14]给出Exact Persistence分析方法，才使得Persistence分析趋于完善．

图 1 基于抽象解释的Cache分析方法

Figure 1. Cache analysis based on abstract interpretation

下载: 全尺寸图片幻灯片

模型检测^[15]方法是另一个广泛研究的Cache行为分析方法，模型检测作为实时系统中常用的时序验证手段，将各类时序分析问题转化为时间自动机的可达性问题，从而实现对问题的求解.Wilhelm^[16]讨论了模型检测技术在WCET分析中的优缺点，Metzner^[17]认为模型检测方法能够提高分析结果的精度.Dalsgaard等人^[18]基于模型检测方法实现了模块化任务执行时间分析框架；Gustavsson等人^[19]的研究表明，UPPAAL可以用于多核处理器中并行任务的WCET分析，Cassez等人^[20]将WCET分析问题转换为寻找最长路径的问题，并通过扩展模型检测工具UPPAAL实现了一个任务WCET分析工具WUPPAAL；Lü等人^[21]在开源工具Chronos的基础上，将模型检测的方法用于Cache的行为分析，实现了一个多核处理器WCET分析工具McAiT；陈芳园等人^[22]改进多核处理器中Cache冲突分析方法，在取指阶段考虑取指操作之间的时序关系，该方法能够排除部分非干扰状态，使得WCET计算精度最高提高12%.上述对于Cache行为分析多是针对指令Cache展开的，而对于数据Cache的分析研究较少，尤其是对于多核处理器中Cache一致性协议的讨论更为少见.直到2015年，Uhrig等人^[23]研究数据Cache对多核处理器WCET分析的影响，并研究了基于监听的一致性协议和TDMA总线对多核处理器中延迟时间的影响，最终得出结论：采用写失效的基于总线监听的一致性协议不具备可预测性，而采用写更新双通道直接映射的bus-snarfing一致性协议配合TDMA总线能够实现时间可预测.但文献[23]未对Cache一致性协议带来的内存访问延迟进行更为细致的分析.而Hassan等人^[24]的分析也得出了基于总线监听MSI一致性协议配合TDMA总线的多核处理器的WCET值不具有可预测性，重点分析了TDMA总线导致的不可预测性，而对于多核处理器中基于MESI一致性协议的WCET估计，同样未给出详细的分析方法.Tsoupidi^[25]针对对称多处理器，提出一种考虑Cache一致性协议的WCET分析方法，该方法在计算共享Cache读访问延迟时认为 ${h'_{{\text{sh}}}} = \max ({N_{\text{c}}},{h_{{\text{sh}}}})$ ，上述情况仅发生在Cache使用写回（write-back）策略时，对于写直达（write-through）策略，数据同时存在于私有Cache和共享Cache中，该计算方式存在较大的悲观性，且该文仅仅讨论了单级一致性协议下的WCET分析方法，而对于多核处理器中存在多级Cache一致性协议的情况，尚未见到相关的研究．

为此，本文针对数据Cache中共享数据的访问，提出了基于多级一致性协议的多核处理器WCET分析方法，解决多级一致性协议嵌套情况下的WCET分析问题，完善了多核处理器Cache的分析框架，并实现一个基于MESI（modify exclusive shared invalid）协议的高精度多核处理器WCET分析工具，为多核实时系统的WCET分析提供了支撑．

2. 基于MESI协议的跨一致性域WCET分析方法

本节的核心工作在于提出一种多级一致性域的WCET分析方法，实现对多核处理器中多级一致性域的WCET分析.通过对一致性域内部的Cache访问延迟和状态转换情况进行分析，得出一致性域内部的分析结果，根据是否进行跨域数据访问决定下一步是否需要进行跨域分析，从而实现一致性协议嵌套情况下的共享数据访问延迟分析.本文将该方法与当前主要考虑指令Cache的分析方法相结合，增加了数据Cache中共享数据的访问分析，扩展了多核处理器中Cache的干扰分析框架，进一步完善了多核处理器WCET分析方法．

2.1 Cache组织结构

共享存储结构是当前应用最为广泛的多核处理器架构，此种结构对于每一个内核而言都是对等的，每一个内核的访问时间都是相同的，因此多核处理器属于对称多处理器（symmetric multi-processor，SMP）.最初的多核处理器中只存在私有Cache和共享Cache，随着CPU制造工艺的发展，当前多核处理器已集成多级Cache．

采用SMP架构的多处理器需要支持共享数据的Cache，通过读写共享数据实现内核之间的通信．在缓存共享数据时，可能会存在多个内核对共享数据的争用，也可能会在多个Cache中复制共享数据，导致多个数据副本不一致，因此共享数据Cache引入了一个新的问题——Cache一致性问题，需要引入一致性协议保证数据一致性．

MESI是一种典型的Cache一致性协议，也被称为Illinois MESI协议，是在原MSI协议的基础上引入E状态，用于表示独占状态．在写入一个共享数据块时，通常有2种写策略：写直达和写回.对于图2所示的多级Cache组织结构，由于写回策略对存储器的带宽要求低，能够支持更多、更快速的处理器，所以最外层级别（簇（cluster）间）通常采用写回策略^[3]；而对于簇内的写入共享数据的策略，如果采用写回策略，会造成簇内L1 Cache中数据和L2 Cache中的数据不一致的情况，在维护簇内数据一致性的同时，还要考虑簇间一致性，造成一致性协议非常复杂，不利于硬件的实现，因此簇内采用写直达的策略．

图 2 多级Cache组织结构

Figure 2. Multi-level Cache arrangement

下载: 全尺寸图片幻灯片

写直达和写回适用于访问命中的情况，当访问缺失时所采用的写策略为写分配（write allocation）和非写分配（no write allocation）两种方式，通常写回与写分配策略配合使用，写直达和非写分配策略配合使用．

2.2 一致性域

多核处理器任务调度时存在多个内核共同完成一个任务，或者将某些关键等级高的任务映射到特定的内核上，因此，这些任务之间的数据交换仅仅在特定内核之间进行，如果使用一个一致性协议维护所有内核的数据副本，将会产生大量的数据交互操作，造成网络拥塞，降低执行效率, 一种可行的方法是将所有的内核进行区域划分^[26]，每个区域使用独立的一致性协议管理数据副本，该方法既能充分利用局部性原理，又能降低网络负载．

定义1. 一致性域.使用相同一致性协议管理数据副本的内核集合，称之为一致性域．

一致性域的示意图如图3所示.

图 3 一致性域

Figure 3. Coherence domain

下载: 全尺寸图片幻灯片

一致性域具有2个性质：

性质1.一致性域具有独立性，即任意2个一致性域之间不存在交集．

性质2.一致性协议在一致性域内具有唯一性．

2.3 多级一致性域

由于一致性协议只能维护域内数据的一致性，当存在跨域数据交互时，会导致无法保证域间数据的一致性，需要使用多级一致性协议进行数据维护，相应地，需要定义多级一致性域.多级一致性域采用分层划域的思想，下层一致性域是上层一致性域的一个结点（clump）．

针对如图2所示的硬件架构，根据一致性域的定义，簇内为1级一致性域，整个多核处理器构成2级一致性域.处理器内核在访问数据时，首先会访问私有Cache，如果访问缺失，则会访问下一级共享Cache，即簇内共享Cache，此时数据访问范围仍在一致性域内.如果访问命中，只需进行域内WCET分析即可；如果访问缺失，根据存储器层次结构访问次序，则需要进行跨一致性域WCET分析.

2.4 关键参数说明

在进行多核处理器的Cache行为分析时，假设总线是完美的并且不存在读和写队列的限制，在此基础上分析Cache一致性协议对内存访问时间的影响.

首先假设处理器拥有私有Cache、共享Cache和内存，Cache采用写回和写分配策略，Cache的访问分为读和写，读和写都会出现访问命中或缺失，我们从读和写2个方面对Cache行为进行分析，在开始分析之前定义如下符号：

1） ${\psi _{\text{R}}}(m,m')$ / ${\psi _{\text{W}}}(m,m')$ .表示域内Cache读/写访问时间延迟和状态转换情况，m表示被访问的数据，m'表示被访问数据的副本.

2） ${H_{{\text{L}}i}}$ .表示在第i级Cache访问命中时间延迟.

3） ${L_{{\text{L}}i \to {\text{L}}j}}$ .表示数据从第i级Cache加载到第j级Cache的时间延迟.

4） ${H_{{\text{memory}}}}$ .表示主存访问命中时间延迟.

5） $Inv$ / $Inv'$ .表示域内/跨域使其他数据副本失效的时间延迟.

6） ${\psi '_{\text{R}}}(m,m')$ / ${\psi '_{\text{W}}}(m,m')$ .表示跨域Cache读/写访问时间延迟和状态转换情况.

7） ${m_{\text{s}}}$ / ${m'_{\text{s}}}$ .表示本地/远程Cache状态.

8） ${\text{Replacement}}$ / $\neg {\text{Replacement}}$ .表示被访问共享数据被替换出Cache，导致替换缺失发生/未发生．

2.5 考虑一致性协议的WCET分析框架

现有工作对于Cache的分析主要集中在指令Cache，对于数据Cache的分析也仅限于本地数据的分析.本文在基于抽象解释的多级Cache分析框架基础上，扩展了共享数据分析这一功能．

如图4所示，该分析框架的输入为待分析程序和Cache模型.其中待分析程序提供控制流、地址信息，Cache模型中包括了Cache组织结构、替换策略、读写策略以及一致性协议，这些参数作为分析框架的输入信息，提供给Cache分析方法.Cache的分析可分为2部分：数据Cache和指令Cache，其中数据Cache中共享数据的分析为本文的主要工作，将在后文中详细介绍.最后，根据分析结果得出内存访问的分类以及访问的时间延迟，从而得出WCET的分析结果.

图 4 基于一致性协议的WCET分析框架

Figure 4. WCET analysis framework based on coherence protocol

下载: 全尺寸图片幻灯片

2.6 域内WCET分析

2.6.1 域内读访问

当内核发出读数据m的读请求时，如果m存在于L1中，其状态可能为E或者S并且未发生替换缺失，在L1中能够读命中，数据访问延迟为 ${H_{{\text{L}}1}}$ ，所有Cache块的状态不会发生改变．

如果数据m在域内，m的状态为E或者S但是发生替换缺失，此时数据访问延迟为读取L2中数据的时间 ${\psi '_{\text{R}}}$ ，同时需要将数据从L2 Cache加载到本地L1 Cache中，考虑连贯性要求，数据访问延迟为 $\max ({\psi '_{\text{R}}},{L_{{\text{L}}2 \to {\text{L}}1}})$ ，Cache状态不发生变化，如图5 (a)(b)所示.

图 5 域内读访问时间延迟及状态转换图

Figure 5. Read-visit latency and state transition diagram of intra-domain

下载: 全尺寸图片幻灯片

如果数据m在域内，m的状态为I，且副本m'的状态为E或者S时，此时数据访问延迟为读取L2中数据的时间 ${\psi '_{\text{R}}}$ ，同时需要将数据从L2 Cache加载到本地L1 Cache中，考虑连贯性要求，数据访问延迟为 $\max ({\psi '_{\text{R}}},{L_{{\text{L}}2 \to {\text{L}}1}})$ ，m的状态转换为S，副本m'的状态转换为S，如图5 (c)所示．

如果数据m不在域内，在L2 Cache中会出现访问缺失，假定访问L2 Cache的时间为 ${\psi '_{\text{R}}}$ ，对于 ${\psi '_{\text{R}}}$ 的取值我们将在2.7.1节中进行分析.此时m的状态由I转换为E，L2 Cache的状态需要根据L2的一致性协议进一步分析，如图5(d)所示．

根据以上分析，可以得出Cache域内读访问时间延迟和状态转换的状态更新函数：

$\begin{aligned}&{\psi }_{{{\rm{R}}}}(m,{m}^{\prime })=\\ &\left\{ {\begin{aligned} &{H}_{{{\rm{L}}}1},{m}_{{{\rm{s}}}}\mapsto {m}_{{{\rm{s}}}},{m}'_{{{\rm{s}}}}\mapsto {{m}'_{{{\rm{s}}}}}； \\ &\ \ \ m存在于\text{L}1\wedge \neg {{\rm{Replacement}}}.\\ &{{\rm{max}}}（{{\psi }'_{\text{R}}},{L}_{\text{L}2\to {{\rm{L}}}1}）,{m}_{{{\rm{s}}}}\mapsto {m}_{{{\rm{s}}}},{m}'_{{{\rm{s}}}}\mapsto {m}'_{{{\rm{s}}}}；\\ &\ \ \ m存在于\text{L}1\wedge {{\rm{Replacement}}}.\\ &{{\rm{max}}}({{\psi }'_{\text{R}}},{L}_{{{\rm{L}}}2\to {{\rm{L}}}1}),{m}_{{{\rm{s}}}}:{{\rm{I}}}\to {{\rm{S}}},{m}'_{{{\rm{s}}}}:{{\rm{E}}}/{{\rm{S}}}\to {{\rm{S}}}；\\ &\ \ \ m不存在于{{\rm{L}}}1\wedge m存在于{{\rm{L}}}2\wedge {m}^{\prime }存在于{{\rm{L}}}1.\\ &{{\psi } '_{\text{R}}},{m}_{{{\rm{s}}}}:{{\rm{I}}}\to {{\rm{E}}};m不在域内. \end{aligned}} \right.\end{aligned}$

(1)

2.6.2 域内写访问

当内核发出写m的请求时，如果m仅存在本地L1 Cache且状态为E时，此时会发生Cache写命中．首先需要将数据写入到L1中，由于使用写直达策略，需要同时写入到L2当中，因此数据访问时间为 ${H_{{\text{L}}1}} + {\psi '_{\text{W}}}$ ，Cache状态不发生改变，如图6(a)所示；如果m存在于L1 Cache中且状态为S时，此时会发生Cache写命中，数据访问延迟为 ${H_{{\text{L}}1}} + {\psi '_{\text{W}}}$ ，状态转换为E，由于采用写失效的一致性协议，因此需要将副本的状态修改为I，考虑连贯性要求，数据访问最终的延迟为 $\max ({H_{{\text{L}}1}} + {\psi '_{\text{W}}},Inv)$ ，如图6(b)所示．

图 6 域内写访问时间延迟及状态转换图

Figure 6. Write-visit latency and state transition diagram of intra-domain

下载: 全尺寸图片幻灯片

如果数据m位于本地L1 Cache中，m的状态为E或者S，并且发生替换缺失，此时会出现写缺失.如果m的状态为E，此时数据访问时间为 ${\psi '_{\text{W}}}$ ，并且状态转换为I，如图6(c)所示；如果m的状态为S，在进行L2 Cache写操作之后，需要对其他副本进行写失效，所以访问时间为 $\max ({\psi '_{\text{W}}},Inv)$ ，m和副本m'的状态都转换为I，如图6(d)所示．

如果数据m在域内, m的状态为I且副本m'的状态为E或者S时，此时会对L2进行写操作, 并将其他副本中的数据写失效, 考虑数据连贯性要求, 数据的写延迟为 $\max ({\psi '_{\text{W}}},Inv)$ , 由于采用非写分配的方式, 本地L1的状态不发生改变，其他副本的状态从E/S转换为I, 如图6(e)所示.

如果数据m不在域内，类比读访问的情况，假定写L2 Cache的访问时间为 ${\psi '_{\text{W}}}$ .由于域内采用非写分配法，所以此时m的状态不发生任何变化，如图6(f)所示．

根据以上分析，可以得出Cache域内写访问时间延迟和状态转换的状态更新函数：

$\begin{aligned}&{\psi }_{{{\rm{W}}}}(m,{m}^{\prime })=\\ &\left\{ {\begin{aligned} &{H}_{\text{L}1}+{\psi }'_{\text{W}},{m}_{\text{s}}\mapsto {m}_{\text{s}},{{m}}'_{\text{s}}\mapsto {{m}}'_{\text{s}}；\\ &\ \ \ m存在于\text{L}1\wedge {m}^{\prime }不存在于\text{L}1\wedge \neg \text{Replacement}.\\ &\mathrm{max}({H}_{\text{L}1}+{{\psi }}'_{\text{W}},Inv),{m}_{\text{s}}:\text{S}\to \text{E},{{m}}'_{\text{s}}:\text{S}\to \text{I}；\\ &\ \ \ m存在于\text{L}1\wedge {m}^{\prime }存在于\text{L}1\wedge \neg \text{Replacement}.\\ &{{\psi }}'_{\text{W}},{m}_{\text{s}}:\text{E}\to \text{I}；\\ &\ \ \ m存在于\text{L}1\wedge {m}^{\prime }不存在于\text{L}1\wedge \text{Replacement}.\\ &\mathrm{max}({{\psi }}'_{\text{W}},Inv),{m}_{\text{s}}:\text{S}\to \text{I},{{m}}'_{\text{s}}:\\ &\ \ \ m存在于\text{L}1\wedge {m}^{\prime }存在于\text{L}1\wedge \text{Replacement}.\\ &\mathrm{max}({{\psi }}'_{\text{W}},Inv),{m}_{\text{s}}\mapsto {m}_{\text{s}},{{m}}'_{\text{s}}:\text{E}/\text{S}\to \text{I}；\\ &\ \ \ m存在于\text{L}1\wedge {m}^{\prime }存在于\text{L}1.\\ &{{\psi }}'_{\text{W}},{m}_{\text{s}}\mapsto {m}_{\text{s}};m不在域内. \end{aligned}} \right.\end{aligned}$

(2)

2.7 跨域WCET分析

2.7.1 跨域读访问

当内核发出读数据m的读请求时，如果需要对L2 Cache进行读访问，此时将会由上级一致性协议管理共享数据．

如果数据m存在于域内，并且位于本地L2 Cache中，对应域内读访问的图5(a)~(c)，此时数据访问时间为 ${H_{{\text{L}}2}}$ ，域间所有的Cache的状态都不发生变化，如图7(a)所示．

图 7 跨域读访问时间延迟及状态转换图

Figure 7. Read-visit latency and state transition diagram of cross-domain

下载: 全尺寸图片幻灯片

当被访问数据m存在于L2 Cache中并且发生替换缺失，对应域内读访问的图5(d)，此时数据访问延迟为 ${H_{{\text{L}}3}}$ ，同时需要将数据加载到L1 Cache中，考虑到连贯性要求，数据访问的延时为 $\mathrm{max}（{H}_{\text{L}3},{L}_{\text{L}3\to \text{L}1}）$ ，如果m的状态为M或者E，则状态会转换为E，如果m的状态为S，则状态不发生改变，副本m'的状态不发生改变，如图7(b)(c)所示．

如果被访问数据m不存在于域内，对应域内读访问的图5(d)，存在2种可能性：1）数据m本身不存在Cache中；2）数据m存在于其他域的Cache中.如果数据m不存在Cache中，此时需要通过内存来读取数据m，数据访问延迟为 $H_\text{memory}$ ，同时需要将数据从内存中写回到L1 Cache中，考虑连贯性要求，数据访问延迟为 $\max ({H_\text{memory}},{L_{m \to {\text{L}}1}})$ ，L2的状态从I转换为E，如图7(d)所示．

如果数据存在于其他域中，即存在数据副本m'，如果其状态是M，首先要将数据从其他域的L2 Cache中加载到本地域的L2中，然后再加载到L1中，最后从L1中读取数据，其访问延迟为 ${L_{{\text{L}}2 \to {\text{L}}2}} + {L_{{\text{L}}2 \to {\text{L}}1}} + {H_{{\text{L}}1}}$ ，此时L2中数据m的状态从I转换为S，其他域L2中数据副本m'状态从M转换为S，如图7(e)所示.

如果数据副本m'的状态为E或者S，将会在L3中读取命中数据，此时数据访问延迟为 ${H_{{\text{L}}3}}$ ，与此同时，数据将会加载到L1中，考虑连贯性要求，此时数据访问延迟应取 $\max ({H_{{\text{L}}3}},{L_{{\text{L}}3 \to {\text{L}}1}})$ ，此时L2中数据m的状态从I转换为S，其他域L2中数据副本m'状态换为S，如图7(f)所示．

根据以上分析，可以得出Cache跨域读访问时间和状态转换的状态更新函数：

$\begin{aligned}&{{\psi }}_{\text{R}}^{\prime }(m,{m}^{\prime })=\\ &\left\{ {\begin{aligned} &{H}_{\text{L}2},{m}_{\text{s}}\mapsto {m}_{\text{s}},{{m}}'_{\text{s}}\mapsto {{m}}'_{\text{s}}； \\ &\ \ \ m存在于\text{L}2\wedge \neg \text{Replacement}.\\ &\mathrm{max}（{H}_{\text{L}3},{L}_{\text{L}3\to \text{L}1}）,{m}_{\text{s}}:\text{M}/\text{E}\to \text{E}；\\ &\ \ \ m存在于\text{L}2\wedge {m}^{\prime }不存在于\text{L}2\wedge \text{Replacement}.\\ &\mathrm{max}（{H}_{\text{L}3},{L}_{\text{L}3\to \text{L}1}）,{m}_{\text{s}}\mapsto {m}_{\text{s}},{{m}}'_{\text{s}}\mapsto {{m}}'_{\text{s}}；\\ &\ \ \ m存在于\text{L}2\wedge {m}^{\prime }存在于\text{L}2\wedge \text{Replacement}.\\ &\mathrm{max}({H}_\text{memory},{L}_{m\to \text{L}1}),{m}_{\text{s}}:\text{I}\to \text{E}；\\ &\ \ \ m不存在于\text{Cache}中.\\ &{L}_{\text{L}2\to \text{L}2}+{L}_{\text{L}2\to \text{L}1}+{H}_{\text{L}1},{m}_{\text{s}}:\text{I}\to \text{S},{{m}}'_{\text{s}}:\text{M}\to \text{S}；\\ &\ \ \ m不存在于\text{L}2\wedge {m}^{\prime }存在于\text{L}2\wedge {m}^{\prime }不存在于\text{L}3.\\ &\mathrm{max}({H}_{\text{L}3},{L}_{\text{L}3\to \text{L}1}),{m}_{\text{s}}:\text{I}\to \text{S},{{m}}'_{\text{s}}:\text{E}/\text{S}\to \text{S}；\\ &\ \ \ m不存在于\text{L}2\wedge {m}^{\prime }存在于\text{L}2\wedge m存在于\text{L}3. \end{aligned}} \right.\end{aligned}$

(3)

2.7.2 跨域写访问

当内核发出写数据m的写请求时，如果需要对L2 Cache进行写访问，此时将由上级一致性协议管理共享数据．

如果数据m在域内（对应域内写访问图6(a)~(e)的情况），并且位于本地L2 Cache中，状态为S，需要将其他副本中的数据写失效，考虑连贯性要求，数据访问时间为 $\max ({H_{{\text{L}}2}},Inv')$ ，状态转换为M，副本的状态由S转换为I，如图8(a)所示；如果m的状态为M或者E，此时数据访问时间为 ${H_{{\text{L}}2}}$ ，状态转换为M，其他Cache状态不变，如图8(b)所示．

图 8 跨域写访问时间延迟及状态转换图

Figure 8. Write latency and state diagram of cross-domain

下载: 全尺寸图片幻灯片

如果被访问数据m不存在于域内（对应域内写访问图6(f)的情况）, 存在2种可能性：1）数据m本身不存在Cache中；2）数据m存在于其他域的Cache中.如果数据m不存在Cache中, 由于域间采用写分配，首先要将数据从内存加载到L2 Cache中, 然后在L2 Cache中修改数据, 此时数据访问延迟为 ${H_{{\text{L}}2}} + {L_{m \to {\text{L}}2}}$ ，L2 Cache的状态从I转换为M, 如图8(c)所示.

如果数据存在于其他域内，数据m的状态为I，如果数据副本m'的状态为M，首先要将数据从其他域加载到本地L2中，然后修改本地L2 Cache中的数据，状态转换为M，并且要将其他副本写失效，考虑连贯性要求，数据访问延迟为 $\max ({H_{{\text{L}}2}} + {L_{{\text{L}}2 \to {\text{L}}2}},Inv')$ ，其他副本的状态转换为I，如图8(d)所示；如果副本的状态为E或者S，此时需要将数据从L3 Cache加载到本地L2 Cache中，状态转换为M，并且要将其他副本写失效，考虑连贯性要求，数据访问延迟为 $\max ({H_{{\text{L}}2}} + {L_{{\text{L}}3 \to {\text{L}}2}},Inv')$ ，副本的状态准换为I，如图8(e)所示.

如果m为M, E, S状态并且发生替换缺失时（对应域内写访问图6(f)的情况）, 此时会发生Cache写缺失.首先将数据从L3 Cache加载到L2 Cache, 然后执行写操作.对于m的状态为M和E的情况, 此时数据访问延迟为 ${H_{{\text{L}}2}} + {L_{{\text{L}}3 \to {\text{L}}2}}$ , m的状态转换为M, 如图8(f)所示；如果m的状态为S, 需要将其他副本中的数据写失效, 考虑连贯性要求, 数据访问延迟为 $\max ({H_{{\text{L}}2}} + {L_{{\text{L}}3 \to {\text{L}}2}},Inv')$ , m的状态由S变为I, 如图8(g)所示.

根据以上分析，可以得出Cache跨域写访问时间和状态转换的状态更新函数：

$\begin{aligned} &{{\psi }}'_{\text{W}}(m,{m}')=\\ &\left\{ {\begin{aligned} &\mathrm{max}({H}_{\text{L}2},In{v}'),{m}_{\text{s}}:\text{S}\to \text{M},{{m}}'_{\text{s}}:\text{S}\to \text{I}； \\ & \ \ \ m{存在于}\text{L}2\wedge {m}'{存在于}\text{L}2\wedge \neg \text{Replacement}.\\ &{H}_{\text{L}2},{m}_\text{s}:\text{M}/\text{E}\to \text{M},{m}'_{\text{s}}\mapsto {m}'_{\text{s}}；\\ &\ \ \ m{存在于}\text{L}2\wedge {m}'{不存在于}\text{L}2\wedge \neg \text{Replacement}.\\ &{H}_{\text{L}2}+{L}_{m\to \text{L}2},{m}_{\text{s}}:\text{I}\to \text{M}；\\ &\ \ \ m{不存在}\text{Cache}{中}.\\ &\mathrm{max}({H}_{\text{L}2}+{L}_{\text{L}2\to \text{L}2},In{v}'),{m}_{\text{s}}:\text{I}\to \text{M},{m}'_{\text{s}}:\text{M}\to \text{I}；\\ &\ \ \ m{不存在于}\text{L}2\wedge {m}'{存在于}\text{L}2\wedge m{不存在于}\text{L}3.\\ &\mathrm{max}({H}_{\text{L}2}+{L}_{\text{L}3\to \text{L}2},In{v}'),{m}_{\text{s}}:\text{I}\to \text{M},{m}'_{\text{s}}:\text{E}/\text{S}\to \text{I}；\\ &\ \ \ m{不存在于}\text{L}2\wedge {m}'{存在于}\text{L}2\wedge m{存在于}\text{L}3.\\ &{H}_{\text{L}2}+{L}_{\text{L}3\to \text{L}2},{m}_{\text{s}}:\text{M}/\text{E}\to \text{M}；\\ &\ \ \ m{存在于}\text{L}2\wedge {m}'{不存在于}\text{L}2\wedge \text{Replacement}.\\ &\mathrm{max}({H}_{\text{L}2}+{L}_{\text{L}3\to \text{L}2},In{v}''),{m}_{\text{s}}:\text{S}\to \text{M},{m}'_{\text{s}}:\text{S}\to \text{I}；\\ &\ \ \ m{存在于}\text{L}2\wedge {m}'{存在于}\text{L}2\wedge \text{Replacement}. \end{aligned}} \right.\end{aligned}$

(4)

3. 实验评估

3.1 实验参数设置

本文在实验室原有多核处理器WCET分析工具的基础上，扩展其中数据Cache的分析方法，增加了对MESI一致性协议的支持，设计并实现了一个支持Cache一致性协议的多核处理器WCET分析工具Roban.通过该工具对多核处理器进行WCET分析，并将分析结果与GEM5仿真工具模拟执行得到的时间进行对比，从而验证分析方法的有效性．

实验针对4核处理器展开分析，每2个内核组成1簇，即2个1级一致性域和1个2级一致性域，处理器存储结构参照图2，关键参数配置如表1所示.从Mälardalen大学WCET研究小组测试用例集中选取典型测试用例进行测试．

表 1 多核处理器参数配置

Table 1. Parameters Configuration of Multi-Core Processor

Cache 层次结构	数据访问延时/cycle			一致性协议	替换策略
Cache 层次结构	L1	L2	L3	一致性协议	替换策略
4 核, 2 簇	4	14	42	MESI	LRU
注：LRU为最近最少使用（least recently used）替换策略.

下载: 导出CSV

| 显示表格

3.2 实验结果与分析

实验一共分为2组，第1组Cache相联度为4路组相联，第2组Cache相联度为8路组相联.分别调整L1，L2，L3 Cache的容量，得出在不同Cache配置情况下的WCET值，结果如图9所示.

图 9 GEM5仿真结果与分析结果对比

Figure 9. Comparison of GEM5 simulation results and analysis results

下载: 全尺寸图片幻灯片

从图9可以看出, Roban工具得到的WCET分析结果均大于GEM5仿真得到的结果,证明了本文方法的安全性.

为了验证本文方法的有效性，采用Spearman相关性分析法对图9中Roban分析结果与GEM5仿真结果进行相关性分析，得出的相关性系数矩阵如图10所示，其中横坐标为Roban分析工具对应的测试结果，纵坐标表示GEM5仿真结果.从图10中可以看出，带有数字标签的矩阵元素为相同测试用例在不同Cache参数配置情况下得到WCET结果的相关性系数，相关性系数不小于0.98，说明Roban和GEM5两种工具得出的结果显著相关，同时表明了图9中的曲线变化趋势基本一致．

图 10 GEM5与Roban相关性系数矩阵图

Figure 10. Correlation coefficient matrix diagram of GEM5 and Roban

下载: 全尺寸图片幻灯片

为了验证本文提出分析方法的精确性，采用过估计率这一指标对分析结果进行评估，过估计率 $\mathcal{R}$ 计算方式为

$\mathcal{R} = \frac{{T_{{\text{WCET}}}^{{\text{estimation}}}}}{{T_{{\text{WCET}}}^{{\text{simulation}}}}} ,$

(5)

其中， ${{T_{{\text{WCET}}}^{{\text{estimation}}}}}$ 和 ${{T_{{\text{WCET}}}^{{\text{simulation}}}}}$ 分别表示WCET的估计结果和仿真结果.

过估计率的统计结果如图11所示.由图11可知，本文方法分析得出的过估计率最大值为1.476，最小值为1.158，平均值为1.30，对比瑞典皇家理工学院^[25]设计实现的WCET分析工具KTA（KTH’s timing analysis），其平均过估计率为2.08，本文方法的过估计率降低了0.78．

图 11 过估计率统计结果

Figure 11. Statistical results of over estimation rate

下载: 全尺寸图片幻灯片

4. 结　　论

本文通过分析多级一致性协议体系架构下的Cache状态转换和内存访问延迟，得出多级一致性域WCET分析方法.实验表明，本文方法能够精确估算出多核处理器任务WCET，在改变Cache配置参数的情况下，GEM5仿真结果与本文工具Roban分析结果相关性系数不低于0.98，表明本文方法分析结果的变化趋势与GEM5仿真结果一致；通过与当前分析方法进行对比，证明本文方法相比现有方法的过估计率降低了0.78．

本文在进行Cache行为分析时，将总线假设为理想状态，而在实际的一致性协议中，如果存在大量的数据交互，将会导致总线发生阻塞，在以后的工作中，应将Cache之间共享总线纳入到分析范围中，进一步提高WCET分析结果的准确性.另外本文在考虑一致性协议时仅考虑了MESI协议，而在实际工程领域存在多种一致性协议，如Intel i7使用的MESIF协议、AMD Opteron使用的MOESI协议等，在后续的工作中将针对不同的一致性协议展开分析．

作者贡献声明：朱怡安提出研究思路和指导意见；史先琛提出了算法并撰写论文；姚烨、李联负责实现算法并设计实验方案；任鹏远、董威振、李佳钰参与实验方案设计并整理实验数据．

图 1 边缘智能联邦学习架构

Figure 1. Edge intelligent federated learning architecture

下载: 全尺寸图片幻灯片

图 2 FedCS协议概述

Figure 2. FedCS protocol overview

下载: 全尺寸图片幻灯片

图 3 模型分割迁移框架

Figure 3. Model segmentation migration framework

下载: 全尺寸图片幻灯片

图 4 自适应模型聚合与固定频率聚合的比较

Figure 4. Comparison of adaptive model aggregation and fixed frequency aggregation

下载: 全尺寸图片幻灯片

图 5 智能交通

Figure 5. Intelligent transportation

下载: 全尺寸图片幻灯片

图 6 通过空中计算并利用空间自由度进行参数聚合^[125]

Figure 6. The parameters are aggregated by air calculation and spatial freedom^[125]

下载: 全尺寸图片幻灯片

表 1 现有联邦学习综述研究对比

Table 1 Comparison of Studies on Existing Federated Learning Reviews

研究工作	资源优化	激励机制	算法设计	从无线网络角度优化	无线应用	说明
文献[11]	√	√	√	×	×	考虑边缘网络的FL
文献[12]	×	×	√	×	×
文献[13]	×	×	√	×	×
文献[14]	×	×	×	×	×	主要考虑用于缓存和计算卸载的FL
文献[15]	×	×	√	×	×
本文	√	√	√	√	√
注：“√”表示文献中完成了该项工作，“×”表示文献中未完成该项工作.

下载: 导出CSV

表 2 联邦学习客户端选择方案比较

Table 2 Comparison of Federated Learning Client Selection Schemes

方案类型	方案思路	客户端目标	服务器端目标
计算与通信资源优化	剔除不必要的模型更新^[17]、客户端分层^[18]、控制学习节奏^[19]、基于聚类实现自组织学习^[20]、长期能耗约束下的带宽分配^[21]、设置学习期限^[25]、基于设备的计算能力进行选择^[26]
激励机制	契约理论^[28]：基于信誉进行激励反馈鼓励可靠的终端设备参与学习	奖励和能耗的平衡	最大化由全局迭代时间与补偿给客户端的报酬之间的差异所获得的利润
	Stackelberg博弈^[31]：实现高质量无线通信效率的全局模型	奖励（即准确率等级）和成本（即通信和计算成本）的平衡	获取不同准确率的凹函数
	拍卖理论^[32-33]：最大限度地降低客户端投标的成本	奖励和成本的平衡	最小化投标成本
	修订目标函数权重^[30]	为了引入潜在的公平性并降低训练精度方差，通过在q-FedAvg中分配更高的相对权重来强调具有高经验损失的本地设备

下载: 导出CSV

表 3 模型压缩技术总结

Table 3 Summary of Model Compression Techniques

方法	优化手段	优缺点
结构化和草图更新机制^[48]	压缩传输模型，提升客户端到服务器的通信效率	客户端到服务器参数压缩；代价是复杂的模型结构可能出现收敛问题
服务端-客户端更新^[49]	压缩传输模型，提升服务器到客户端的通信效率	服务器到客户端参数压缩；代价是准确性降低，可能有收敛问题
草图^[50]	使用计数草图压缩模型更新，然后利用草图的可合并性来组合来自客户端的模型更新	解决了客户端参与稀少而导致的收敛问题，建立在假设网络；已经尽了最大努力使通信效率最大化；可能遇到网络瓶颈
Adam^[1]	通过使用Adam优化和压缩方案改进了FedAvg算法	Adam优化加快了收敛速度，压缩方案降低了通信开销
模型蒸馏^[51-52]	交换模型输出模型状态信息，即其有效载荷大小仅取决于输出维度的标签数量；然后使用联邦蒸馏实现权重更新规则	解决了数据独立同分布的问题；代价是无线信道对模型训练精度的影响

下载: 导出CSV

表 4 模型训练优化方法及特点

Table 4 Optimization Methods and Characteristics of Model Training

优化方法	特点	方法来源
数据卸载	利用边缘计算服务器的强大算力加快模型训练	文献[36−38]
模型分割迁移	分割模型和隐私保护技术	文献[42−44]
模型压缩	采用不同压缩粒度对模型输出值或者中间值梯度进行压缩	文献[48−52]

下载: 导出CSV

表 5 主要联邦学习模型聚合技术的比较总结

Table 5 A Comparative Summary of Major Federated Learning Mode Aggregation Technologies

聚合技术	优化角度	主要思想	特点
FedAvg^[7]	统计异构性	客户端对其本地数据执行多个批处理更新，并与服务器传输更新的权重，而不是梯度.	从统计的角度看，FedAvg已被证明设备间数据分布不一致的情况下开始发散；从系统的角度看，FedAvg不允许参与学习的设备根据其底层系统限制执行可变数量的本地更新.
FedProx^[55]	统计异构性	在每个客户端上的本地训练子问题中添加一项，以限制每个本地模型更新对全局模型的影响.	FedProx的提出是为了提高统计异质性数据的收敛性. 与FedAvg类似，在FedProx中，所有设备在全局聚合阶段的权重相等，因为没有考虑设备功能（例如硬件、电量）的差异.
FedPAQ^[53]	通信	在与服务器共享更新之前，允许客户端在模型上执行多个本地更新.	与FedAvg类似，FedPAQ中的新全局模型为局部模型的平均值，但这在强凸和非凸设置中都需要很高的复杂性.
FedMA^[54]	统计异构性	在执行聚合前考虑神经元的排列不变性，并允许全局模型大小自适应.	使用贝叶斯非参数机制根据数据分布的异构性调整中心模型的大小；FedMA中的贝叶斯非参数机制容易受到模型中毒攻击，在这种情况下，对手可以很容易地欺骗系统扩展全局模型，以适应任何中毒的本地模型.
Turbo-Aggregate^[62]	通信和安全	一种多组策略，其中客户端被分成几个组，模型更新以循环方式在组之间共享和一种保护用户隐私数据的附加秘密共享机制.	Turbo-Aggregate非常适合无线拓扑，在这种拓扑中，网络条件和用户可用性可能会快速变化. Turbo-Aggregate中嵌入的安全聚合机制虽然能有效处理用户流失，但无法适应加入网络的新用户. 因此，通过重新配置系统规范（即多组结构和编码设置）以确保满足弹性和隐私保证，开发一种可自我配置的协议来扩展它的.
自适应聚合^[63]	通信和统计异构性	在给定的资源预算下确定局部更新和全局参数聚合之间的最佳折中的自适应控制算法.	改变了全局聚合频率，以确保期望的模型性能，同时确保在FL训练过程中有效利用可用资源，例如能量，可用于边缘计算中的FL. 自适应聚合方案的收敛性保证目前只考虑凸损失函数.
HierFAVG^[65]	通信	一种分层的客户端—边缘—云聚合体系结构，边缘服务器聚合其客户端的模型更新，然后将它们发送到云服务器进行全局聚合.	这种多层结构能够在现有的客户端—云架构上实现更高效的模型交换. HierFAVG仍然容易出现掉队和终端设备掉线的问题.
自适应任务分配^[66]	设备异构性、通信、计算	在保证异构信道上的数据分发/聚合总次数和异构设备上的本地计算，在延迟约束下最大化学习精度.	自适应任务分配方案，该方案将最大化分布式学习者的本地学习迭代次数（从而提高学习精度），同时遵守时间限制. 该方案没考虑动态参数，如变化的信道状态和数据到达时间.
公平聚合^[67]	设备异构性、任务异构性、通信、计算	一种具有自适应学习率的定制学习算法，以适应不同的精度要求，并加快本地训练过程. 为边缘服务器提出了一个公平的全局聚合策略，以最小化异构终端设备之间的精度差异.	一种学习率自适应的CuFL算法，以最小化总学习时间. 考虑到终端设备的任务异质性，CuFL允许终端设备在满足其独特的精度要求后提前退出训练. 该方案没考虑动态参数，如变化的信道状态和数据到达时间.

下载: 导出CSV

表 6 边缘网络下基于联邦学习的无人机应用

Table 6 Unmanned Aerial Vehicle Application Based on Federated Learning in Edge Network

挑战	联邦学习				结果
挑战	客户端	服务器	数据特征	本地和全局模型	结果
边缘内容缓存^[95-96]	UAVs	边缘服务器边缘服务器	内容特征（新鲜度、位置、占用内存、内容请求历史等）	内容受欢迎度预测	有效地确定哪些内容应该存储在每个缓存中
无人机作为基站^[93]	地面用户	边缘服务器边缘服务器	关于地面用户可移动性的信息（位置、方向、速度等）	地面用户模式（移动性和内容负荷）的预测	优化无人机基站部署、提高网络覆盖和连通性、有效提供热门内容.
无人机轨迹规划^[92]	UAVs	边缘服务器或云	源、目的点位置、无人机机动性信息（速度、方向、位置、高度等）、无人机能量消耗、物理障碍、服务需求等.	每条潜在路径的性能预测	无人机选择最优轨迹、优化服务性能、优化无人机能耗

下载: 导出CSV

参考文献(133)

[1]	Mills J, Hu Jia, Min Geyong. Communication-efficient federated learning for wireless edge intelligence in IoT[J]. IEEE Internet of Things Journal, 2019, 7(7): 5986−5994
[2]	Covington P, Adams J, Sargin E. Deep neural networks for YouTube recommendations[C] //Proc of the 10th ACM Conf on Recommender Systems. New York: ACM, 2016: 191−198
[3]	Parkhi O M, Vedaldi A, Zisserman A. Deep face recognition[C] //Proc of the 15th IEEE Int Conf on Computer Vision Workshop. Piscataway, NJ: IEEE, 2015: 258−266
[4]	Mowla N I, Tran N H, Doh I, et al. Federated learning-based cognitive detection of jamming attack in flying ad-hoc network[J]. IEEE Access, 2020, 8: 4338−4350 doi: 10.1109/ACCESS.2019.2962873
[5]	Brik B, Ksentini A, Bouaziz M. Federated learning for UAVs-enabled wireless networks: Use cases, challenges, and open problems[J]. IEEE Access, 2020, 8: 53841−53849 doi: 10.1109/ACCESS.2020.2981430
[6]	Abbas N, Zhang Yan, Taherkordi A, et al. Mobile edge computing: A survey[J]. IEEE Internet of Things Journal, 2017, 5(1): 450−465
[7]	Mcmahan B, Moore E, Ramage D, et al. Communication-efficient learning of deep networks from decentralized data[C] //Proc of the 20th Int Conf on Artificial Intelligence and Statistics. New York: PMLR, 2017 : 1273−1282.
[8]	Yang Qiang, Liu Yang, Chen Tianjian, et al. Federated machine learning: Concept and applications[J]. ACM Transactions on Intelligent Systems and Technology, 2019, 10(2): 1−19
[9]	Zhou Zhi, Yang Song, Pu Lingjun, et al. CEFL: Online admission control, data scheduling, and accuracy tuning for cost-efficient federated learning across edge nodes[J]. IEEE Internet of Things Journal, 2020, 7(10): 9341−9356 doi: 10.1109/JIOT.2020.2984332
[10]	Ruder S. An overview of gradient descent optimization algorithms[J]. arXiv preprint, arXiv: 1609.04747, 2016
[11]	Lim W Y B, Luong N C, Hoang D T, et al. Federated learning in mobile edge networks: A comprehensive survey[J]. IEEE Communications Surveys & Tutorials, 2020, 22(3): 2031−2063
[12]	Li Tian, Sahu A K, Talwalkar A, et al. Federated learning: Challenges, methods, and future directions[J]. IEEE Signal Processing Magazine, 2020, 37(3): 50−60 doi: 10.1109/MSP.2020.2975749
[13]	Li Qinbin, Wen Zeyi, Wu Zhaomin, et al. A survey on federated learning systems: Vision, hype and reality for data privacy and protection[J]. arXiv preprint, arXiv: 1907.09693, 2019
[14]	Wang Xiaofei, Han Yiwen, Wang Chenyang, et al. In-edge AI: Intelligentizing mobile edge computing, caching and communication by federated learning[J]. IEEE Network, 2019, 33(5): 156−165 doi: 10.1109/MNET.2019.1800286
[15]	Kairouz P, Mcmahan H B, Avent B, et al. Advances and open problems in federated learning[J]. arXiv preprint, arXiv: 1912.04977, 2019
[16]	王艳,李念爽,王希龄,等. 编码技术改进大规模分布式机器学习性能综述[J]. 计算机研究与发展,2020,57(3):542−561 doi: 10.7544/issn1000-1239.2020.20190286 Wang Yan, Li Nianshuang, Wang Xiling, et al. Coding-based performance improvement of distributed machine learning in large-scale clusters[J]. Journal of Computer Research and Development, 2020, 57(3): 542−561 (in Chinese) doi: 10.7544/issn1000-1239.2020.20190286
[17]	Jin Yibo, Jiao Lei, Qian Zhuzhong, et al. Resource-efficient and convergence-preserving online participant selection in federated learning[C] //Proc of the 40th IEEE Int Conf on Distributed Computing Systems (ICDCS). Piscataway, NJ: IEEE, 2020: 606−616
[18]	Chai Z, Ali A, Zawad S, et al. TiFL: A tier-based federated learning system[C] //Proc of the 29th Int Symp on High-Performance Parallel and Distributed Computing. New York: ACM, 2020: 125−136
[19]	Li Li, Xiong Haoyi, Guo Zhishan, et al. SmartPC: Hierarchical pace control in real-time federated learning system[C] //Proc of the 40th IEEE Real-Time Systems Symp (RTSS). Piscataway, NJ: IEEE, 2019: 406−418
[20]	Khan L U, Alsenwi M, Han Zhu, et al. Self organizing federated learning over wireless networks: A socially aware clustering approach[C] //Proc of the 34th Int Conf on Information Networking (ICOIN). Piscataway, NJ: IEEE, 2020: 453−458
[21]	Xu Jie, Wang Heqiang. Client selection and bandwidth allocation in wireless federated learning networks: A long-term perspective[J]. IEEE Transactions on Wireless Communications, 2020, 20(2): 1188−1200
[22]	Damaskinos G, Guerraoui R, Kermarrec A M, et al. Fleet: Online federated learning via staleness awareness and performance prediction[C] //Proc of the 21st Int Middleware Conf. New York: ACM, 2020: 163−177
[23]	Sprague M R, Jalalirad A, Scavuzzo M, et al. Asynchronous federated learning for geospatial applications[C] //Proc of the Joint European Conf on Machine Learning and Knowledge Discovery in Databases. Cham, Switzerland: Springer, 2018: 21−28
[24]	Wu Wentai, He Ligang, Lin Weiwei, et al. Safa: A semi-asynchronous protocol for fast federated learning with low overhead[J]. IEEE Transactions on Computers, 2020, 70(5): 655−668
[25]	Nishio T, Yonetani R. Client selection for federated learning with heterogeneous resources in mobile edge[C/OL] //Proc of the 53rd IEEE Int Conf on Communications. Piscataway, NJ: IEEE, 2019[2022-09-05].https://ieeexplore.ieee.org/document/8761315
[26]	Yoshida N, Nishio T, Morikura M, et al. Hybrid-FL for wireless networks: Cooperative learning mechanism using non-IID data[C/OL] //Proc of the 54th IEEE Int Conf on Communications (ICC). Piscataway, NJ: IEEE, 2020[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9149323
[27]	Khan L U, Pandey S R, Tran N H, et al. Federated learning for edge networks: Resource optimization and incentive mechanism[J]. IEEE Communications Magazine, 2020, 58(10): 88−93 doi: 10.1109/MCOM.001.1900649
[28]	Kang Jiawen, Xiong Zehui, Niyato D, et al. Incentive mechanism for reliable federated learning: A joint optimization approach to combining reputation and contract theory[J]. IEEE Internet of Things Journal, 2019, 6(6): 10700−10714 doi: 10.1109/JIOT.2019.2940820
[29]	Kim H, Park J, Bennis M, et al. Blockchained on-device federated learning[J]. IEEE Communications Letters, 2019, 24(6): 1279−1283
[30]	Li Tian, Sanjabi M, Beirami A, et al. Fair resource allocation in federated learning[J]. arXiv preprint, arXiv: 1905.10497, 2020
[31]	Pandey S R, Tran N H, Bennis M, et al. A crowdsourcing framework for on-device federated learning[J]. IEEE Transactions on Wireless Communications, 2020, 19(5): 3241−3256 doi: 10.1109/TWC.2020.2971981
[32]	Le T H T, Tran N H, Tun Y K, et al. Auction based incentive design for efficient federated learning in cellular wireless networks[C/OL] //Proc of the IEEE Wireless Communications and Networking Conf (WCNC). Piscataway, NJ: IEEE, 2020[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9120773
[33]	Jiao Yutao, Wang Ping, Niyato D, et al. Toward an automated auction framework for wireless federated learning services market[J]. IEEE Transactions on mobile Computing, 2020, 20(10): 3034−3048
[34]	Gao Xiaozheng, Wang Ping, Niyato D, et al. Auction-based time scheduling for backscatter-aided RF-powered cognitive radio networks[J]. IEEE Transactions on Wireless Communications, 2019, 18(3): 1684−1697 doi: 10.1109/TWC.2019.2895340
[35]	Ko BongJun, Wang Shiqiang, He Ting, et al. On data summarization for machine learning in multi-organization federations[C] //Proc of the 7th IEEE Int Conf on Smart Computing (SMARTCOMP). Piscataway, NJ: IEEE, 2019: 63−68
[36]	Valerio L, Passarella A, Conti M. Optimal trade-off between accuracy and network cost of distributed learning in mobile edge Computing: An analytical approach[C/OL] //Proc of the 18th Int Symp on a World of Wireless, Mobile and Multimedia Networks (WoWMoM). Piscataway, NJ: IEEE, 2017[2022-09-05].https://ieeexplore.ieee.org/abstract/document/7974310
[37]	Skatchkovsky N, Simeone O. Optimizing pipelined computation and communication for latency-constrained edge learning[J]. IEEE Communications Letters, 2019, 23(9): 1542−1546 doi: 10.1109/LCOMM.2019.2922658
[38]	Huang Yutao, Zhu Yifei, Fan Xiaoyi, et al. Task scheduling with optimized transmission time in collaborative cloud-edge learning[C/OL] //Proc of the 27th Int Conf on Computer Communication and Networks (ICCCN). Piscataway, NJ: IEEE, 2018[2022-09-05].https://ieeexplore.ieee.org/abstract/document/8487352
[39]	Dey S, Mukherjee A, Pal A, et al. Partitioning of CNN models for execution on fog devices[C] //Proc of the 1st ACM Int Workshop on Smart Cities and Fog Computing. New York: ACM, 2018: 19−24
[40]	Zhang Shigeng, Li Yinggang, Liu Xuan, et al. Towards real-time cooperative deep inference over the cloud and edge end devices[J]. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2020, 4(2): 1−24
[41]	Dey S, Mukherjee A, Pal A. Embedded deep inference in practice: Case for model partitioning[C] //Proc of the 1st Workshop on Machine Learning on Edge in Sensor Systems. New York: ACM, 2019: 25−30
[42]	Lin Bing, Huang Yinhao, Zhang Jianshan, et al. Cost-driven off-loading for DNN-based applications over cloud, edge, and end devices[J]. IEEE Transactions on Industrial Informatics, 2019, 16(8): 5456−5466
[43]	Wang Lingdong, Xiang Liyao, Xu Jiayu, et al. Context-aware deep model compression for edge cloud computing[C] //Proc of the 40th Int Conf on Distributed Computing Systems (ICDCS). Piscataway, NJ: IEEE, 2020: 787−797
[44]	Wang Ji, Zhang Jianguo, Bao Weidong, et al. Not just privacy: Improving performance of private deep learning in mobile cloud[C] //Proc of the 24th ACM SIGKDD Int Conf on Knowledge Discovery & Data Mining. New York: ACM, 2018: 2407−2416
[45]	Zhang Jiale, Wang Junyu, Zhao Yanchao, et al. An efficient federated learning scheme with differential privacy in mobile edge computing[C] //Proc of the Int Conf on Machine Learning and Intelligent Communications. Berlin: Springer, 2019: 538−550
[46]	Ivkin N, Rothchild D, Ullah E, et al. Communication-efficient distributed SGD with sketching[J]. Advances in Neural Information Processing Systems, 2019, 32: 13144−13154
[47]	Zhang Boyu, Davoodi A, Hu Yuhen. Exploring energy and accuracy tradeoff in structure simplification of trained deep neural networks[J]. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2018, 8(4): 836−84 doi: 10.1109/JETCAS.2018.2833383
[48]	Konen J, Mcmahan H B, Yu F X, et al. Federated learning: Strategies for improving communication efficiency[J]. arXiv preprint, arXiv: 1610.05492, 2016
[49]	Caldas S, Konečny J, Mcmahan H B, et al. Expanding the reach of federated learning by reducing client resource requirements[J]. arXiv preprint, arXiv: 1812.07210, 2018
[50]	Rothchild D, Panda A, Ullah E, et al. FetchSGD: Communication-efficient federated learning with sketching[C] //Proc of the 37th Int Conf on Machine Learning. New York: PMLR, 2020: 8253−8265
[51]	Jeong E, Oh S, Kim H, et al. Communication-efficient on-device machine learning: Federated distillation and augmentation under non-IID private data[J]. arXiv preprint, arXiv: 1811.11479, 2018
[52]	Ahn J H, Simeone O, Kang J. Wireless federated distillation for distributed edge learning with heterogeneous data[C/OL] //Proc of the 30th Annual Int Symp on Personal, Indoor and Mobile Radio Communications (PIMRC). Piscataway, NJ: IEEE, 2019[2022-09-05]. https://ieeexplore.ieee.org/abstract/document/8904164
[53]	Reisizadeh A, Mokhtari A, Hassani H, et al. FedPAQ: A communication-efficient federated learning method with periodic averaging and quantization[C] //Proc of the 23rd Int Conf on Artificial Intelligence and Statistics. New York: PMLR, 2020: 2021−2031
[54]	Karimireddy S P, Kale S, Mohri M, et al. SCAFFOLD: Stochastic controlled averaging for federated learning[C] //Proc of the 37th Int Conf on Machine Learning. New York: PMLR, 2020: 5132−5143
[55]	Li Tian, Sahu A K, Zaheer M, et al. Federated optimization in heterogeneous networks[J]. Proceedings of Machine Learning and Systems, 2020, 2: 429−450
[56]	Wang Hongyi, Yurochkin M, Sun Yuekai, et al. Federated learning with matched averaging[J]. arXiv preprint, arXiv: 2002.06440, 2020
[57]	Pillutla K, Kakade S M, Harchaoui Z. Robust aggregation for federated learning[J]. IEEE Transactions on Signal Processing, 2022, 70: 1142−1154 doi: 10.1109/TSP.2022.3153135
[58]	Grama M, Musat M, Muñoz-González L, et al. Robust aggregation for adaptive privacy preserving federated learning in healthcare[J]. arXiv preprint, arXiv: 2009.08294, 2020
[59]	Ang Fan, Chen Li, Zhao Nan, et al. Robust federated learning with noisy communication[J]. IEEE Transactions on Communications, 2020, 68(6): 3452−3464 doi: 10.1109/TCOMM.2020.2979149
[60]	Lu Yanyang, Fan Lei. An efficient and robust aggregation algorithm for learning federated CNN[C/OL] //Proc of the 3rd Int Conf on Signal Processing and Machine Learning. New York: ACM, 2020[2022-09-05].https://dl.acm.org/doi/abs/10.1145/3432291.3432303
[61]	Chen Zhou, Lv Na, Liu Pengfei, et al. Intrusion detection for wireless edge networks based on federated learning[J]. IEEE Access, 2020, 8: 217463−217472 doi: 10.1109/ACCESS.2020.3041793
[62]	So J, Güler B, Avestimehr A S. Turbo-aggregate: Breaking the quadratic aggregation barrier in secure federated learning[J]. IEEE Journal on Selected Areas in Information Theory, 2021, 2(1): 479−489 doi: 10.1109/JSAIT.2021.3054610
[63]	Wang Shiqiang, Tuor T, Salonidis T, et al. Adaptive federated learning in resource constrained edge computing systems[J]. IEEE Journal on Selected Areas in Communications, 2019, 37(6): 1205−1221 doi: 10.1109/JSAC.2019.2904348
[64]	Zhang Xiongtao, Zhu Xiaomin, Wang Ji, et al. Federated learning with adaptive communication compression under dynamic bandwidth and unreliable networks[J]. Information Sciences, 2020, 540(5): 242−262
[65]	Liu Lumin, Zhang Jun, Song Shenghui, et al. Client-edge-cloud hierarchical federated learning[C/OL] //Proc of the 54th IEEE Int Conf on Communications (ICC). Piscataway, NJ: IEEE, 2020[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9148862
[66]	Mohammad U, Sorour S. Adaptive task allocation for mobile edge learning[C/OL] //Proc of the Wireless Communications and Networking Conf Workshop (WCNCW). Piscataway, NJ: IEEE, 2019[2022-09-05].https://ieeexplore.ieee.org/abstract/document/8902527
[67]	Jiang Hui, Liu Min, Yang Bo, et al. Customized federated learning for accelerated edge computing with heterogeneous task targets[J]. Computer Networks, 2020, 183(12): 107569−107569
[68]	Lin Yujun, Han Song, Mao Huizi, et al. Deep gradient compression: Reducing the communication bandwidth for distributed training[J]. arXiv preprint, arXiv: 1712.01887, 2017
[69]	Liu Wei, Chen Li, Chen Yunfei, et al. Accelerating federated learning via momentum gradient descent[J]. IEEE Transactions on Parallel and Distributed Systems, 2020, 31(8): 1754−1766 doi: 10.1109/TPDS.2020.2975189
[70]	Abdi A, Saidutta Y M, Fekri F. Analog compression and communication for federated learning over wireless MAC[C/OL] //Proc of the 21st Int Workshop on Signal Processing Advances in Wireless Communications (SPAWC). Piscataway, NJ: IEEE, 2020[2022-09-05]. https://ieeexplore.ieee.org/abstract/document/9154309
[71]	Alistarh D, Grubic D, Li J, et al. QSGD: Communication-efficient SGD via gradient quantization and encoding[J]. Advances in Neural Information Processing Systems, 2017, 30: 1709−1720
[72]	Bernstein J, Wang Yuxiang, Azizzadenesheli K, et al. signSGD: Compressed optimisation for non-convex problems[C] //Proc of the 35th Int Conf on Machine Learning. New York: PMLR, 2018: 560−569
[73]	Zhu Guangxu, Wang Yong, Huang Kaibin. Broadband analog aggregation for low-latency federated edge learning[J]. IEEE Transactions on Wireless Communications, 2019, 19(1): 491−506
[74]	Amiri M M, Gündüz D. Federated learning over wireless fading channels[J]. IEEE Transactions on Wireless Communications, 2020, 19(5): 3546−3557 doi: 10.1109/TWC.2020.2974748
[75]	Wu Jiaxiang, Huang Weidong, Huang Junzhou, et al. Error compensated quantized SGD and its applications to large-scale distributed optimization[C] //Proc of the 35th Int Conf on Machine Learning. New York: PMLR, 2018: 5325−5333
[76]	Basu D, Data D, Karakus C, et al. Qsparse-local-SGD: Distributed SGD with quantization, sparsification, and local computations[J]. arXiv preprint, arXiv: 1906.02367, 2019
[77]	Xin Ran, Kar S, Khan U A. An introduction to decentralized stochastic optimization with gradient tracking[J]. arXiv preprint, arXiv: 1907.09648, 2019
[78]	Haddadpour F, Kamani M M, Mokhtari A, et al. Federated learning with compression: Unified analysis and sharp guarantees[C] //Proc of the 24th Int Conf on Artificial Intelligence and Statistics. New York: PMLR, 2021: 2350−2358
[79]	Tang Hanlin, Lian Xiangru, Yan Ming, et al. D²: Decentralized training over decentralized data[C] //Proc of the 35th Int Conf on Machine Learning. New York: PMLR, 2018: 4848−4856
[80]	Amiri M M, Gündüz D. Machine learning at the wireless edge: Distributed stochastic gradient descent over-the-air[J]. IEEE Transactions on Signal Processing, 2020, 68(1): 2155−2169
[81]	Zhu Guangxu, Du Yuqing, Gündüz D, et al. One-bit over-the-air aggregation for communication-efficient federated edge learning: Design and convergence analysis[J]. IEEE Transactions on Wireless Communications, 2020, 20(3): 2120−2135
[82]	Lu Yunlong, Huang Xiaohong, Dai Yueyue, et al. Differentially private asynchronous federated learning for mobile edge computing in urban informatics[J]. IEEE Transactions on Industrial Informatics, 2019, 16(3): 2134−2143
[83]	Sun Jun, Chen Tianyi, Giannakis G B, et al. Communication-efficient distributed learning via lazily aggregated quantized gradients[J]. arXiv preprint, arXiv: 1909.07588, 2019
[84]	Shokri R, Shmatikov V. Privacy-preserving deep learning[C] //Proc of the 22nd ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2015: 1310−1321
[85]	Elgabli A, Park J, Bedi A S, et al. Q-GADMM: Quantized group ADMM for communication efficient decentralized machine learning[J]. IEEE Transactions on Communications, 2020, 69(1): 164−181
[86]	Elgabli A, Park J, Bedi A S, et al. GADMM: Fast and communication efficient framework for distributed machine learning[J]. Journal of Machine Learning Research, 2020, 21(76): 1−39
[87]	Elgabli A, Park J, Ahmed S, et al. L-FGADMM: Layer-wise federated group ADMM for communication efficient decentralized deep learning[C/OL] //Proc of the IEEE Wireless Communications and Networking Conf(WCNC). Piscataway, NJ: IEEE, 2020[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9120758
[88]	Zhang Wei, Gupta S, Lian Xiangru, et al. Staleness-aware async-SGD for distributed deep learning[J]. arXiv preprint, arXiv: 1511.05950, 2015
[89]	Tao Zeyi, Li Qun. eSGD: Communication efficient distributed deep learning on the edge[C/OL] //Proc of the 1st USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18). Berkeley, CA: USENIX Association, 2018[2022-09-05].https://www.usenix.org/conference/hotedge18/presentation/tao
[90]	Wang Luping, Wang Wei, Li Bo. CMFL: Mitigating communication overhead for federated learning[C] //Proc of the 39th Int Conf on Distributed Computing Systems (ICDCS). Piscataway, NJ: IEEE: 954−964
[91]	Xing Hong, Simeone O, Bi Suzhi. Decentralized federated learning via SGD over wireless D2D networks[C/OL] //Proc of the 21st Int Workshop on Signal Processing Advances in Wireless Communications (SPAWC). Piscataway, NJ: IEEE, 2020[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9154332
[92]	Shiri H, Park J, Bennis M. Communication-efficient massive UAV online path control: Federated learning meets mean-field game theory[J]. IEEE Transactions on Communications, 2020, 68(11): 6840−6857 doi: 10.1109/TCOMM.2020.3017281
[93]	Zeng Tengchan, Semiari O, Mozaffari M, et al. Federated learning in the sky: Joint power allocation and scheduling with UAV swarms[C/OL] //Proc of the 54th IEEE Int Conf on Communications (ICC). Piscataway, NJ: IEEE, 2020[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9148776
[94]	Pham Q V, Zeng Ming, Ruby R, et al. UAV communications for sustainable federated learning[J]. IEEE Transactions on Vehicular Technology, 2021, 70(4): 3944−3948 doi: 10.1109/TVT.2021.3065084
[95]	Fadlullah Z M, Kato N. HCP: Heterogeneous computing platform for federated learning based collaborative content caching towards 6G networks[J]. IEEE Transactions on Emerging Topics in Computing, 2020, 10(1): 112−123
[96]	Chen Mingzhe, Mozaffari M, Saad W, et al. Caching in the sky: Proactive deployment of cache-enabled unmanned aerial vehicles for optimized quality-of-experience[J]. IEEE Journal on Selected Areas in Communications, 2017, 35(5): 1046−1061 doi: 10.1109/JSAC.2017.2680898
[97]	Lahmeri M A, Kishk M A, Alouini M S. Artificial intelligence for UAV-enabled wireless networks: A survey[J]. IEEE Open Journal of the Communications Society, 2021, 2: 1015−1040 doi: 10.1109/OJCOMS.2021.3075201
[98]	Wang Yuntao, Su Zhou, Zhang Ning, et al. Learning in the air: Secure federated learning for UAV-assisted crowdsensing[J]. IEEE Transactions on Network Science and Engineering, 2020, 8(2): 1055−1069
[99]	Lim W Y B, Huang Jianqiang, Xiong Zehui, et al. Towards federated learning in UAV-enabled Internet of vehicles: A multi-dimensional contract-matching approach[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(8): 5140−5154 doi: 10.1109/TITS.2021.3056341
[100]	Samarakoon S, Bennis M, Saad W, et al. Distributed federated learning for ultra-reliable low-latency vehicular communications[J]. IEEE Transactions on Communications, 2019, 68(2): 1146−1159
[101]	Ye Dongdong, Yu Rong, Pan Miao, et al. Federated learning in vehicular edge computing: A selective model aggregation approach[J]. IEEE Access, 2020, 8: 23920−23935 doi: 10.1109/ACCESS.2020.2968399
[102]	Lu Yunlong, Huang Xiaohong, Dai Yueyue, et al. Federated learning for data privacy preservation in vehicular cyber-physical systems[J]. IEEE Network, 2020, 34(3): 50−56 doi: 10.1109/MNET.011.1900317
[103]	Du Zhaoyang, Wu Celimuge, Yoshinaga T, et al. Federated learning for vehicular Internet of things: Recent advances and open issues[J]. IEEE Open Journal of the Computer Society, 2020, 1: 45−61 doi: 10.1109/OJCS.2020.2992630
[104]	Deveaux D, Higuchi T, Uçar S, et al. On the orchestration of federated learning through vehicular knowledge networking[C/OL] //Proc of IEEE Vehicular Networking Conf (VNC). Piscataway, NJ: IEEE, 2020[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9318386
[105]	Chen Mingzhe, Semiari O, Saad W, et al. Federated echo state learning for minimizing breaks in presence in wireless virtual reality networks[J]. IEEE Transactions on Wireless Communications, 2019, 19(1): 177−191
[106]	Mozaffari M, Saad W, Bennis M, et al. A tutorial on UAVs for wireless networks: Applications, challenges, and open problems[J]. IEEE Communications Surveys & Tutorials, 2019, 21(3): 2334−2360
[107]	Samarakoon S, Bennis M, Saad W, et al. Federated learning for ultra-reliable low-latency V2V communications[C/OL] //Proc of the IEEE Global Communications Conf (GLOBECOM). Piscataway, NJ: IEEE, 2018[2022-09-05].https://ieeexplore.ieee.org/abstract/document/8647927
[108]	Feyzmahdavian H R, Aytekin A, Johansson M. An asynchronous mini-batch algorithm for regularized stochastic optimization[J]. IEEE Transactions on Automatic Control, 2016, 61(12): 3740−3754 doi: 10.1109/TAC.2016.2525015
[109]	Lu Yunlong, Huang Xiaohong, Zhang Ke, et al. Blockchain empowered asynchronous federated learning for secure data sharing in Internet of vehicles[J]. IEEE Transactions on Vehicular Technology, 2020, 69(4): 4298−4311 doi: 10.1109/TVT.2020.2973651
[110]	Yin Feng, Lin Zhidi, Kong Qinglei, et al. FedLoc: Federated learning framework for data-driven cooperative localization and location data processing[J]. IEEE Open Journal of Signal Processing, 2020, 1: 187−215 doi: 10.1109/OJSP.2020.3036276
[111]	Merluzzi M, Di Lorenzo P, Barbarossa S. Dynamic resource allocation for wireless edge machine learning with latency and accuracy guarantees[C] //Proc of the 45th IEEE Int Conf on Acoustics, Speech and Signal Processing (ICASSP). Piscataway, NJ: IEEE, 2020: 9036−9040
[112]	Yang Zhaohui, Chen Mingzhe, Saad W, et al. Energy efficient federated learning over wireless communication networks[J]. IEEE Transactions on Wireless Communications, 2020, 20(3): 1935−1949
[113]	Luo Siqi, Chen Xu, Wu Qiong, et al. Hfel: Joint edge association and resource allocation for cost-efficient hierarchical federated edge learning[J]. IEEE Transactions on Wireless Communications, 2020, 19(10): 6535−6548 doi: 10.1109/TWC.2020.3003744
[114]	Abad M S H, Ozfatura E, Gunduz D, et al. Hierarchical federated learning across heterogeneous cellular networks[C] //Proc of the 45th IEEE Int Conf on Acoustics, Speech and Signal Processing (ICASSP). Piscataway, NJ: IEEE, 2020: 8866−8870
[115]	Liu Dongzhu, Zhu Guangxu, Zhang Jun, et al. Data-importance aware user scheduling for communication-efficient edge machine learning[J]. IEEE Transactions on Cognitive Communications and Networking, 2020, 7(1): 265−278
[116]	Zhan Yufeng, Li Peng, Guo Song. Experience-driven computational resource allocation of federated learning by deep reinforcement learning[C] //Proc of the 34th 2020 IEEE Int Parallel and Distributed Processing Symp (IPDPS). Piscataway, NJ: IEEE, 2020: 234−243
[117]	Zeng Qunsong, Du Yuqing, Huang Kaibin, et al. Energy-efficient radio resource allocation for federated edge learning[C/OL] //Proc of the 54th 2020 IEEE Intl Conf on Communications Workshops (ICC Workshops). Piscataway, NJ: IEEE, 2020[2022-09-05]. https://ieeexplore.ieee.org/abstract/document/9145118
[118]	Chen Mingzhe, Poor H V, Saad W, et al. Convergence time optimization for federated learning over wireless networks[J]. IEEE Transactions on Wireless Communications, 2020, 20(4): 2457−2471
[119]	Mo Xiaopeng, Xu Jie. Energy-efficient federated edge learning with joint communication and computation design[J]. Journal of Communications and Information Networks, 2021, 6(2): 110−124 doi: 10.23919/JCIN.2021.9475121
[120]	Ren Jinke, Yu Guanding, Ding Guangyao. Accelerating DNN training in wireless federated edge learning systems[J]. IEEE Journal on Selected Areas in Communications, 2020, 39(1): 219−232
[121]	Anh T T, Luong N C, Niyato D, et al. Efficient training management for mobile crowd-machine learning: A deep reinforcement learning approach[J]. IEEE Wireless Communications Letters, 2019, 8(5): 1345−1348 doi: 10.1109/LWC.2019.2917133
[122]	Nguyen H T, Luong N C, Zhao J, et al. Resource allocation in mobility-aware federated learning networks: A deep reinforcement learning approach[C/OL] //Pro of the 6th World Forum on Internet of Things (WF-IoT). Piscataway, NJ: IEEE, 2020[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9221089
[123]	Zhang Xueqing, Liu Yanwei, Liu Jinxia, et al. D2D-assisted federated learning in mobile edge computing networks [C/OL] //Pro of the 2021 IEEE Wireless Communications and Networking Conf (WCNC). Piscataway, NJ: IEEE, 2021[2022-09-05].https://ieeexplore.ieee.org/abstract/document/9417459
[124]	Yang Kai, Jiang Tao, Shi Yuanming, et al. Federated learning via over-the-air computation[J]. IEEE Transactions on Wireless Communications, 2020, 19(3): 2022−2035 doi: 10.1109/TWC.2019.2961673
[125]	Qin Zhijin, Li G Y, Ye Hao. Federated learning and wireless communications[J]. IEEE Wireless Communications, 2021, 28(5): 134−140 doi: 10.1109/MWC.011.2000501
[126]	Amiria M M, Dumanb T M, Gündüzc D, et al. Collaborative machine learning at the wireless edge with blind transmitters[C/OL] //Proc of the 7th IEEE Global Conf on Signal and Information Processing. Piscataway, NJ: IEEE, 2019[2022-09-05].https://iris.unimore.it/handle/11380/1202665
[127]	Chen Mingzhe, Yang Zhaohui, Saad W, et al. A joint learning and communications framework for federated learning over wireless networks[J]. IEEE Transactions on Wireless Communications, 2020, 20(1): 269−283
[128]	Yang H H, Arafa A, Quek T Q, et al. Age-based scheduling policy for federated learning in mobile edge networks[C] //Proc of the 45th IEEE Int Conf on Acoustics, Speech and Signal Processing (ICASSP). Piscataway, NJ: IEEE: 8743−8747
[129]	Dinh C, Tran N H, Nguyen M N, et al. Federated learning over wireless networks: Convergence analysis and resource allocation[J]. IEEE/ACM Transactions on Networking, 2020, 29(1): 398−409
[130]	Yang Hao, Liu Zuozhu, Quek T Q, et al. Scheduling policies for federated learning in wireless networks[J]. IEEE Transactions on Communications, 2019, 68(1): 317−333
[131]	Shi Wenqi, Zhou Sheng, Niu Zhisheng. Device scheduling with fast convergence for wireless federated learning[C/OL] //Proc of the 54th IEEE Int Conf on Communications (ICC). Piscataway, NJ: IEEE, 2020[2022-09-05]. https://ieeexplore.ieee.org/abstract/document/9149138
[132]	Amiri M M, Gündüz D, Kulkarni S R, et al. Update aware device scheduling for federated learning at the wireless edge[C] //Proc of the 2020 IEEE Int Symp on Information Theory (ISIT). Piscataway, NJ: IEEE, 2020: 2598−2603
[133]	Bonawitz K, Ivanov V, Kreuter B, et al. Practical secure aggregation for privacy-preserving machine learning[C] //Proc of the ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2017: 1175−1191

施引文献(2)

期刊类型引用(2)

1.	李斌，刘思尧. 基于LSM树的视频数据扩容存储系统设计. 电子设计工程. 2025(01): 31-35 . 百度学术
2.	杨勇鹏，蒋德钧. 一种日志结构块存储系统一致性模型. 高技术通讯. 2024(04): 366-378 . 百度学术