
联邦学习开源框架综述

林伟伟, 石方, 曾岚, 李董东, 许银海, 刘波

林伟伟, 石方, 曾岚, 李董东, 许银海, 刘波. 联邦学习开源框架综述[J]. 计算机研究与发展, 2023, 60(7): 1551-1580. DOI: 10.7544/issn1000-1239.202220148

Lin Weiwei, Shi Fang, Zeng Lan, Li Dongdong, Xu Yinhai, Liu Bo. Survey of Federated Learning Open-Source Frameworks[J]. Journal of Computer Research and Development, 2023, 60(7): 1551-1580. DOI: 10.7544/issn1000-1239.202220148

林伟伟, 石方, 曾岚, 李董东, 许银海, 刘波. 联邦学习开源框架综述[J]. 计算机研究与发展, 2023, 60(7): 1551-1580. CSTR: 32373.14.issn1000-1239.202220148

Lin Weiwei, Shi Fang, Zeng Lan, Li Dongdong, Xu Yinhai, Liu Bo. Survey of Federated Learning Open-Source Frameworks[J]. Journal of Computer Research and Development, 2023, 60(7): 1551-1580. CSTR: 32373.14.issn1000-1239.202220148

联邦学习开源框架综述

基金项目: 广东省重点领域研发计划项目(2021B0101420002);国家自然科学基金项目 (62072187, 61872084);广东省基础与应用基础研究重大项目(2019B030302002);广州市开发区国际合作项目(2021GH10, 2020GH10)
    作者简介:

    林伟伟: 1980年生. 博士,教授,博士生导师. CCF高级会员. 主要研究方向为云计算、大数据、人工智能应用技术

    石方: 1993年生. 博士研究生. CCF学生会员. 主要研究方向为云计算和联邦学习

    曾岚: 2001年生. 本科生. 主要研究方向为联邦学习

    李董东: 1994年生. 博士研究生. 主要研究方向为大数据和联邦学习

    许银海: 1998年生. 硕士研究生. 主要研究方向为大数据和联邦学习

    刘波: 1968年生. 博士,教授. 主要研究方向为分布式计算和人工智能

    通讯作者:

    刘波(liugubin530@126.com)

  • 中图分类号: TP393

Survey of Federated Learning Open-Source Frameworks

Funds: This work was supported by the Key-Area Research and Development Program of Guangdong Province (2021B0101420002), the National Natural Science Foundation of China (62072187, 61872084), the Guangdong Major Project of Basic and Applied Basic Research (2019B030302002), and the International Cooperation Project of Guangzhou Development Zone (2021GH10, 2020GH10).
    Author Bio:

    Lin Weiwei: born in 1980. PhD, professor, PhD supervisor. Senior member of CCF. His main research interests include cloud computing, big data, and application technology of artificial intelligence

    Shi Fang: born in 1993. PhD candidate. Student member of CCF. Her main research interests include cloud computing and federated learning

    Zeng Lan: born in 2001. Undergraduate. Her main research interest includes federated learning

    Li Dongdong: born in 1994. PhD candidate. His main research interests include big data and federated learning

    Xu Yinhai: born in 1998. Master candidate. His main research interests include big data and federated learning

    Liu Bo: born in 1968. PhD, professor. His main research interests include distributed computing and artificial intelligence

  • 摘要:

    近年来,联邦学习作为破解数据共享壁垒的有效解决方案被广泛关注,并被逐步应用于医疗、金融和智慧城市等领域.联邦学习框架是联邦学习学术研究和工业应用的基石.虽然Google、OpenMined、微众银行和百度等企业开源了各自的联邦学习框架和系统,然而,目前缺少对这些联邦学习开源框架的技术原理、适用场景、存在问题等的深入研究和比较.为此,根据各开源框架在业界的受众程度,选取了目前应用较广和影响较大的联邦学习开源框架进行深入研究.针对不同类型的联邦学习框架,首先分别从系统架构和系统功能2个层次对各框架进行剖析;其次从隐私机制、机器学习算法、计算范式、学习类型、训练架构、通信协议、可视化等多个维度对各框架进行深入对比分析.而且,为了帮助读者更好地选择和使用开源框架实现联邦学习应用,给出了面向2个不同应用场景的联邦学习实验.最后,基于目前框架存在的开放性问题,从隐私安全、激励机制、跨框架交互等方面讨论了未来可能的研究发展方向,旨在为开源框架的开发创新、架构优化、安全改进以及算法优化等提供参考和思路.

    Abstract:

    In recent years, federated learning (FL) has gained widespread attention as an effective solution to break down barriers to data sharing, and it is being progressively applied in areas such as healthcare, finance, and smart cities. FL frameworks are the cornerstone of both academic research and industrial applications. Although companies such as Google, OpenMined, WeBank, and Baidu have open-sourced their own FL frameworks and systems, there is still a lack of in-depth research on and comparison of the technical principles, applicable scenarios, and open problems of these FL open-source frameworks. For this reason, we select the most widely used and influential open-source frameworks, according to their adoption in the industry, for in-depth study. For the different types of FL frameworks, we first analyze each framework at two levels: system architecture and system function. Second, we compare and analyze the frameworks in depth along multiple dimensions, including privacy mechanism, machine learning algorithm, computing paradigm, learning type, training architecture, communication protocol, and visualization. Moreover, we present two FL experiments for different application scenarios to help readers choose and use an open-source framework to implement FL applications. Finally, based on the open problems of current frameworks, we discuss possible future research directions in terms of privacy and security, incentive mechanisms, cross-framework interaction, etc. This survey aims to provide references and ideas for open-source framework development and innovation, architecture optimization, security improvement, and algorithm optimization.

  • 语音增强是语音信号处理的重要组成部分,它旨在最大限度地去除背景噪声,提高语音信号质量和可懂度. 在过去几十年中,传统的语音增强技术,例如,维纳滤波法[1]、谱减法[2]、基于子空间的方法[3]等受到了研究者们的青睐,但是这些技术在处理复杂环境下的语音信号时其效果往往不尽人意.

    近年来,基于深度神经网络(deep neural network, DNN)的语音增强方法已被证实可获得比传统语音增强方法更高的性能[4-6]. 具体而言,该类方法的实质在于通过优化途径获得去噪后语音的短时傅里叶变换(short-time Fourier transform, STFT)幅度谱,并将之与原带噪语音的相位谱结合构造出完整的时频谱,再进行逆短时傅里叶变换(inverse short-time Fourier transform, ISTFT),即可生成增强后的语音信号. 其中幅度谱优化可分为2类途径:一类是间接方法,其致力于估计对带噪语音的STFT幅度谱进行掩蔽操作的时频模板,如理想二值模板(ideal binary mask,IBM)[7]、理想比值模板(ideal ratio mask,IRM)[8]等;另一类是直接映射方法,借助网络优化直接求取去噪后的语音STFT幅度谱. 研究表明,间接方法比直接映射方法可在语音增强上获得更好的性能[9],故本文采用间接方法中的IRM作为训练目标.
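    作为补充说明,IRM 的常见定义为 $\mathit{IRM}(t,f)=\left(\dfrac{|\boldsymbol{S}(t,f)|^{2}}{|\boldsymbol{S}(t,f)|^{2}+|\boldsymbol{N}(t,f)|^{2}}\right)^{\beta}$(β 通常取 0.5). 下面给出一个基于 NumPy 的最小化计算示意,其中函数名与变量名均为本文说明所作的假设,并非原文实现:

```python
import numpy as np

def ideal_ratio_mask(clean_mag: np.ndarray, noise_mag: np.ndarray, beta: float = 0.5) -> np.ndarray:
    """按 (|S|^2 / (|S|^2 + |N|^2))^beta 计算 IRM, beta=0.5 为常用取值."""
    power_s = clean_mag ** 2
    power_n = noise_mag ** 2
    return (power_s / (power_s + power_n + 1e-8)) ** beta
```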

    近年来,卷积神经网络(convolutional neural network, CNN)被广泛应用于语音增强中[10-12]. 文献[10]提出了第1个全卷积语音增强网络,该网络证明了CNN可在消耗比DNN更少参数的情况下,获得比DNN更优越的性能.

    但需要指出,DNN和CNN存在共同的缺陷,即无法捕获语音信号的相邻连续时间帧之间的长依赖关系,这使其性能受到限制. 为解决这一问题,研究者们在这些方法中融入了循环神经网络(recurrent neural network, RNN)、长短时记忆(long short-term memory, LSTM)网络,并取得了相应的性能提升. 例如,文献[13]中提出基于RNN的深度循环神经网络(deep recurrent neural network, DRNN),实验结果表明DRNN的性能优于DNN. 另外,文献[14]通过在编码器和解码器之间插入了双向长短时记忆(bidirectional long short-term memory, Bi-LSTM)网络,证明了获取相邻连续时间帧之间的长依赖关系可提升语音增强的性能,但代价是消耗更多网络参数.

    为在不增加网络参数量的同时又可以有效捕获输入序列的长依赖关系,研究者们将时域卷积网络(temporal convolutional network, TCN)引入到语音增强中[15-18]. 文献[19]指出:由于TCN由扩张卷积构成,拥有更大的感受野,从而可在不额外增加参数量的同时,获得比LSTM更长的长期有效记忆能力. 然而,以上基于TCN的工作的缺陷在于,随着网络层数的增加,梯度消失问题变得突出,使得网络收敛速度变缓.

    为解决梯度消失问题和进一步提升语音增强质量,本文将扩张卷积和密集连接网络(densely connected convolutional network, DenseNet)[20]相结合,提出频率-时间扩张密集网络(frequency-time dilated dense network, FTDDN). 其特色在于:

    1) 在学习上下文信息方面,除了时间方向,扩张卷积同时被应用在频率方向. 通过所构造的时间扩张卷积单元(time dilated convolution unit, TDCU)和频率扩张卷积单元(frequency dilated convolution unit, FDCU),本文所提网络在时频域内均可获得较大的感受野,从而能有效提取出深层语音特征,达到提升语音增强性能的目的.

    2) 在网络效率方面,本文中各级TDCU和FDCU所提取的特征以密集连接的方式传递,不仅可缓解梯度消失问题,而且可避免经典信息论所指出的因级联信息处理模块数目增加而导致的信息丢失问题[21].

    假设含噪离散语音 $\boldsymbol{x}(k)$ 表示为

    $$\boldsymbol{x}(k) = \boldsymbol{s}(k) + \boldsymbol{n}(k), \tag{1}$$

    其中 $k$ 表示时间索引,$\boldsymbol{s}(k)$ 和 $\boldsymbol{n}(k)$ 分别表示干净语音和加性噪声. 为实现语音增强从含噪语音 $\boldsymbol{x}(k)$ 中恢复出干净的语音估计 $\hat{\boldsymbol{s}}(k)$ 的目的,需将 $\boldsymbol{x}(k)$ 进行STFT,得到时频表示:

    $$\boldsymbol{X}(t,f) = \boldsymbol{S}(t,f) + \boldsymbol{N}(t,f), \tag{2}$$

    其中

    $$\begin{gathered} \boldsymbol{X}(t,f) = \left| \boldsymbol{X}(t,f) \right| {\rm e}^{j\boldsymbol{\varPhi}_{\boldsymbol{X}}(t,f)}, \\ \boldsymbol{S}(t,f) = \left| \boldsymbol{S}(t,f) \right| {\rm e}^{j\boldsymbol{\varPhi}_{\boldsymbol{S}}(t,f)}, \\ \boldsymbol{N}(t,f) = \left| \boldsymbol{N}(t,f) \right| {\rm e}^{j\boldsymbol{\varPhi}_{\boldsymbol{N}}(t,f)}, \end{gathered} \tag{3}$$

    其中 $t \in [0, T-1]$,$f \in [0, F-1]$,$T$ 和 $F$ 分别是时间帧和观测频率的数量(为简化起见,后文将省略以上各时频表示的自变量 $t$ 和 $f$). 随后将此时频表示的幅度谱 $\left| \boldsymbol{X} \right|$ 作为语音特征输入到神经网络. 经过神经网络的优化,得到时频掩蔽 $\boldsymbol{M}$,并将此掩蔽 $\boldsymbol{M}$ 与 $\left| \boldsymbol{X} \right|$ 相乘,得到增强后的语音幅度谱 $\left| \hat{\boldsymbol{S}} \right|$,最后通过对 $\left| \hat{\boldsymbol{S}} \right|$ 和含噪语音的相位谱 $\boldsymbol{\varPhi}_{\boldsymbol{X}}$ 进行ISTFT得到增强后的语音 $\hat{\boldsymbol{s}}(k)$. 以上过程可用式(4)描述:

    $$\begin{gathered} \left| \hat{\boldsymbol{S}} \right| = \boldsymbol{M} \odot \left| \boldsymbol{X} \right|, \\ \hat{\boldsymbol{s}}(k) = \Re \left( \left| \hat{\boldsymbol{S}} \right|, \boldsymbol{\varPhi}_{\boldsymbol{X}} \right), \end{gathered} \tag{4}$$

    其中 $\Re$ 表示ISTFT,$\odot$ 表示矩阵对应元素相乘.
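    为便于理解,下面用 PyTorch 的 stft/istft 给出式(1)~(4)所述"掩蔽幅度谱并结合含噪相位重建"流程的草图;其中窗长、帧移与变量名均为示意性假设,并非原文代码:

```python
import torch

def enhance_with_mask(noisy: torch.Tensor, model, n_fft: int = 512, hop: int = 256) -> torch.Tensor:
    """noisy: (B, k) 含噪时域语音; model: 输入幅度谱、输出时频掩蔽 M 的网络(假设接口)."""
    window = torch.hann_window(n_fft, device=noisy.device)
    X = torch.stft(noisy, n_fft, hop_length=hop, window=window, return_complex=True)  # (B, F, T)
    mag, phase = X.abs(), torch.angle(X)      # |X| 与含噪相位谱 Φ_X
    M = model(mag)                            # 时频掩蔽, 取值约束在 [0, 1]
    S_hat = torch.polar(M * mag, phase)       # |Ŝ| = M ⊙ |X|, 相位沿用 Φ_X
    return torch.istft(S_hat, n_fft, hop_length=hop, window=window, length=noisy.shape[-1])
```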

    为能够充分捕获语音时频谱在频率、时间方向上的上下文信息,同时解决随着网络深度增加带来的信息丢失问题,本文将扩张卷积与密集连接结构相结合,分别设计了频率扩张密集模块(frequency dilated dense module, FDDM)和时间扩张密集模块(time dilated dense module, TDDM).

    FDDM的结构如图1所示(图中表示卷积层的方框内第1行数字依次表示扩张因子、卷积核大小和卷积核数量),其由6个FDCU卷积单元以密集连接的方式组成,其中每个FDCU都包括2层2D卷积层,且每个卷积层之后都连接了1层归一化层(batch normalization, BN)和1个修正线性单元(rectified linear unit, ReLU). 但第1个卷积层使用普通2D卷积,用以减少通道数;而第2个卷积层使用频率扩张2D卷积,其只在频率方向使用扩张因子以增大卷积核尺寸,由此增大感受野来捕获频率方向的上下文信息.

    图  1  频率扩张密集模块结构
    Figure  1.  The structure of FDDM

    FDDM结构特色在于引入了密集连接结构:表现为每一级FDCU的输入都是整个FDDM的输入与其前面各级FDCU输出的汇集,从而各级FDCU的输入依次为 $16i \times 257 \times T$,$i = 1,2,…,6$. 为保证在频率方向获得足够大的感受野,需逐级增大FDCU的扩张因子 $d_i$,将其依次设定为 $2^{i-1}$,$i = 1,2,…,6$.
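    作为示意,FDCU 与 FDDM 的密集连接方式可用如下 PyTorch 草图表示;其中卷积核大小取 3、通道增长取 16 等均为依据上文推测的假设,并非原文公开实现:

```python
import torch
import torch.nn as nn

class FDCU(nn.Module):
    """频率扩张卷积单元: 先用 1x1 普通 2D 卷积降通道, 再做仅在频率方向扩张的 2D 卷积."""
    def __init__(self, in_ch, growth=16, dilation=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, growth, kernel_size=1),
            nn.BatchNorm2d(growth), nn.ReLU(),
            # 仅频率方向扩张: dilation=(d, 1); padding 取 (d, 1) 以保持 (F, T) 尺寸不变
            nn.Conv2d(growth, growth, kernel_size=3,
                      dilation=(dilation, 1), padding=(dilation, 1)),
            nn.BatchNorm2d(growth), nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)


class FDDM(nn.Module):
    """6 个 FDCU 密集连接: 第 i 级的输入是模块输入与前 i-1 级输出在通道维的拼接."""
    def __init__(self, in_ch=16, growth=16, num_units=6):
        super().__init__()
        self.units = nn.ModuleList(
            [FDCU(in_ch + i * growth, growth, dilation=2 ** i) for i in range(num_units)])

    def forward(self, x):                        # x: (B, 16, 257, T)
        feats = [x]
        for unit in self.units:
            feats.append(unit(torch.cat(feats, dim=1)))
        return feats[-1]                         # 简化: 以最后一级 FDCU 的输出作为模块输出
```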

    TDDM则借鉴了TCN[19]的设计结构,并采用了与FDDM类似的框架结构,如图2所示:同样由6个TDCU卷积块以密集连接的方式组成,每个TDCU都包括3部分,其中前2部分的结构为1D卷积层、归一化层、带参数的线性修正单元(parametric rectified linear unit, PReLU),第3部分只有1层单独的1D卷积层. 第1部分采用普通1D卷积,用以减少通道数;第2部分使用时间扩张卷积,用以学习时间方向的上下文信息;第3部分的单独卷积层则在输出时恢复整个TDCU的通道.

    图  2  时间扩张密集模块结构
    Figure  2.  The structure of TDDM

    与FDDM同理,TDDM也融入了密集连接结构,表现为每一级TDCU的输入都是整个TDDM的输入与其前面各级TDCU输出的汇集,从而各级TDCU的输入为 $128i \times T$,$i = 1,2,…,6$,且其扩张因子 $d_i$ 设定为 $2^{i-1}$,$i = 1,2,…,6$.
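    相应地,TDCU 的三段结构与 TDDM 的密集连接可草拟如下(同为假设性示意,1D 卷积核大小原文未给出,这里取 3):

```python
import torch
import torch.nn as nn

class TDCU(nn.Module):
    """时间扩张卷积单元: 1x1 降通道 -> 时间扩张 1D 卷积 -> 单独 1D 卷积恢复输出通道."""
    def __init__(self, in_ch, mid_ch=64, out_ch=128, dilation=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv1d(in_ch, mid_ch, kernel_size=1),
            nn.BatchNorm1d(mid_ch), nn.PReLU(),
            nn.Conv1d(mid_ch, mid_ch, kernel_size=3,
                      dilation=dilation, padding=dilation),   # 时间方向扩张, 保持 T 不变
            nn.BatchNorm1d(mid_ch), nn.PReLU(),
            nn.Conv1d(mid_ch, out_ch, kernel_size=1),          # 恢复整个 TDCU 的输出通道
        )

    def forward(self, x):
        return self.block(x)


class TDDM(nn.Module):
    """6 个 TDCU 密集连接: 第 i 级输入通道为 128*i, 扩张因子为 2^(i-1)."""
    def __init__(self, in_ch=128, num_units=6):
        super().__init__()
        self.units = nn.ModuleList(
            [TDCU(in_ch * (i + 1), out_ch=in_ch, dilation=2 ** i) for i in range(num_units)])

    def forward(self, x):                        # x: (B, 128, T)
        feats = [x]
        for unit in self.units:
            feats.append(unit(torch.cat(feats, dim=1)))
        return feats[-1]
```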

    从深层次意义上讲,正是因为图1所示的FDDM和图2所示的TDDM的各层级联的FDCU和TDCU的入口采用了密集连接,才避免了经典信息论所述及的“多处理模块级联会引起信息丢失”的现象(即信息不增性原理)[21],从而保证了特征重用,并促进信息流的传递.

    综合以上拥有较大感受野的FDDM和TDDM的基本模块设计,本文提出频率-时间扩张密集网络FTDDN.

    图3展示了本文所提出的网络的框架结构,其输入时频幅度谱 $\left| \boldsymbol{X} \right|$ 首先通过2层2D卷积层. 第1个卷积层用于增加输入特征的通道数;第2个卷积层用于学习局部信息,并将其输出反馈给FDDM,以捕获频率方向的上下文信息和学习时间方向的局部信息. 图3中表示卷积层的方框内的第1行数字表示卷积核大小和卷积核数量.

    图  3  频率-时间扩张密集网络结构
    Figure  3.  The structure of FTDDN

    FDDM之后连接了2层2D卷积层和1层1D卷积层,其共同的作用是实现维度转换以及减少通道数,使FDDM的输出的维度转换为 $128 \times T$,并反馈至TDDM中以学习时间方向的上下文信息.

    经TDDM处理后,其输出会送到3个卷积单元中,前2个卷积单元由1D卷积层、归一化层和PReLU激活函数组成,用以聚合FDDM和TDDM学习到的频率、时间方向上的上下文信息,后1个卷积单元由1D卷积层和Sigmoid激活函数组成,其将网络估计到的时频掩蔽模板 $\boldsymbol{M}$ 的维度恢复到 $257 \times T$,并将其值限制在 $[0, 1]$ 区间内.
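    结合上文的 FDDM 与 TDDM 草图,整体前向流程可概括为如下假设性示意(复用前文草图中的 FDDM、TDDM 类;各卷积层的实际超参数以原文图3为准):

```python
import torch
import torch.nn as nn

class FTDDNSketch(nn.Module):
    """FTDDN 前向流程示意: 2 层 2D 卷积 -> FDDM -> 维度转换 -> TDDM -> 掩蔽输出头."""
    def __init__(self, freq_bins=257):
        super().__init__()
        self.front = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1),     # 增加通道数
                                   nn.Conv2d(16, 16, 3, padding=1))    # 学习局部信息
        self.fddm = FDDM(in_ch=16)                                     # 复用上文 FDDM 草图
        self.reduce = nn.Sequential(nn.Conv2d(16, 4, 3, padding=1),    # 2 层 2D 卷积减少通道
                                    nn.Conv2d(4, 1, 3, padding=1))
        self.to_time = nn.Conv1d(freq_bins, 128, kernel_size=1)        # 1 层 1D 卷积: 转为 128 x T
        self.tddm = TDDM(in_ch=128)                                    # 复用上文 TDDM 草图
        self.head = nn.Sequential(
            nn.Conv1d(128, 128, 1), nn.BatchNorm1d(128), nn.PReLU(),
            nn.Conv1d(128, 128, 1), nn.BatchNorm1d(128), nn.PReLU(),
            nn.Conv1d(128, freq_bins, 1), nn.Sigmoid())                # 掩蔽 M: 257 x T, 取值 [0,1]

    def forward(self, mag):                       # mag: (B, 257, T) 幅度谱
        x = self.front(mag.unsqueeze(1))          # (B, 16, 257, T)
        x = self.fddm(x)
        x = self.reduce(x).squeeze(1)             # (B, 257, T)
        x = self.to_time(x)                       # (B, 128, T)
        x = self.tddm(x)
        return self.head(x)                       # (B, 257, T) 时频掩蔽
```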

    图3总体网络采用了文献[22]中提出的噪声感知多任务损失函数,即加权平均绝对误差(weighted mean absolute error, WMAE),其定义为:

    $$WMAE = a \times \frac{1}{T \times F}\sum_{t=0}^{T-1}\sum_{f=0}^{F-1}\Big| \left| \hat{\boldsymbol{S}}(t,f) \right| - \left| \boldsymbol{S}(t,f) \right| \Big| + (1-a) \times \frac{1}{T \times F}\sum_{t=0}^{T-1}\sum_{f=0}^{F-1}\Big| \left| \hat{\boldsymbol{N}}(t,f) \right| - \left| \boldsymbol{N}(t,f) \right| \Big|, \tag{5}$$

    其中 $\left| \hat{\boldsymbol{N}}(t,f) \right| = \left| \boldsymbol{X}(t,f) \right| - \left| \hat{\boldsymbol{S}}(t,f) \right|$ 表示噪声的幅度谱估计,而 $a$ 则为干净语音和噪声之间的能量比值,即

    $$a = \frac{\displaystyle\sum_{t=0}^{T-1}\sum_{f=0}^{F-1} \left| \boldsymbol{S}(t,f) \right|^{2}}{\displaystyle\sum_{t=0}^{T-1}\sum_{f=0}^{F-1} \left| \boldsymbol{S}(t,f) \right|^{2} + \displaystyle\sum_{t=0}^{T-1}\sum_{f=0}^{F-1} \left| \boldsymbol{N}(t,f) \right|^{2}}. \tag{6}$$
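    式(5)(6)可按定义直接实现,下面给出一个 PyTorch 草图(张量名为假设,输入均为幅度谱,对 $T \times F$ 及批次取平均):

```python
import torch

def wmae_loss(S_hat_mag, S_mag, N_mag, X_mag):
    """加权平均绝对误差 WMAE: 语音项与噪声项按能量比 a 加权, 见式(5)(6)."""
    N_hat_mag = X_mag - S_hat_mag                      # 噪声幅度谱估计 |N^| = |X| - |S^|
    a = S_mag.pow(2).sum() / (S_mag.pow(2).sum() + N_mag.pow(2).sum() + 1e-8)
    speech_term = (S_hat_mag - S_mag).abs().mean()     # 对 T×F(及批次)取平均
    noise_term = (N_hat_mag - N_mag).abs().mean()
    return a * speech_term + (1 - a) * noise_term
```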

    实验数据集之一采用开源的VCTK语料库[23],其训练集包括28位说话人(14位女性和14位男性),测试集则包括另外2位不同的说话人(1位女性和1位男性). 为创建含噪语音数据集,文献[23]的作者以4种信噪比(signal-noise ratio, SNR)(15 dB,10 dB,5 dB,0 dB)向干净语音训练集添加了10种常见环境噪声和人工制造的噪声[23],从而生成包含有11572个语音的含噪语音训练集;以另外4种SNR(17.5 dB,12.5 dB,7.5 dB,2.5 dB)向干净语音测试集中添加了5种常见环境噪声[23],从而生成包含有824个语音的含噪语音测试集. 为测试网络的泛化能力,测试集与训练集中所使用的噪声均不相同. 因测试集中使用的说话人和噪声类型均与训练集不同,故也将其用作验证集以优化模型参数. 为降低计算复杂度,本文将该语料库的信号采样率由48 kHz降为16 kHz.

    实验数据集之二采用LibriSpeech语料库[24]的干净语音,其采样率为16 kHz,而噪声来源取自DEMAND噪声库[25]和DNS Challenge中的噪声集[26]. 为了构造实验所用数据集,在训练阶段,本文分别从LibriSpeech干净语音训练集和干净语音验证集中随机选取13976句语音和871句语音,并采用随机选择的方式,将DEMAND噪声库中的1000种噪声以10种SNR(−7.5 dB,−6.5 dB,−4 dB,−3 dB,−1 dB,1 dB,3 dB,7 dB,−9 dB,11 dB)与这些干净语音混合,以生成含噪语音训练集和含噪语音验证集. 在测试阶段,本文从LibriSpeech干净语音测试集中随机选取740句语音,并以4种SNR(−5 dB,0 dB,5 dB,10 dB)向干净语音添加4种噪声(DEMAND噪声库:Cafter噪声、Kitchen噪声、Meeting噪声、Office噪声),生成含噪语音测试集. 为测试网络的泛化能力,该数据集中,测试集、验证集与训练集中的噪声不同:有水流声、汽车声等.
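    上述"以指定SNR将噪声叠加到干净语音"的混合方式可用如下 NumPy 草图说明(函数与变量名为示意性假设):

```python
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """将噪声缩放后叠加到干净语音上, 使混合后的信噪比等于 snr_db."""
    if len(noise) < len(clean):                          # 噪声不够长时循环拼接
        noise = np.tile(noise, int(np.ceil(len(clean) / len(noise))))
    noise = noise[:len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise
```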

    本文使用业内普遍接受的语音质量客观评估(perceptual evaluation of speech quality,PESQ)[27]和短时客观可懂度(short-time objective intelligibility,STOI)[28],以及主观平均意见分数——信号失真的复合测度 (CSIG)、噪声失真的复合测度 (CBAK)和语音整体质量的复合测度 (COVL),作为实验结果的评价指标[29].
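    原文未说明各指标的具体实现;若采用常见的第三方 Python 包 pesq 与 pystoi,PESQ 与 STOI 的计算调用大致如下(假设性示例,CSIG、CBAK、COVL 需另按文献[29]实现):

```python
from pesq import pesq          # pip install pesq
from pystoi import stoi        # pip install pystoi

def evaluate(clean, enhanced, fs=16000):
    """clean/enhanced: 等长时域波形 (NumPy 数组), fs: 采样率."""
    return {
        "PESQ": pesq(fs, clean, enhanced, "wb"),             # 宽带 PESQ
        "STOI": stoi(clean, enhanced, fs, extended=False),   # 取值约在 [0, 1]
    }
```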

    本文所提出的FTDDN网络的主要参数设置为:使用汉宁(Hanning)窗作为STFT的时间窗,窗长为32 ms(帧长点数为512),帧移为16 ms(即50%重叠),由于实信号傅里叶变换具有共轭对称性,故图3输入STFT幅度谱特征的尺寸为 $257 \times T$($T$ 取决于各条语音的长度).

    在每次训练实验中,本文设每批处理语音的条数 BatchSize=4,在每批处理中,通过补零的方式使各句语音与该Batch中最长语音长度保持一致;对于超出4 s的语音,则只取前4 s参与训练. 实验选用Adam优化器,并以学习率0.0002训练网络100次.
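    上述"批内按最长语音补零、超过4 s截断"的批处理方式可用如下 collate 函数示意(PyTorch,假设性实现):

```python
import torch

MAX_LEN = 4 * 16000            # 4 s, 采样率 16 kHz

def collate_batch(waves):
    """waves: 一批一维时域语音张量; 截断到 4 s 后按批内最长语音补零对齐."""
    waves = [w[:MAX_LEN] for w in waves]
    max_len = max(w.shape[0] for w in waves)
    padded = torch.zeros(len(waves), max_len)
    for i, w in enumerate(waves):
        padded[i, :w.shape[0]] = w
    return padded

# 优化器设置与原文一致: Adam, 学习率 0.0002 (model 为假设的网络实例)
# optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
```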

    实验主要包括2方面:1) 开展消融实验,以探究FDDM,TDDM内部的密集连接结构及卷积块FDCU和TDCU的数量R对本文所提模型的语音增强性能的影响;2) 分别针对3.1节所提的2个数据集,将本文所提模型与现有的语音增强网络做性能对比.

    为探究卷积块FDCU和TDCU的数量R和模块FDDM及TDDM中的密集连接结构对语音增强网络性能的影响,本文基于VCTK语料库[23]进行了消融实验. 为了简洁,消融实验结果仅使用PESQ和STOI作为客观评价指标.

    图4展示了在FDDM和TDDM均存在密集连接结构的情况下,不同的卷积块数量R对网络性能的影响.

    图  4  FDDM和TDDM中不同卷积块数量R对网络性能的影响
    Figure  4.  Influence of different number of convolutional blocks R in FDDM and TDDM on network performance

    图4可看出,随着R的增大,网络性能逐渐提高至最高点后又开始逐渐下降. 具体而言,当R从2增加到4时,PESQ和STOI分别从2.89增加到2.95和从0.9388增加到0.9442;当R从4增加到6时,PESQ和STOI虽然也呈现一定程度的增加,但增速变缓,这是由于随着R的增加,网络深度增加,感受野也随之增加,使得网络学习到的上下文信息更丰富,最终提高了网络性能; 而当R从6继续增大时,可看到PESQ变化趋势平缓、STOI开始下降,这是因为当R继续增加时,网络深度也会加深,这导致信息丢失问题加剧,而密集连接结构的信息补充作用又无法完全解决这一问题,进而导致了网络性能退化.

    表1列出了在R = 6的情况下去除TDDM和FDDM中的密集连接结构后网络的性能变化. 可发现:当分别去除TDDM和FDDM中的密集连接时,PESQ从3.02分别下降到了2.83和2.97,STOI从0.9451分别下降到了0.9409和0.9447,这反映了密集连接结构的有效性,证实了该结构可通过信息补充加强特征传递和特征重用,达到增强网络性能的效果. 从表1的PESQ和STOI的下降比例可看出,相比较而言,消融TDDM比消融FDDM影响更大,这是因为时间方向的上下文信息比频率方向的上下文信息更加丰富,从而间接证明了时间信息融合在提高网络性能方面更重要,但频率信息也不可忽略.

    表  1  密集连接对网络性能的影响
    Table  1.  Influence of Dense Connection on Network Performance

    | 方法 | PESQ | STOI |
    | --- | --- | --- |
    | 无密集连接TDDM | 2.83(↓6.29%) | 0.9409(↓0.48%) |
    | 无密集连接FDDM | 2.97(↓1.66%) | 0.9447(↓0.04%) |
    | FTDDN | 3.02 | 0.9451 |

    注:(↓*)表示该方法的得分相比于FTDDN的得分的下降比例.

    针对LibriSpeech语料库[24],在不同噪声和信噪比的情况下,将本文网络与3种已有网络进行性能对比,这3种网络分别是:基于LSTM的语音增强方法、基于卷积循环网络(convolutional recurrent network, CRN)[30]的语音增强方法、基于时间卷积神经网络(temporal convolutional neural network, TCNN)[31]的语音增强方法.

    表2和表3分别展示了对本文模型和3种对比模型测评得到的PESQ分数和STOI分数,可以看出:除了5 dB Meeting噪声条件下本文模型的STOI分数略低于CRN以外,在其他情况下,本文模型的PESQ分数和STOI分数均要高于对比模型,这表明本文模型的语音增强性能更优越.

    表  2  使用LibriSpeech语料库对FTDDN与基线模型的评测PESQ分数
    Table  2.  Evaluation PESQ Scores of FTDDN and Baseline Models Using LibriSpeech Corpus

    | 噪声 | SNR/dB | Noisy | LSTM | CRN | TCNN | FTDDN |
    | --- | --- | --- | --- | --- | --- | --- |
    | Cafter | −5 | 1.10 | 1.08 | 1.13 | 1.12 | 1.16 |
    | Cafter | 0 | 1.14 | 1.15 | 1.30 | 1.24 | 1.39 |
    | Cafter | 5 | 1.28 | 1.30 | 1.58 | 1.42 | 1.80 |
    | Cafter | 10 | 1.57 | 1.50 | 1.93 | 1.62 | 2.33 |
    | Kitchen | −5 | 1.07 | 1.29 | 1.45 | 1.35 | 2.00 |
    | Kitchen | 0 | 1.15 | 1.44 | 1.77 | 1.52 | 2.48 |
    | Kitchen | 5 | 1.33 | 1.61 | 2.13 | 1.72 | 2.94 |
    | Kitchen | 10 | 1.66 | 1.75 | 2.51 | 1.91 | 3.34 |
    | Meeting | −5 | 1.07 | 1.09 | 1.13 | 1.11 | 1.17 |
    | Meeting | 0 | 1.14 | 1.15 | 1.26 | 1.21 | 1.31 |
    | Meeting | 5 | 1.29 | 1.26 | 1.49 | 1.38 | 1.60 |
    | Meeting | 10 | 1.60 | 1.42 | 1.83 | 1.60 | 2.05 |
    | Office | −5 | 1.31 | 1.50 | 1.81 | 1.55 | 2.18 |
    | Office | 0 | 1.61 | 1.62 | 2.18 | 1.76 | 2.63 |
    | Office | 5 | 1.99 | 1.74 | 2.56 | 1.93 | 3.06 |
    | Office | 10 | 2.84 | 1.85 | 2.89 | 2.06 | 3.49 |

    注:加粗的数字表示每一行中最高的PESQ分数.
    表  3  使用LibriSpeech语料库对FTDDN与基线模型的评测STOI分数
    Table  3.  Evaluation STOI Scores of FTDDN and Baseline Models Using LibriSpeech Corpus

    | 噪声 | SNR/dB | Noisy | LSTM | CRN | TCNN | FTDDN |
    | --- | --- | --- | --- | --- | --- | --- |
    | Cafter | −5 | 0.6033 | 0.5816 | 0.6614 | 0.6355 | 0.6679 |
    | Cafter | 0 | 0.7261 | 0.7228 | 0.8044 | 0.7654 | 0.8052 |
    | Cafter | 5 | 0.8296 | 0.8116 | 0.8884 | 0.8524 | 0.8885 |
    | Cafter | 10 | 0.9014 | 0.8564 | 0.9337 | 0.9022 | 0.9350 |
    | Kitchen | −5 | 0.8569 | 0.8147 | 0.8921 | 0.8523 | 0.9114 |
    | Kitchen | 0 | 0.9110 | 0.8511 | 0.9328 | 0.8964 | 0.9418 |
    | Kitchen | 5 | 0.9498 | 0.8724 | 0.9581 | 0.9236 | 0.9626 |
    | Kitchen | 10 | 0.9741 | 0.8852 | 0.9734 | 0.9391 | 0.9772 |
    | Meeting | −5 | 0.6543 | 0.6374 | 0.6746 | 0.6455 | 0.6890 |
    | Meeting | 0 | 0.7608 | 0.7295 | 0.7909 | 0.7629 | 0.7937 |
    | Meeting | 5 | 0.8480 | 0.7999 | 0.8741 | 0.8482 | 0.8733 |
    | Meeting | 10 | 0.9097 | 0.8460 | 0.9252 | 0.9000 | 0.9254 |
    | Office | −5 | 0.9345 | 0.8644 | 0.9396 | 0.9119 | 0.9496 |
    | Office | 0 | 0.9630 | 0.8802 | 0.9606 | 0.9344 | 0.9678 |
    | Office | 5 | 0.9796 | 0.8889 | 0.9729 | 0.9451 | 0.9801 |
    | Office | 10 | 0.9893 | 0.8944 | 0.9805 | 0.9498 | 0.9884 |

    注:加粗的数字表示每一行中最高的STOI分数.

    观察表2和表3的数据可发现,所有模型在Cafter和Meeting噪声条件下的语音增强性能都低于在Kitchen和Office噪声条件下的语音增强性能,这可归结为不同噪声源的时频谱结构造成的影响. 具体解释如下:如图5所示,Cafter和Meeting噪声中以人声为主,其时频谱结构与干净语音的结构非常相似,故增加了噪声与干净语音的区分难度,导致网络去噪性能下降;而Kitchen和Office噪声结构与干净语音结构相差很大,降低了网络从含噪语音中学习干净语音结构的难度,有助于提升去噪性能.

    图  5  干净语音和不同噪声的时频谱
    Figure  5.  The time-frequency spectrograms of clean speech and different noises

    为直观反映各对比模型与本文模型的语音增强效果,图6展示了这些模型对 5dB Kitchen的含噪语音增强后的结果(其加噪前后的时频谱如图6(a)(b)所示),从中可看出,图6(c)所示的LSTM模型只是轻微地去除了噪声,只能恢复干净语音的大致结构;相比而言,图6(d)所示的CRN模型和图6(e)所示的TCNN模型去噪更显著,但其优势主要体现在低频区,而高频细节较为模糊;而本文提出的FTDDN模型在去除噪声的同时,又最大限度地保留了语音信息,见图6(f). 究其原因,各对比模型仅着重考虑了语音时间方向的上下文信息,而忽略了语音频率方向上下文信息间的联系,而语音的能量大部分聚集在低频部分,这导致模型对语音的高频信息关注度降低,使得增强后的语音高频信息丢失,而本文提出的FTDDN模型给予了语音频率方向和时间方向上下文信息同等关注度,并同时学习了语音时频谱高频信息与低频信息之间的相关性和时间帧之间的依赖关系,最终得以保留完整的语音时频谱信息. 需指出的是,以上实验所使用的测试集中的说话人、噪声种类以及信噪比皆与训练集和验证集中的完全不同. 故表2表3的实验结果证实了本文模型在数据条件完全不匹配的情况下,仍可实现高性能降噪,证实了本文模型具备较高的泛化能力,可适应不同噪声条件下的复杂环境.

    图  6  干净语音、含噪语音以及不同模型增强后的语音时频谱
    Figure  6.  The time-frequency spectrograms of clean speech, noisy speech, and enhanced by different models

    将本文提出的FTDDN模型与现有的SEGAN[32],Wave-U-Net[33],WaveCRN[34],MetricGAN[35],MB-TCN[17],NAAGN[22]模型进行性能比较. 所有模型都使用VCTK语料库进行实验. 从表4列出的对比结果可以看出,本文提出的FTDDN模型在除STOI以外,所有指标都优于其他对比模型,这是由于SEGAN,Wave-U-Net,WaveCRN这3个网络的输入为时域波形,而本文的FTDDN则以时频幅度谱作为网络的输入,但时频域的信息往往比时域更加丰富、细致,从而使得网络可学习到更丰富的信息,这有利于网络性能的提升;MetricGAN,MB-TCN,NAAGN这3个网络虽然与本文的FTDDN一样,都以时频幅度谱作为网络的输入,但MetricGAN的设计是直接基于评价指标来优化网络,未专注于学习语音信号的细节信息,从而使网络性能受到限制;MB-TCN更多关注于学习语音信号的时间方向的上下文信息,却忽略了频率方向的上下文信息的重要性;NAAGN通过扩张卷积同时学习时间和频率方向的上下文信息,但并没有进行单独学习;而本文通过融合密集连接结构和扩张卷积将学习频率和时间方向的上下文信息分开进行,并在网络末端进行信息整合,故使网络学习到的语音信息更加细致,进而提升网络性能. 特别地,可以发现NAAGN模型的STOI分数略高于本文所提模型,这是由于NAAGN模型相对于本文模型额外引入了注意力门(attention gate, AG)模块,因此可进一步学习到输入样本中的更感兴趣的特征,并对其进行修剪,以保留相关的激活,从而可获得略高的STOI分数.

    表  4  使用VCTK语料库对FTDDN与基线模型的性能评测分数
    Table  4.  Performance Evaluation Scores of FTDDN and Baseline Models Using VCTK Corpus

    | 模型 | PESQ | STOI | CSIG | CBAK | COVL |
    | --- | --- | --- | --- | --- | --- |
    | Noisy | 1.97 | 0.9210 | 3.34 | 2.44 | 2.63 |
    | SEGAN | 2.16 |  | 3.48 | 2.94 | 2.80 |
    | Wave-U-Net | 2.40 |  | 3.52 | 3.24 | 2.96 |
    | WaveCRN | 2.64 |  | 3.94 | 3.37 | 3.29 |
    | MetricGAN | 2.86 |  | 3.99 | 3.18 | 3.42 |
    | NAAGN | 2.90 | 0.9480 | 4.13 | 3.50 | 3.51 |
    | MB-TCN | 2.94 | 0.9364 | 4.21 | 3.41 | 3.59 |
    | FTDDN | 3.02 | 0.9451 | 4.25 | 3.49 | 3.63 |

    注:加粗的数字表示每一列中的最高分数.

    为高质量地恢复语音信号,本文设计了频率-时间扩张密集网络(frequency-time dilated dense network, FTDDN),其包括2个最主要的模块:FDDM和TDDM,由于这2个模块均融入了扩张卷积和密集连接结构,因而FTDDN可获得较大的感受野以捕获频率方向和时间方向的上下文信息. 基于LibriSpeech和VCTK语料库与各类现有语音增强网络性能的对比实验表明:本文提出的FTDDN网络的语音增强性能更加优越,可在有效抑制噪声的同时高质量地恢复语音,故在语音识别、文本语音转换、助听器设计、网上会议等应用中有广阔应用前景.

    作者贡献声明:黄翔东完善实验方案并修改论文;陈红红提出算法思路,并负责完成实验和撰写论文;甘霖提出指导意见.

  • 图  1   联邦学习系统中心化架构

    Figure  1.   Centralized architecture of federated learning system

    图  2   联邦学习系统去中心化架构

    Figure  2.   Decentralized architecture of federated learning system

    图  3   横向联邦学习

    Figure  3.   Horizontal federated learning

    图  4   纵向联邦学习

    Figure  4.   Vertical federated learning

    图  5   联邦迁移学习

    Figure  5.   Federated transfer learning

    图  6   链抽象模型

    Figure  6.   Chain abstract model

    图  7   张量发送示意图

    Figure  7.   Tensor sending schematic diagram

    图  8   FATE系统架构

    Figure  8.   FATE system architecture

    图  9   离线训练框架架构

    Figure  9.   Off-line training framework architecture

    图  10   FATE服务部署架构

    Figure  10.   FATE-serving deployment architecture

    图  11   TFF系统架构

    Figure  11.   TFF framework architecture

    图  12   TFF客户端架构

    Figure  12.   TFF client architecture

    图  13   TFF自顶向下结构

    Figure  13.   TFF top-down structure

    图  14   TFF API 架构

    Figure  14.   TFF API architecture

    图  15   PaddleFL的架构

    Figure  15.   PaddleFL architecture

    图  16   Data parallel训练框架

    Figure  16.   Data parallel training framework

    图  17   FedML系统架构

    Figure  17.   FedML system architecture

    图  18   MPI通信示意

    Figure  18.   MPI communication diagram

    图  19   FedML服务器与设备之间的工作流程

    Figure  19.   Workflow between FedML servers and devices

    图  20   Flower系统架构

    Figure  20.   Flower system architecture

    图  21   Flower 服务器端架构

    Figure  21.   Flower server architecture

    图  22   连接管理示意图

    Figure  22.   Connection management diagram

    图  23   gRPC网桥状态转换图

    Figure  23.   gRPC bridge state transition diagram

    图  24   Fedlearner 架构

    Figure  24.   Fedlearner architecture

    图  25   FedNLP 架构

    Figure  25.   FedNLP architecture

    图  26   FederatedScope架构

    Figure  26.   FederatedScope architecture

    图  27   FS-G架构

    Figure  27.   FS-G architecture

    图  28   MNIST数据集下模型收敛曲线

    Figure  28.   Model convergence curves under MNIST dataset

    图  29   CIFAR10数据集下模型收敛曲线

    Figure  29.   Model convergence curves under CIFAR10 dataset

    图  30   Breast Cancer横向联邦学习

    Figure  30.   Breast Cancer horizontal federated learning

    图  31   Breast Cancer纵向联邦学习

    Figure  31.   Breast Cancer vertical federated learning

    图  32   不同框架模型训练时间对比

    Figure  32.   Comparison of training time of different frame models

    表  1   传统分布式学习与联邦学习的区别

    Table  1   Difference Between Traditional Distributed Learning and Federated Learning

    | 对比项目 | 具体项目 | 传统分布式学习 | 联邦学习 |
    | --- | --- | --- | --- |
    | 数据 | 数据分布 | 独立同分布 | 非独立同分布 |
    | 数据 | 数据量级 | 相同 | 不同 |
    | 数据 | 数据安全 | 数据控制权在服务器手中,具有较高的隐私泄露风险 | 数据控制权在参与设备手中,可以最大限度地保障数据隐私和安全 |
    | 通信 | 网络稳定性 | 较强 | 较弱 |
    | 通信 | 通信代价 | 较小 | 较高 |
    | 系统构成 |  | 数据分布、模型训练以及模型更新都由服务器进行统一控制,不需要考虑传输时延等因素 | 模型训练和模型更新分离,需要考虑设备间异构性所带来的影响以及传输时延等众多因素 |

    表  2   FedML具体模型和数据集的组合

    Table  2   Combination of FedML Specific Models and Datasets

    | 学习类型 | 联邦学习 (FedAvg, FedOpt, FedNova 等) 或去中心化联邦学习 | FedNAS | 纵向联邦学习 | 分割学习 |
    | --- | --- | --- | --- | --- |
    | Cross-device | CV: Federated EMNIST + CNN, CIFAR100 + ResNet18 (Group Normalization); NLP: shakespeare/stackoverflow (NWP) + RNN (bi-LSTM) |  | lending_club_loan + VFL, NUS_WIDE + VFL |  |
    | Cross-silo | CV: CIFAR10, CIFAR100, CINIC10 + ResNet/MobileNet | CV: CIFAR10, CIFAR100, CINIC10 + ResNet/MobileNet | CV: CIFAR10, CIFAR100, CINIC10 + ResNet/MobileNet | Linear: MNIST/Synthetic + Logistic Regression |

    表  3   框架对比分析

    Table  3   Framework Comparison Analysis

    | 对比项目 | 具体项目 | FATE | PySyft | TFF | PaddleFL | FedML | Flower |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | 编程语言 |  | Python | Python | Python | Python/C++ | Python | Python/Java/C++ |
    | 隐私机制 | 同态加密 | DHKE、Paillier、RSA | CKKS | 不支持 | 不支持 | RSA | 不支持 |
    | 隐私机制 | 多方安全计算 | SPDZ、OT、Feldman VSS | SPDZ | 不支持 | ABY3、PrivC | Secret sharing | 不支持 |
    | 隐私机制 | 差分隐私 | 不支持 | DP-SGD、PATE | DP-SGD | DP-SGD | DP-SGD | DP-SGD |
    | 机器学习算法 |  | 逻辑回归、集成学习、深度学习、迁移学习等 | 逻辑回归、深度学习等 | 逻辑回归、深度学习等 | 逻辑回归、深度学习等 | 逻辑回归、深度学习等 | 逻辑回归、深度学习等 |
    | 拓扑自定义 |  | 不支持 | 支持 | 不支持 | 不支持 | 支持 | 不支持 |
    | 计算范式 |  | 单机模拟、基于拓扑结构的分布式训练 | 单机模拟、基于拓扑结构的分布式训练、移动设备端训练 | 单机模拟、基于拓扑结构的分布式训练 | 单机模拟、基于拓扑结构的分布式训练 | 单机模拟、基于拓扑结构的分布式训练、移动设备端训练 | 单机模拟、基于拓扑结构的分布式训练、移动设备端训练 |
    | 训练架构 | 中心化 | 支持 | 支持 | 支持 | 支持 | 支持 | 支持 |
    | 训练架构 | 去中心化 | 不支持 | 不支持 | 不支持 | 不支持 | 支持 | 不支持 |
    | 联邦学习类型 |  | 横向联邦、纵向联邦、联邦迁移 | 横向联邦 | 横向联邦 | 横向联邦、纵向联邦 | 横向联邦、纵向联邦 | 横向联邦 |
    | 通信后端 |  | gRPC | 自定义 | gRPC | gRPC | gRPC、MQTT、MPI、自定义 | gRPC、自定义 |
    | 可视化 |  | 支持 | 不支持 | 不支持 | 不支持 | 支持 | 不支持 |
    | 受众定位 |  | 工业/学术 | 学术 | 学术 | 工业/学术 | 学术 | 学术 |
    | 跨系统 |  | Linux/Mac | Windows/Linux/Mac | Windows/Linux/Mac | Windows/Linux/Mac | Linux/Mac/Android/iOS | Windows/Linux/Mac/Android/iOS |
    | 网络拥塞模拟 |  | 不支持 | 不支持 | 不支持 | 不支持 | 不支持 | 支持 |
    | 聚合算法 |  | Fed-SMPC、FedAVG等 | Fed-MPC、Fed-DP、Fed-HE等 | FedAvg、Fed-SGD等 | FedAvg、DPSGD、SECAGG、PFM、LR With Privc、NN With MPC等 | FedAvg、FedOpt、FedGKT、FedNAS、FedNova等 | FedAvg、FedProx、QFedAvg、FedOpt等 |
    | 场景 | 异步聚合 | 支持 | 不支持 | 不支持 | 不支持 | 不支持 | 不支持 |
    | 场景 | 用户掉线 | 不支持 | 不支持 | 不支持 | 不支持 | 不支持 | 不支持 |
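    表3中"聚合算法"一行所列的FedAvg是各框架普遍支持的基础聚合算法,其核心是按各参与方本地样本量对模型参数做加权平均. 下面给出一个与具体框架无关的最小化 NumPy 示意(假设性实现,仅用于说明聚合逻辑):

```python
import numpy as np

def fedavg(client_weights, client_num_samples):
    """client_weights: 各客户端模型参数列表(每个是同构的 {参数名: ndarray} 字典);
    client_num_samples: 各客户端本地样本数. 返回按样本量加权平均后的全局参数."""
    total = float(sum(client_num_samples))
    global_weights = {}
    for name in client_weights[0]:
        global_weights[name] = sum(
            (n / total) * w[name] for w, n in zip(client_weights, client_num_samples))
    return global_weights
```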
  • [1]

    Kang Yiping, Hauswald J, Gao Cao, et al. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge[J]. ACM SIGARCH Computer Architecture News, 2017, 45(1): 615−629 doi: 10.1145/3093337.3037698

    [2]

    McMahan B, Ramage D. Federated learning: Collaborative machine learning without centralized training data [CP/OL]. [2021-12-26].https://ai.googleblog.com/2017/04/federated-learning collaborative.html

    [3]

    McMahan H B, Moore E, Ramage D, et al. Federated learning of deep networks using model averaging [J]. arXiv preprint, arXiv: 1602.05629, 2016

    [4]

    McMahan B, Moore E, Ramage D, et al. Communication-efficient learning of deep networks from decentralized data [G] //Artificial Intelligence and Statistics. New York: PMLR, 2017: 1273−1282

    [5]

    Liu Yang, Fan Tao, Chen Tianjian, et al. FATE: An industrial grade platform for collaborative learning with data protection[J]. Journal of Machine Learning Research, 2021, 22(226): 1−6

    [6]

    Bonawitz K, Eichner H, Grieskamp W, et al. Towards federated learning at scale: System design[J]. Proceedings of Machine Learning and Systems, 2019, 1: 374−388

    [7]

    Ingerman A, Ostrowski K. TensorFlow Federated [CP/OL]. (2019-03-06)[2021-12-28].https://blog.tensorflow.org/2019/03/introducing-tensorflow-federated.html

    [8]

    Ryffel T, Trask A, Dahl M, et al. A generic framework for privacy preserving deep learning [J]. arXiv preprint, arXiv: 1811.04017, 2018

    [9]

    He Chaoyang, Li Songze, So Jinhyun, et al. FedML: A research library and benchmark for federated machine learning [J]. arXiv preprint, arXiv: 2007.13518, 2020

    [10]

    Beutel J, Topal T, Mathur A, et al. Flower: A friendly federated learning research framework [J]. arXiv preprint, arXiv: 2007.14390, 2020

    [11]

    Ma Yanjun, Yu Dianhai, Wu Tian, et al. PaddlePaddle: An open-source deep learning platform from industrial practice[J]. Frontiers of Data and Computing, 2019, 1(1): 105−115

    [12]

    He Chaoyang, Tan Conghui, Tang Hanlin, et al. Central server free federated learning over single-sided trust social networks [J]. arXiv preprint, arXiv: 1910.04956, 2019

    [13]

    Yang Qiang, Liu Yang, Chen Tianjian, et al. Federated machine learning: Concept and applications[J]. ACM Transactions on Intelligent Systems and Technology, 2019, 10(2): 1−19

    [14]

    Vaidya J, Clifton C. Privacy preserving association rule mining in vertically partitioned data [C] //Proc of the 8th ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining. New York: ACM, 2002: 639–644

    [15]

    Pan S J, Yang Qiang. A survey on transfer learning[J]. IEEE Transactions on knowledge and data engineering, 2010, 22(10): 1345−1359 doi: 10.1109/TKDE.2009.191

    [16]

    Leroy D, Coucke A, Lavril T, et al. Federated learning for keyword spotting [C] //Proc of the 2019 IEEE Int Conf on Acoustics, Speech and Signal Processing. Piscataway, NJ: IEEE, 2019: 6341–6345

    [17]

    Hard A, Rao K, Mathews R, et al. Federated learning for mobile keyboard prediction [J]. arXiv preprint, arXiv: 1811.03604, 2018

    [18]

    Li Qinbin, Wen Zeyi, He Bingsheng. Practical federated gradient boosting decision trees [J]. arXiv preprint, arXiv: 1911.04206, 2019

    [19]

    Yang Kai, Fan Tao, Chen Tianjian, et al. A quasinewton method based vertical federated learning framework for logistic regression [J]. arXiv preprint, arXiv: 1912.00513, 2019

    [20]

    Yang Shengwen, Ren Bing, Zhou Xuhui, et al. Parallel distributed logistic regression for vertical federated learning without third-party coordinator [J]. arXiv preprint, arXiv: 1911.09824, 2019

    [21]

    Zhu Xinghua, Wang Jianzong, Hong Zhenhou, et al. Federated learning of unsegmented Chinese text recognition model [C] //Proc of the 31st IEEE Int Conf on Tools with Artificial Intelligence. Piscataway, NJ: IEEE, 2019: 1341−1345

    [22]

    Caldas S, Duddu S M K, Wu P, et al. LEAF: A benchmark for federated settings [J]. arXiv preprint, arXiv: 1812.01097, 2018

    [23]

    Mohassel P, Rindal P. ABY3: A mixed protocol framework for machine learning [C] //Proc of the 2018 ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2018: 35−52

    [24]

    Smith V, Chiang C K, Sanjabi M, et al. Federated multi-task learning [C/OL] //Proc of the 31st Int Conf on Neural Information Processing Systems. 2017: 4427–4437. [2021-12-29]. https://proceedings.neurips.cc/paper/2017/hash/6211080fa89981f66b1a0c9d55c61d0f-Abstract.html

    [25]

    Abadi M, Chu A, Goodfellow I, et al. Deep learning with differential privacy [C] //Proc of the 2016 ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2016: 308–318

    [26]

    Wainakh A, Guinea A S, Grube T. Enhancing privacy via hierarchical federated learning [J]. arXiv preprint, arXiv: 2004.11361, 2020

    [27]

    Liao Feng, Zhuo H H, Huang Xiaoling, et al. Federated hierarchical hybrid networks for clickbait detection [J]. arXiv preprint, arXiv: 1906.00638, 2019

    [28]

    Hardy S, Henecka W, Ivey-Law H, et al. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption [J]. arXiv preprint, arXiv: 1711.10677, 2017

    [29]

    Gupta O, Raskar R. Distributed learning of deep neural network over multiple agents[J]. Journal of Network and Computer Applications, 2018, 116: 1−8 doi: 10.1016/j.jnca.2018.05.003

    [30]

    Sun Z, Kairouz P, Suresh A T, et al. Can you really backdoor federated learning? [J]. arXiv preprint, arXiv: 1911.07963, 2019

    [31]

    Pillutla K, Kakade S M, Harchaoui Z. Robust aggregation for federated learning [J]. arXiv preprint, arXiv: 1912.13445, 2019

    [32]

    Blanchard P, Mhamdi E, Guerraoui R, et al. Machine learning with adversaries: Byzantine tolerant gradient descent [C] //Proc of the 2017 Advances in Neural Information Processing Systems. 2017: 119–129.[2021-12-28]. https://proceedings.neurips.cc/paper/2017/hash/f4b9ec30ad9f68f89b29639786cb62ef-Abstract.html

    [33]

    Bagdasaryan E, Veit A, Hua Yiqing, et al. How to backdoor federated learning [C] //Proc of the 23rd Int Conf on Artificial Intelligence and Statistics. 2018: 2938–2948.[2021-12-29]. https://proceedings.mlr.press/v108/bagdasaryan20a.html

    [34]

    Wang Hongyi, Sreenivasan K, Rajput S, et al. Attack of the tails: Yes, you really can backdoor federated learning[J]. arXiv preprint, arXiv: 2007.05084, 2020

    [35]

    He Chaoyang, Annavaram M, Avestimehr S. FedNAS: Federated deep learning via neural architecture search [J]. arXiv preprint, arXiv: 2004.08546, 2020

    [36]

    LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278−2324 doi: 10.1109/5.726791

    [37]

    Reddi S, Charles Z, Zaheer M, et al. Adaptive federated optimization [J]. arXiv preprint, arXiv: 2003.00295, 2020

    [38]

    Li Tian, Sahu A K, Zaheer M, et al. Federated optimization in heterogeneous networks[J]. arXiv preprint, arXiv: 1812.06127, 2018

    [39]

    Krizhevsky A, Hinton G. Learning multiple layers of features from tiny images [J/OL]. 2009[2022-01-08]. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.222.9220

    [40]

    Liu Yang, Ma Zhuo, Liu Ximeng, et al. Boosting privately: Privacy-preserving federated extreme boosting for mobile crowdsensing [J]. arXiv preprint, arXiv: 1907.10218, 2019

    [41]

    Agarwal N, Kairouz P, Liu Ziyu. The skellam mechanism for differentially private federated learning [C] //Proc of the 2021 Advances in Neural Information Processing Systems. 2021: 5052−5264. [2022-01-08]. https://proceedings.neurips.cc/paper/2021/hash/285baacbdf8fda1de94b19282acd23e2-Abstract.html

    [42]

    Bytedance. A multi-party collaborative machine learning framework [EB/OL]. (2020-10-26)[2022-01-08]. https://github.com/bytedance/fedlearner

    [43]

    Lin Yuchen, He Chaoyang, Zeng Zihang, et al. FedNLP: A research platform for federated learning in natural language processing [J]. arXiv preprint, arXiv: 2104.08815, 2021

    [44]

    Xie Yuexiang, Wang Zhen, Chen Daoyuan, et al. FederatedScope: A flexible federated learning platform for heterogeneity[J]. arXiv preprint, arXiv: 2204.05011, 2022

    [45]

    Wang Zhen, Kuang Weirui, Xie Yuexiang, et al. FederatedScope-GNN: Towards a unified, comprehensive and efficient package for federated graph learning[J]. arXiv preprint, arXiv: 2204.05562, 2022

    [46]

    Powell K. NVIDIA Clara Federated Learning to Deliver AI to Hospitals While Protecting Patient Data [CP/OL]. (2019-12-01)[2021-01-08]. https://blogs.nvidia.com/blog/2019/12/01/clara-federated-learning/

    [47]

    Yosinski J, Clune J, Nguyen A, et al. Understanding neural networks through deep visualization [J]. arXiv preprint, arXiv: 1506.06579, 2015

    [48]

    Du Mengnan, Liu Ninghao, Hu Xia. Techniques for interpretable machine learning [J]. arXiv preprint, arXiv: 1808.00033, 2018

    [49]

    Street W N, Wolberg W H, Mangasarian O L. Nuclear feature extraction for breast tumor diagnosis [G] //SPIE 1905: Proc of the 1993 Biomedical Image Processing and Biomedical Visualization. Bellingham: SPIE, 1993: 861−870

    [50]

    Bhagoji A N, Chakraborty S, Mittal P, et al. Analyzing federated learning through an adversarial lens [C/OL] //Proc of the 36th Int Conf on Machine Learning. 2019: 634−643.[2022-01-08]. https://proceedings.mlr.press/v97/bhagoji19a.html

    [51]

    Fang Minghong, Cao Xiaoyu, Jia Jinyuan, et al. Local model poisoning attacks to Byzantine-robust federated learning [C] //Proc of the 29th USENIX Security Symp. Berkeley, CA: USENIX Association, 2020: 1605−1622

    [52]

    Hitaj B, Ateniese G, Perez-cruz F. Deep models under the GAN: Information leakage from collaborative deep learning [C] //Proc of the 2017 ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2017: 603−618

    [53]

    Zhang Jiale, Chen Junjun, Wu Di, et al. Poisoning attack in federated learning using generative adversarial nets [C] //Proc of the 18th IEEE Int Conf on Trust, Security and Privacy in Computing and Communications and 13th IEEE Int Conf on Big Data Science and Engineering. Piscataway, NJ: IEEE, 2019: 374–380

    [54]

    Melis L, Song Congzheng, De Cristofaro E, et al. Exploiting unintended feature leakage in collaborative learning [C] //Proc of the 2019 IEEE Symp on Security and Privacy. Piscataway, NJ: IEEE, 2019: 691–706

    [55]

    Phong L T, Aono Y, Hayashi T, et al. Privacy-preserving deep learning via additively homomorphic encryption[J]. IEEE Transactions on Information Forensics and Security, 2017, 13(5): 1333−1345

    [56]

    Zhu Ligeng, Liu Zhijian, Han Song. Deep leakage from gradients[C/OL] //Proc of the 2019 Advances in Neural Information Processing Systems. 2019 [2022-01-08]. https://proceedings.neurips.cc/paper/2019/hash/60a6c4002cc7b29142def8871531281a-Abstract.html

    [57]

    Wei Kang, Li Jun, Ding Ming, et al. Federated learning with differential privacy: Algorithms and performance analysis[J]. IEEE Transactions on Information Forensics and Security, 2020, 15: 3454−3469 doi: 10.1109/TIFS.2020.2988575

    [58]

    Bhowmick A, Duchi J, Freudiger J, et al. Protection against reconstruction and its applications in private federated learning [J]. arXiv preprint, arXiv: 1812.00984, 2018

    [59]

    Xu Guowen, Li Hongwei, Liu Sen, et al. VerifyNet: Secure and verifiable federated learning[J]. IEEE Transactions on Information Forensics and Security, 2019, 15: 911−926

    [60]

    Yuan Jiawei, Yu Shucheng. Privacy preserving back-propagation neural network learning made practical with cloud computing[J]. IEEE Transactions on Parallel and Distributed Systems, 2013, 25(1): 212−221

    [61]

    Wan Li, Ng W K, Han Shuguo, et al. Privacy-preservation for gradient descent methods [C] //Proc of the 13th ACM SIGKDD Int Conf Knowledge Discovery and Data Mining. New York: ACM, 2007: 775−783

    [62]

    Bonawitz K, Ivanov V, Kreuter B, et al. Practical secure aggregation for privacy-preserving machine learning [C] //Proc of the 2017 ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2017: 1175−1191

    [63]

    Hamer J, Mohri M, Suresh A T, FedBoost: A communication-efficient algorithm for federated learning [C/OL] //Proc of the 37th Int Conf on Machine Learning. 2020: 3973−3983.[2022-01-08]. https://proceedings.mlr.press/v119/hamer20a.html

    [64]

    Gogineni V C, Werner S, Huang Y F, et al. A communication-efficient online federated learning framework for nonlinear regression [C/OL] //Proc of the 2022 IEEE Int Conf on Acoustics, Speech and Signal Processing. 2022: 5228−5232.[2022-01-08]. https://ieeexplore.ieee.org/abstract/document/9746228

    [65]

    Song Jincheng, Wang Weizheng, Gadekallu T R, et al. EPPDA: An efficient privacy-preserving data aggregation federated learning scheme [J/OL]. IEEE Transactions on Network Science and Engineering, 2022[2022-01-08]. https://ieeexplore.ieee.org/abstract/document/9721557

    [66]

    Chen Hao, Huang Shaocheng, Zhang Deyou, et al. Federated learning over wireless IoT networks with optimized communication and resources[J]. IEEE Internet of Things Journal, 2022, 9(17): 16592−16605

    [67]

    Yu Han, Liu Zelei, Liu Yang, et al. A sustainable incentive scheme for federated learning[J]. IEEE Intelligent Systems, 2020, 35(4): 58−69 doi: 10.1109/MIS.2020.2987774

    [68]

    Zhan Yufeng, Zhang Jiang, Li Peng. Crowdtraining: Architecture and incentive mechanism for deep learning training in the Internet of things[J]. IEEE Network, 2019, 33(5): 89−95 doi: 10.1109/MNET.001.1800498

    [69]

    Kim H, Park J, Bennis M, et al. Blockchained on-device federated learning[J]. IEEE Communications Letters, 2020, 24(6): 1279−1283 doi: 10.1109/LCOMM.2019.2921755

    [70]

    Lu Yunlong, Huang Xiaohong, Dai Yueyue, et al. Blockchain and federated learning for privacy-preserved data sharing in industrial IoT[J]. IEEE Transactions on Industrial Informatics, 2020, 16(6): 4177−4186 doi: 10.1109/TII.2019.2942190

    [71]

    Xie Cong, Koyejo S, Gupta I. Asynchronous federated optimization[J]. arXiv preprint, arXiv: 1903.03934, 2019

    [72]

    Wu Qiong, He Kaiwen, Chen Xu. Personalized federated learning for intelligent IoT applications: A cloud-edge based framework[J]. IEEE Open Journal of the Computer Society, 2020, 1: 35−44 doi: 10.1109/OJCS.2020.2993259

    [73]

    Tan A Z, Yu Han, Cui Lizhen, et al. Towards personalized federated learning [J/OL]. IEEE Transactions on Neural Networks and Learning Systems. 2022 [2022-01-18]. https://ieeexplore.ieee.org/abstract/document/9743558

    [74]

    Pei Jiaming, Zhong Kaiyang, Jan M A, et al. Personalized federated learning framework for network traffic anomaly detection [J/OL]. Computer Networks, 2022[2022-05-24]. https://www.sciencedirect.com/science/article/abs/pii/S1389128622001001


出版历程
  • 收稿日期:  2022-02-14
  • 修回日期:  2022-08-07
  • 网络出版日期:  2023-02-26
  • 刊出日期:  2023-06-30
