Survey of Collaborative Inference for Edge Intelligence
-
摘要:
近年来,信息技术的不断变革伴随数据量的急剧爆发,使主流的云计算解决方案面临实时性差、带宽受限、高能耗、维护费用高、隐私安全等问题. 边缘智能的出现与快速发展有效缓解了此类问题,它将用户需求处理下沉到边缘,避免了海量数据在网络中的流动,得到越来越多的关注. 由于边缘计算中资源性能普遍较低,通过多个资源协同完成推理正成为研究热点. 通过对边缘智能发展趋势的分析,得出边缘协同推理目前仍处于增长期,还未进入稳定发展期. 因此,在对边缘协同推理进行充分调研的基础上,将边缘协同推理划分为智能化方法与协同推理架构2个部分,分别对其中涉及到的关键技术进行纵向归纳整理,并从动态场景角度出发,对每种关键技术进行深入分析,对不同关键技术进行横向比较以及适用场景分析. 最后对动态场景下的边缘协同推理给出值得研究的若干发展方向.
Abstract: At present, the continuous evolution of information technology, together with the dramatic explosion of data volume, confronts mainstream cloud computing solutions with problems such as high latency, limited bandwidth, high carbon footprint, high maintenance cost, and privacy concerns. In recent years, the emergence and rapid development of edge computing has effectively alleviated such dilemmas by sinking the processing of user demands to the edge and avoiding the flow of massive data through the network. As a typical scenario of edge computing, edge intelligence is gaining increasing attention, and one of its most important stages is the inference phase. Because the resources in edge computing generally offer low performance, collaborative inference across resources is becoming a hot topic. By analyzing the development trends of edge intelligence, we conclude that edge collaborative inference is still in its growth phase and has not yet entered a stable phase. Based on a thorough investigation of edge collaborative inference, we divide it into two parts: intelligent methods and collaborative inference architectures. The key technologies involved are summarized and organized vertically, and each key technology is analyzed in depth from the perspective of dynamic scenarios; the different key technologies are then compared horizontally, together with an analysis of their applicable scenarios. Finally, we propose several directions that deserve further study for collaborative edge inference in dynamic scenarios.
-
聚类旨在发现数据的内部聚类结构,将相似的样本划分为一类,现已有许多高效的聚类算法被提出,如原型聚类、密度聚类、层次聚类等. 聚类作为机器学习的核心技术之一,已在信息检索[1]、医学诊断[2]、社交网络分析[3]、图像分割[4]等多个领域中发挥着重要作用.
聚类属于无监督学习,需要对大量无标签样本进行划分,其中外观相似、语义不同的样本尤其难以区分,是一项重要挑战. 经典聚类算法,如谱聚类[5]、k-means[6]等,由于特征提取能力的缺陷,所学特征区分性差,已难以胜任现有复杂场景下的任务.
深度聚类方法依靠深度学习[7]强大的特征提取能力,并采用同步式训练策略来解决传统聚类算法的不足. DEC(deep embedding clustering)[8]采用1步策略优化特征学习和聚类分析来解决错误传播. DFCN(deep fusion clustering)[9]将自动编码器和图自动编码器学习的特征信息分别在属性和结构上融合来弥补单个的不足. 但是,文献[8-9]方法学习的特征区分性不强,难以取得优异的聚类性能.
近些年,改进的对比学习技术[10]在下游任务上取得优异表现,不断缩近无监督与有监督间的差距,并显现出超越有监督学习的潜力. 大部分对比学习方法采用单一、简单的数据增强技术,如MOCO(momentum contrast)[11],SimCLR(simple framework for contrastive learning of visual representation)[12]等,本文称之为弱数据增强,将同一样本的不同数据增强视图视作正样本对、其他为负样本对,利用对比损失将正样本对拉近、负样本对远离,有效提高特征的区分度. 当对比学习和聚类任务结合时,以正负样本为纽带,特征模块和聚类模块呈现较强的相互促进、相互影响的关系. 然而,从特征对比和聚类分析的角度,现有的对比聚类分析仍存在一些不足:1)单一的弱数据增强在保留原始样本大部分特征信息的同时,增强后的视图继承了语义信息和非语义信息相融交织的特性且具有单一的视图模式(本文将聚类主体相关信息视作语义信息,其他无关信息视为非语义信息,如背景信息、噪声信息等),这种特性和视图模式阻碍模型对语义信息的学习能力;2)同一样本的不同数据增强视图组成正样本对,正样本对本质上具有等价的语义概念,降低特征对比的多样性和语义性,同时,这样的正样本对划分到同一类簇也让聚类分析缺乏语义性.
为提升特征对比、聚类分析的多样性和语义性,从而改善模型的语义感知能力,本文提出了联合数据增强的语义对比聚类方法(semantic contrastive clustering with federated data augmentation,SCCFA),在采用弱数据增强方法的基础上,再引入一类差异化的数据增强方法,称为强数据增强(更大的旋转角度、更综合的属性抖动和CutOut),其增强后的样本称为强视图,弱数据增强生成的样本称为弱视图. 由于强数据增强对图片结构破坏性大,强视图丢失原始样本的许多特征信息,形成与弱视图不同的视图模式,利用2种视图模式的差异降低非语义信息的干扰,提升模型的语义信息学习能力. 此外,基于相邻样本属于同一类别的假设,借助k近邻(k-nearest neighbor, KNN)算法将同一类别的不同样本组成正样本对、其他为负样本对. 如图1所示,基于KNN图提供的类别信息确定锚点样本的邻居样本,锚点样本与邻居样本构建正样本对,其余样本组成负样本对,正样本对靠近用“吸引”表示,负样本对远离用“排斥”表示. 经过强和弱2种数据增强后,锚点样本和邻居样本分别拥有强和弱2种形态,并将两两形态用于对比学习而改善模型所学特征的一致性和语义性. 锚点样本强视图与邻居样本弱视图的对比(“强 vs 弱”)等价于锚点样本弱视图与邻居样本强视图的对比(“弱 vs 强”,其他视图间对比类似),两者是一类视图间的对比,本文将前者以参数形式融合到后者.
在公开的6个挑战性数据集上的大量实验结果表明了SCCFA的有效性和优越性.
综上,本文的主要贡献包括5个方面:
1)提出了一种语义感知的对比聚类框架,利用联合数据增强技术和KNN算法,赋予特征对比和聚类分析语义性,改善模型的语义感知能力.
2)采用差异化的强和弱2类数据增强方法,在改善特征多样性和泛化性的同时,利用视图模式差异降低非语义信息的干扰,提高模型对语义信息的学习能力.
3)利用全局特征构造全局KNN图,将相同语义的不同样本组成正样本对,提升特征对比和聚类分析的多样性和语义性.
4)在多个挑战性数据集上进行了大量实验,实验结果表明SCCFA在其中大部分数据集上都取得明显的性能提升.
5)大量消融实验表明联合数据增强技术能够缓解非语义信息的干扰,改善模型对语义信息的学习能力;也表明类别信息能有效改善特征对比和聚类分析的多样性和语义性.
1. 相关工作
1.1 深度聚类
传统的聚类算法能成熟地应用于简单的低维数据集,取得优异的聚类性能. 但这些算法由于特征提取能力缺陷,无法处理复杂的高维数据集. 深度聚类依靠深度学习强大的特征提取能力,在复杂数据集上不断拉大与传统聚类算法间的性能差距.DEC[8]通过特征嵌入空间的解码和编码进行表征学习. DCCM(deep comprehensive correlation mining)[13]利用多类特征间的相互关系来学习特征信息. 从训练策略划分,这些方法属于交替式训练模型,将特征学习和聚类分析分2步优化,交替式训练模型由于存在错误传播,只能学到次优解. 另一种训练策略称为同步式训练,将特征学习和聚类分析进行1步优化,由于解决了错误传播问题,聚类效果更好. 比如,IIC(invariant information clustering)[14]在概率分布矩阵上提取类别的互信息. PICA(partition confidence maximization)[15]提出划分不确定性指标(partition uncertainty index, PUI)来衡量解的全局语义性,以便找到语义最合理的一个类簇划分. 尽管文献[14-15]方法取得了不错的聚类效果,却忽略了特征学习与聚类分析间的关系. 本文利用正负样本对为桥梁,强化特征学习和聚类分析间相互促进、相互影响的关系,让学习的特征同时具备区分性和聚类友好性.
1.2 对比学习
对比学习作为一种自监督学习技术,近年来受到广泛关注,许多改进模型在下游任务上取得优异性能,表现出巨大的潜能.MOCO(momentum contrast)[11]将对比学习视作字典查询,以队列存储负样本,以动量编码器保持迭代前后负样本的稳定性. MOCO v2(improved baselines with momentum contrastive learning)[16]在MOCO的结构上添加一个多层感知机(multilayer perceptron,MLP),并将高斯模糊加入到原有数据增强方法中.SimCLR[12]简化网络结构,以big-batch形式增加负样本数量. 这些方法虽然能学习到区分性的特征,但如何与下游任务结合并取得理想性能仍充斥着巨大的挑战. 此外这些方法也存在2点不足:1)数据增强策略单一、简单,导致语义信息学习效率低;2)正负样本对构建策略缺乏语义概念,不利于聚类性能的进一步提升. 针对这2点不足,本文引入差异化的强和弱2类数据增强方法,利用联合数据增强技术降低非语义信息对模型学习语义信息的干扰. 此外借助KNN算法将同类的不同样本组成正样本对,完善正负样本对构建策略.
2. 本文模型
2.1 数据增强:弱和强
数据增强是一种通过让有限的数据产生更多的等价数据来人工扩展数据集的技术. 在对比学习中,数据增强不仅可以生成正样本,也能改善对比学习在源领域与目标领域间的迁移能力[17].
在传统对比学习方法中,弱数据增强除拥有数据增强的基本特性,如增加训练样本数量外,主要用于生成锚点样本的正样本. 为使网络尽可能学习到正样本对间的共有信息,弱数据增强方法应尽量保留锚点样本的特征信息,因此弱数据增强策略具备变换种类单一、变换力度小的特点,该策略一般由4种基本类型组成,分别是ColorJitter,ResizedCrop,HorizontalFlip,GaussianBlur.
文献[18]通过实验表明在全监督或半监督学习中,变换力度大的数据增强方法能提升网络性能. 文献[19]通过对多种基本类型的数据增强方法随机组合,探索多种数据增强策略在自监督学习中对网络的影响. 受文献[19]的启发,本文采用相似的数据增强随机组合策略,称之为强数据增强策略. 强数据增强策略由14种基本类型组成,分别是Sharpness,Brightness,Rotate,TranslateX,TranslateY,Color,Solarize,Invert,Equalize,Contrast,Posterise,AutoContrast,ShearX,ShearY.强数据增强对样本的破坏程度高于弱数据增强,导致增强后的视图丢失原始样本的较多特征信息,产生与弱数据增强不同的视图模式. 为了与弱数据增强形成差异性以及限制强数据增强对样本的破坏程度,本文的强数据增强由5种基本类型的数据增强方法组成,共有C(14,5)种形态应用在数据集上. 强数据增强策略和弱数据增强策略对样本的影响可视化具体如图2所示. 强数据增强样本称为强视图,如图2(b);弱数据增强样本称为弱视图,如图2(c). 强视图和弱视图明显地体现出强数据增强和弱数据增强的特性. 另外从图2(b)中可以看到,某些随机组合的强数据增强对样本的特征破坏大,有些样本的内在结构甚至被完全破坏,如图2(b)右下角样本所示. 如果只存在强数据增强,网络无法从强视图模式中学习到样本间的相似性和差异性,导致网络无法收敛,但是强数据增强和弱数据增强的联合能达到预期目标,即利用强视图和弱视图间的模式差异作为监督信息,引导模型学习嵌入在强视图中的特征信息,提高网络的学习能力.
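为更直观地说明弱、强2类数据增强的差异,下面给出一个基于PyTorch/torchvision的示意实现(非原文代码,具体变换种类与参数以正文描述为准;其中借助RandAugment近似"从14种基本变换中随机组合5种"的强数据增强策略):

```python
import torchvision.transforms as T

# 弱数据增强:变换种类单一、力度小,尽量保留锚点样本的特征信息
weak_aug = T.Compose([
    T.RandomResizedCrop(32),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),
    T.GaussianBlur(kernel_size=3),
    T.ToTensor(),
])

# 强数据增强:随机组合多种基本变换,破坏性更大,形成与弱视图不同的视图模式
# 此处用torchvision自带的RandAugment近似,num_ops=5对应每次随机选取5种基本变换
strong_aug = T.Compose([
    T.RandomResizedCrop(32),
    T.RandomHorizontalFlip(),
    T.RandAugment(num_ops=5, magnitude=9),
    T.ToTensor(),
])
```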
2.2 框架总览
SCCFA的网络结构如图3所示,输入样本由锚点样本和邻居样本组成. 联合数据增强器(union data augmentor,UDA)对样本进行强和弱2种数据的增强. 第1个MLP产生嵌入空间,用于特征学习;第2个MLP产生概率分布空间,用于聚类分析. 联合特征对比器(federated feature comparator,FFC)和联合类别对比器(federated category comparator,FCC)分别对应特征对比和类别对比.
假设小批次为n的锚点样本集合为X={xi},i∈{1,2,⋯,n},邻居样本集合为Y={yij},i∈{1,2,⋯,n},j∈{1,2,⋯,k},其中yij表示锚点样本xi的第j个邻居样本,邻居样本借助KNN图确定,则输入样本集合为X∪Y. 集合X∪Y在UDA下进行强数据增强和弱数据增强,生成强视图Xs∪Ys和弱视图Xw∪Yw(其中s和w分别表示强和弱的含义)后输入卷积网络进行特征提取. 为便于表达,锚点样本集合和邻居样本集合的强视图、弱视图分别用Xa,a∈{s,w}和Yb,b∈{s,w}表示. 特征矩阵生成后,FFC和FCC在KNN图的指导下分别进行特征学习和聚类分析,FFC和FCC这2个模块以正负样本对为纽带,彼此形成较强的相互促进、相辅相成的关系. KNN图的质量好坏直接影响网络性能的高低,因此在训练的过程中,利用迭代的全局特征库更新KNN图,使KNN图的质量随着迭代而改善.
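KNN图可借助开源库Faiss在全局特征库上高效构建,下面给出一个示意性实现(函数与参数均为说明性假设,非原文代码,k的取值见3.3节):

```python
import faiss
import numpy as np

def build_knn_graph(feature_bank: np.ndarray, k: int = 5) -> np.ndarray:
    """在全局特征库上检索每个样本的k个近邻,返回近邻索引矩阵(示意实现)."""
    feats = np.ascontiguousarray(feature_bank, dtype='float32')
    faiss.normalize_L2(feats)                  # 特征归一化后,内积即余弦相似度
    index = faiss.IndexFlatIP(feats.shape[1])  # 精确的内积检索索引
    index.add(feats)
    _, nbrs = index.search(feats, k + 1)       # 检索k+1个近邻,第1个是样本自身
    return nbrs[:, 1:]                         # 去掉自身,得到锚点样本的邻居索引
```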
2.3 联合特征对比器
假设锚点样本和邻居样本的视图集合 Xa 和 Yb 的特征空间分别是 \boldsymbol{F}^a \in \mathbb{R}^{n \times m}, a \in \{\text{s,w}\} 和 \tilde{\boldsymbol{F}}^b \in \mathbb{R}^{n \times m}, b \in \{\text{s,w}\},m 为卷积网络所提特征的维度. 在MLP的非线性变换下,\boldsymbol{F}^a 和 \tilde{\boldsymbol{F}}^b 对应的嵌入矩阵分别是 \boldsymbol{Z}^a \in \mathbb{R}^{n \times d}, a \in \{\text{s,w}\} 和 \tilde{\boldsymbol{Z}}^b \in \mathbb{R}^{n \times d}, b \in \{\text{s,w}\},d 是特征维度 m 降维后的大小,那么 \boldsymbol{Z}^a 和 \tilde{\boldsymbol{Z}}^b 的矩阵形式为
\boldsymbol{Z}^{a} = \begin{pmatrix} \boldsymbol{z}_{1}^{a} \\ \boldsymbol{z}_{2}^{a} \\ \vdots \\ \boldsymbol{z}_{n}^{a} \end{pmatrix}_{n \times d} 和 \tilde{\boldsymbol{Z}}^{b} = \begin{pmatrix} \tilde{\boldsymbol{z}}_{1}^{b} \\ \tilde{\boldsymbol{z}}_{2}^{b} \\ \vdots \\ \tilde{\boldsymbol{z}}_{n}^{b} \end{pmatrix}_{n \times d} .
本文采用余弦相似性度量特征间的相似程度,相似性函数为
sim(\boldsymbol{z}_i^a, \tilde{\boldsymbol{z}}_j^b) = \frac{\boldsymbol{z}_i^a \cdot \tilde{\boldsymbol{z}}_j^b}{\left\| \boldsymbol{z}_i^a \right\| \times \left\| \tilde{\boldsymbol{z}}_j^b \right\|} . (1)
为改善聚类准确率而对特征进行归一化[20],有 \left\| \boldsymbol{z}_i^a \right\| = 1 和 \left\| \tilde{\boldsymbol{z}}_j^b \right\| = 1,则式(1)转化为
sim(\boldsymbol{z}_i^a, \tilde{\boldsymbol{z}}_j^b) = \boldsymbol{z}_i^a \cdot \tilde{\boldsymbol{z}}_j^b . (2)
再引入温度超参 \tau,相似性函数的最终形式为
sim(\boldsymbol{z}_i^a, \tilde{\boldsymbol{z}}_j^b) = \frac{\boldsymbol{z}_i^a \cdot \tilde{\boldsymbol{z}}_j^b}{\tau} . (3)
采用和GCC(graph contrastive clustering)[21]相似的损失函数形式,FFC的特征对比损失函数定义为
\mathcal{L}_{\text{ins}}^{a,b} = -\frac{1}{n} \sum_{i=1}^{n} \ln \frac{\sum_{r \in R(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^a, \tilde{\boldsymbol{z}}_r^b)}}{\sum_{g \in G(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^a, \tilde{\boldsymbol{z}}_g^a)} + \sum_{t \in T(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^a, \tilde{\boldsymbol{z}}_t^b)}} , (4)
其中 a,b \in \{\text{s,w}\},R(i) 表示锚点样本 x_i 的邻居样本,G(i) 表示锚点样本 x_i 所在集合下的负样本,T(i) 表示邻居样本所在集合下的负样本. R(i),G(i),T(i) 的数学表达为
\left\{ \begin{aligned} & R(i) = \{ y_{ij} \mid j \in \{1,2,\cdots,k\} \} , \\ & G(i) = \{ x_t \mid t \ne i \wedge t \in \{1,2,\cdots,n\} \} , \\ & T(i) = \{ y_{tj} \mid t \ne i \wedge t \in \{1,2,\cdots,n\} \wedge j \in \{1,2,\cdots,k\} \} . \end{aligned} \right.
那么锚点样本与邻居样本间“弱 vs 弱”的损失函数为
\mathcal{L}_{\text{ins}}^{\text{w,w}} = -\frac{1}{n} \sum_{i=1}^{n} \ln \frac{\sum_{r \in R(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{w}}, \tilde{\boldsymbol{z}}_r^{\text{w}})}}{\sum_{g \in G(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{w}}, \tilde{\boldsymbol{z}}_g^{\text{w}})} + \sum_{t \in T(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{w}}, \tilde{\boldsymbol{z}}_t^{\text{w}})}} , (5)
同理,“弱 vs 强”和“强 vs 强”的损失分别为
\mathcal{L}_{\text{ins}}^{\text{w,s}} = -\frac{1}{n} \sum_{i=1}^{n} \ln \frac{\sum_{r \in R(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{w}}, \tilde{\boldsymbol{z}}_r^{\text{s}})}}{\sum_{g \in G(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{w}}, \tilde{\boldsymbol{z}}_g^{\text{w}})} + \sum_{t \in T(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{w}}, \tilde{\boldsymbol{z}}_t^{\text{s}})}} , (6)
\mathcal{L}_{\text{ins}}^{\text{s,s}} = -\frac{1}{n} \sum_{i=1}^{n} \ln \frac{\sum_{r \in R(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{s}}, \tilde{\boldsymbol{z}}_r^{\text{s}})}}{\sum_{g \in G(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{s}}, \tilde{\boldsymbol{z}}_g^{\text{s}})} + \sum_{t \in T(i)} \mathrm{e}^{sim(\boldsymbol{z}_i^{\text{s}}, \tilde{\boldsymbol{z}}_t^{\text{s}})}} . (7)
因此,FFC的总的损失函数为
\mathcal{L}_{\text{FFC}} = \mathcal{L}_{\text{ins}}^{\text{w,w}} + 2 \times \mathcal{L}_{\text{ins}}^{\text{w,s}} + \mathcal{L}_{\text{ins}}^{\text{s,s}} . (8)
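下面给出式(4)的一种PyTorch示意实现(非原文代码,张量组织方式为本文假设),总损失 \mathcal{L}_{\text{FFC}} 可按式(8)对3种视图组合分别调用该函数并加权求和:

```python
import torch
import torch.nn.functional as F

def ffc_loss(z_anchor, z_neighbor, tau=0.1):
    """特征对比损失的示意实现(对应式(4),非原文代码).
    z_anchor:   (n, d)    锚点样本某一视图的嵌入
    z_neighbor: (n, k, d) 锚点样本的k个邻居在另一视图下的嵌入
    """
    n, k, _ = z_neighbor.shape
    z_anchor = F.normalize(z_anchor, dim=1)
    z_neighbor = F.normalize(z_neighbor, dim=2)

    # 分子:锚点与其k个邻居(R(i),正样本对)的相似度
    pos = torch.einsum('nd,nkd->nk', z_anchor, z_neighbor) / tau        # (n, k)

    # 分母:锚点与本批次其他锚点(G(i))以及其他锚点的邻居(T(i))的相似度
    sim_aa = z_anchor @ z_anchor.t() / tau                              # (n, n)
    sim_an = torch.einsum('nd,mkd->nmk', z_anchor, z_neighbor)
    sim_an = sim_an.reshape(n, n * k) / tau                             # (n, n*k)

    keep_aa = ~torch.eye(n, dtype=torch.bool, device=z_anchor.device)   # 排除自身
    keep_an = keep_aa.repeat_interleave(k, dim=1)                       # 排除自己的邻居

    neg = torch.cat([sim_aa.masked_fill(~keep_aa, float('-inf')),
                     sim_an.masked_fill(~keep_an, float('-inf'))], dim=1)

    # -ln( Σ_r e^{sim} / (Σ_g e^{sim} + Σ_t e^{sim}) )
    loss = -(torch.logsumexp(pos, dim=1) - torch.logsumexp(neg, dim=1))
    return loss.mean()

# 式(8)的总损失(zw_*、zs_*分别表示弱、强视图的嵌入,变量名为假设):
# loss_ffc = ffc_loss(zw_a, zw_n) + 2 * ffc_loss(zw_a, zs_n) + ffc_loss(zs_a, zs_n)
```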
2.4 联合类别对比器
假设锚点样本和邻居样本的特征空间 \boldsymbol{F}^a 和 \tilde{\boldsymbol{F}}^b 对应的概率分布矩阵分别是 \boldsymbol{P}^a \in \mathbb{R}^{n \times c}, a \in \{\text{s,w}\} 和 \tilde{\boldsymbol{P}}^b \in \mathbb{R}^{n \times c}, b \in \{\text{s,w}\},其中 c 是类别数量,\boldsymbol{P}^a 和 \tilde{\boldsymbol{P}}^b 的矩阵表现形式为
\boldsymbol{P}^{a} = \begin{pmatrix} \boldsymbol{p}_{1}^{a} \\ \boldsymbol{p}_{2}^{a} \\ \vdots \\ \boldsymbol{p}_{n}^{a} \end{pmatrix}_{n \times c} 和 \tilde{\boldsymbol{P}}^{b} = \begin{pmatrix} \tilde{\boldsymbol{p}}_{1}^{b} \\ \tilde{\boldsymbol{p}}_{2}^{b} \\ \vdots \\ \tilde{\boldsymbol{p}}_{n}^{b} \end{pmatrix}_{n \times c} .
和CC[22]的思想相似,本文将标签信息视作一种特殊的特征信息. 为便于计算,将 \boldsymbol{P}^a 和 \tilde{\boldsymbol{P}}^b 转置,得到 \boldsymbol{Q}^a \in \mathbb{R}^{c \times n} 和 \tilde{\boldsymbol{Q}}^b \in \mathbb{R}^{c \times n}, a,b \in \{\text{s,w}\}:
\boldsymbol{Q}^a = ( \boldsymbol{q}_1^a, \boldsymbol{q}_2^a, \cdots, \boldsymbol{q}_c^a )_{c \times n}^{\text{T}} ,\quad \tilde{\boldsymbol{Q}}^b = ( \tilde{\boldsymbol{q}}_1^b, \tilde{\boldsymbol{q}}_2^b, \cdots, \tilde{\boldsymbol{q}}_c^b )_{c \times n}^{\text{T}} .
FCC中类别对比损失函数定义为
\mathcal{L}_{\text{clus}}^{a,b} = -\frac{1}{c} \sum_{i=1}^{c} \ln \frac{\mathrm{e}^{sim(\boldsymbol{q}_i^a, \tilde{\boldsymbol{q}}_i^b)}}{\sum_{j=1}^{c} \mathrm{e}^{sim(\boldsymbol{q}_i^a, \tilde{\boldsymbol{q}}_j^b)}} ,\ a,b \in \{\text{s,w}\} . (9)
和FFC的思路类似,样本间的“弱 vs 弱”“弱 vs 强”“强 vs 强”的类别对比损失分别依次为
\mathcal{L}_{\text{clus}}^{\text{w,w}} = -\frac{1}{c} \sum_{i=1}^{c} \ln \frac{\mathrm{e}^{sim(\boldsymbol{q}_i^{\text{w}}, \tilde{\boldsymbol{q}}_i^{\text{w}})}}{\sum_{j=1}^{c} \mathrm{e}^{sim(\boldsymbol{q}_i^{\text{w}}, \tilde{\boldsymbol{q}}_j^{\text{w}})}} , (10)
\mathcal{L}_{\text{clus}}^{\text{w,s}} = -\frac{1}{c} \sum_{i=1}^{c} \ln \frac{\mathrm{e}^{sim(\boldsymbol{q}_i^{\text{w}}, \tilde{\boldsymbol{q}}_i^{\text{s}})}}{\sum_{j=1}^{c} \mathrm{e}^{sim(\boldsymbol{q}_i^{\text{w}}, \tilde{\boldsymbol{q}}_j^{\text{s}})}} , (11)
\mathcal{L}_{\text{clus}}^{\text{s,s}} = -\frac{1}{c} \sum_{i=1}^{c} \ln \frac{\mathrm{e}^{sim(\boldsymbol{q}_i^{\text{s}}, \tilde{\boldsymbol{q}}_i^{\text{s}})}}{\sum_{j=1}^{c} \mathrm{e}^{sim(\boldsymbol{q}_i^{\text{s}}, \tilde{\boldsymbol{q}}_j^{\text{s}})}} . (12)
因此,总的类别对比损失函数为
\mathcal{L}_{\text{clus}} = \mathcal{L}_{\text{clus}}^{\text{w,w}} + \mathcal{L}_{\text{clus}}^{\text{w,s}} + \mathcal{L}_{\text{clus}}^{\text{s,s}} . (13)
同时,为解决聚类分析中稳定性问题,即避免所有样本划分到同一类别这种极端情况,本文添加一个额外的聚类约束损失:
\mathcal{L}_{\text{cc}} = \ln c - H(\boldsymbol{I}) , (14)
其中 H(\cdot) 为熵函数,\boldsymbol{I} 表示包含 c 个元素的分布. \boldsymbol{I} 中元素定义为
I_j = \frac{\sum_{t=1}^{n} q_{jt}^{\text{w}}}{\sum_{i=1}^{c} \sum_{t=1}^{n} q_{it}^{\text{w}}} . (15)
于是FCC的损失函数由类别对比损失和聚类约束损失2部分组成:
\mathcal{L}_{\text{FCC}} = \mathcal{L}_{\text{clus}} + \lambda \times \mathcal{L}_{\text{cc}} , (16)
其中 \lambda 为平衡超参.
综合FFC和FCC,SCCFA的目标函数为
\mathcal{L} = \alpha \times \mathcal{L}_{\text{FFC}} + \mathcal{L}_{\text{FCC}} , (17)
其中 \alpha 为平衡超参.
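类别对比损失(式(9))与聚类约束损失(式(14)(15))的一种PyTorch示意实现如下(非原文代码,仅给出单个视图组合;完整的 \mathcal{L}_{\text{clus}} 需按式(13)对3种视图组合求和):

```python
import math
import torch
import torch.nn.functional as F

def fcc_loss(p_anchor, p_neighbor, tau=1.0, lam=3.0):
    """类别对比损失与聚类约束损失的示意实现(对应式(9)(14)(16)的单个视图组合).
    p_anchor, p_neighbor: (n, c),2个视图在概率分布空间的输出
    """
    c = p_anchor.shape[1]

    # 转置后第i行q_i表示第i个类簇在整个批次上的"类别特征"(式(9)中的q向量)
    q_a = F.normalize(p_anchor.t(), dim=1)     # (c, n)
    q_b = F.normalize(p_neighbor.t(), dim=1)   # (c, n)

    logits = q_a @ q_b.t() / tau               # (c, c),对角线为同一类簇的正样本对
    labels = torch.arange(c, device=logits.device)
    loss_clus = F.cross_entropy(logits, labels)    # 与式(9)的对比形式等价

    # 聚类约束(式(14)(15)):避免所有样本坍缩到同一类簇
    dist = p_anchor.sum(dim=0)
    dist = dist / dist.sum()
    entropy = -(dist * torch.log(dist + 1e-8)).sum()
    loss_cc = math.log(c) - entropy

    return loss_clus + lam * loss_cc
```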
2.5 伪代码
锚点样本的邻居样本准确率(正样本准确率)取决于KNN图的质量. 为创建一个质量较好的KNN图,设计前置任务用于特征学习. 第1阶段进行特征对比学习,同一样本的数据增强视图作为正样本对;第2阶段引入类别对比学习,构建KNN图并不断更新图结构.SCCFA的训练过程具体如算法1所示.
算法1. SCCFA算法.
输入:数据集 X ,最小批次 n ,第1阶段次数 t ,迭代次数epochs,类别数量 c ,特征对比中相似性度量的温度超参 {\tau _{{\text{ins}}}} ,类别对比中相似性度量的温度超参 {\tau _{{\text{clus}}}} ;
输出:逼近最优解的对比聚类模型 {\varPhi } .
/*训练阶段*/
① for epoch = 1 to epochs do
② 从数据集 X 中抽样 n 个随机样本 X' ;
③ if epoch < t /*第1阶段*/
④ X' 进行强和弱数据增强处理;
⑤ 通过式(8)计算损失 {\mathcal{L}_{{\text{FFC}}}} ;
⑥ {\mathcal{L}_{{\text{FCC}}}} = 0 ;
⑦ else /*第2阶段*/
⑧ if epoch = t
⑨ 初始化KNN图;
⑩ end if
⑪ 根据KNN,抽样 X' 的邻居样本 Y' ;
⑫ X' \cup Y' 进行强和弱数据增强;
⑬ 通过式(8)计算损失值 {\mathcal{L}_{{\text{FFC}}}} ;
⑭ 通过式(16)计算损失值 {\mathcal{L}_{{\text{FCC}}}} ;
⑮ end if
⑯ 通过式(17)计算总体损失 \mathcal{L} ;
⑰ SGD算法最小化 \mathcal{L} 来更新网络参数 \theta ;
⑱ 重新计算KNN图;
⑲ end for
/*测试阶段*/
① for X' in X do
② X' 进行数据增强得 X'' ;/*不同于训练阶段的数据增强方法*/
③ X'' 输入对比聚类模型 {\varPhi } ,计算软标签;
④ end for
2.6 时间复杂度分析
假设算法1第1阶段迭代次数为 t_1,第2阶段迭代次数为 t_2,数据集 X 和小批次 X' 大小分别为 N 和 n,数据集 X 划分为 N/n 批,近邻样本个数为 k,类别数量为 c,嵌入矩阵维度为 d,且不考虑神经网络前向和反向传播的时间开销,以及忽略Faiss工具包构建KNN图的时间开销.
1)第1阶段. 小批次 X' 在每次迭代下,数据增强的时间复杂度为 O(n) ,利用式(8)计算特征对比损失的时间复杂度为 O(d{n^2}) ,因为迭代次数为 {t_1} ,每次迭代有N/n批,所以第1阶段总的时间复杂度为 O({t_1}dnN) .
2)第2阶段. 计算特征对比损失的总时间复杂度是 O({t_2}kdnN) ,通过式(16)计算类别对比损失的总时间复杂度为 O({t_2}{c^2}N) ,所以第2阶段总时间复杂度为 O({t_2}kdnN + {t_2}{c^2}N) .
因此算法1的总时间复杂度为 O(({t_1} + k{t_2}) dnN + {t_2}{c^2}N) .
3. 实验与分析
3.1 数据集
为充分评估本文方法的有效性和先进性,在6个广泛使用的基准数据集上进行实验以验证本文方法的聚类性能. 表1列出了这些数据集的统计信息.
表 1 数据集统计
Table 1. Statistics of Datasets
数据集 图片数量 图片尺寸 类别个数
CIFAR-10 6万 3×32×32 10
CIFAR-100 6万 3×32×32 20
STL-10 1.3万 3×96×96 10
ImageNet-10 1.3万 3×96×96 10
ImageNet-Dogs 1.95万 3×96×96 15
Tiny-ImageNet 11万 3×64×64 200
1)CIFAR-10[23]是一个由10个类别的共6万张32×32彩色图片组成的数据集,每类有0.6万张图片.
2)CIFAR-100[23]包含100个类别,可归纳为20个超类,每类别含600张图片,其中500张用于训练,100张用于测试. 本文将这20个超类作为该数据集的标签信息.
3)STL-10[24]是一个数量为1.3万的数据集,包含额外10万张无标签信息的图片,挑战性更大. 本文将无标签信息的图片舍弃,只选择有标签信息的1.3万张图片用于网络训练.
4)ImageNet-10[25]是从ImageNet[26]中挑选10个类别的图片组成的数据集,共包含1.3万张图片.
5)ImageNet-Dogs[25]与ImageNet-10同出于ImageNet,共1.95万张图片.
6)Tiny-ImageNet[27]有200个类别,共12万张图片,其中含有10万张训练图片、1万张验证图片以及1万张测试图片,由于测试图片无标签信息被舍弃,只有11万张图片用于本文方法的训练和测试.
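上述数据集大多可直接使用torchvision提供的接口加载,下面以CIFAR-10和STL-10为例给出示意写法(非原文代码;CIFAR-100的20个超类标签以及Tiny-ImageNet需按正文描述另行处理):

```python
import torchvision

# 数据增强可复用2.1节的weak_aug/strong_aug
cifar10 = torchvision.datasets.CIFAR10(root='./data', train=True, download=True)
# STL-10仅使用有标签的1.3万张图片,舍弃额外的无标签部分
stl10 = torchvision.datasets.STL10(root='./data', split='train', download=True)
```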
3.2 评价指标
从公平性和全面性的角度出发,本文采用广泛使用的聚类有效性指标:ACC(accuracy),NMI(normalized mutual information),ARI(adjusted rand index)评价本文方法在6个数据集上的性能.
1)ACC指模型预测的准确率,用于衡量预测标签与真实标签之间一对一关系的正确程度,计算式为:
ACC(Y,\hat{Y}) = \frac{1}{n} \sum_{i=1}^{n} \delta(y_i, \hat{y}_i) , (18)
其中 Y 表示真实标签,\hat{Y} 是预测标签,\delta(\cdot) 是指示函数,当 y_i = \hat{y}_i 时,\delta(\cdot) = 1,否则 \delta(\cdot) = 0 .
2)NMI用于度量预测标签与真实标签之间的相近程度. 给定数据集 X ,假设真实标签为 Y ,预测标签为 \hat Y ,NMI的计算式为
NMI(Y,\hat{Y}) = \frac{2 \times MI(Y,\hat{Y})}{H(Y) + H(\hat{Y})} , (19)
其中 H(\cdot) 表示熵函数,H(Y) 和 H(\hat{Y}) 的定义为:
H(Y) = -\sum_{i=1}^{|Y|} P(i) \times \text{lb}\,(P(i)) , (20)
H(\hat{Y}) = -\sum_{j=1}^{|\hat{Y}|} P(j) \times \text{lb}\,(P(j)) , (21)
其中 P(\cdot) 表示联合概率. 此外,式(19)中的 MI(\cdot) 指预测值与真实值间的互信息,定义为
MI(Y,\hat{Y}) = \sum_{i=1}^{|Y|} \sum_{j=1}^{|\hat{Y}|} P(i,j) \times \text{lb}\left( \frac{P(i,j)}{P(i) \times P(j)} \right) . (22)
3)ARI为调整兰德指数,用于评价预测标签与真实标签间的匹配程度,定义为
ARI = \frac{RI - E(RI)}{\max(RI) - E(RI)} , (23)
RI = \sum_{ij} C_{n_{ij}}^2 , (24)
其中,E(\cdot) 表示期望.
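3个指标的计算可借助scipy与sklearn实现,其中ACC需要先用匈牙利算法在预测类簇与真实标签间建立一对一匹配,下面给出一个示意实现(非原文代码,假设标签均为从0开始的整数编号):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

def clustering_acc(y_true, y_pred):
    """式(18)的ACC:用匈牙利算法寻找预测类簇与真实标签的最优一一匹配."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n_cls = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((n_cls, n_cls), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cost[t, p] += 1
    row, col = linear_sum_assignment(cost.max() - cost)   # 最大匹配转为最小代价
    return cost[row, col].sum() / y_true.size

def clustering_metrics(y_true, y_pred):
    """返回(ACC, NMI, ARI),NMI与ARI直接调用sklearn的实现(式(19)(23))."""
    return (clustering_acc(y_true, y_pred),
            normalized_mutual_info_score(y_true, y_pred),
            adjusted_rand_score(y_true, y_pred))
```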
3.3 实现细节
本文方法采用Python语言和PyTorch框架实现,在NVIDIA GeForce RTX 3090 24 GB服务器上运行.
将残差网络ResNet-18用于特征提取,特征学习和聚类分析MLP的深度分别是2层和1层. 此外,模型采用SGD优化算法,其学习率、权值衰减以及动量系数分别为0.4,1E−4,0.9.考虑收敛的稳定性,学习率以0.1的衰减率进行余弦衰减. 模型的超参共有特征对比中相似性度量的温度超参 {\tau _{{\text{ins}}}} 、类别对比中相似性度量的温度超参 {\tau _{{\text{clus}}}} 以及平衡超参 \alpha 和 \lambda ,它们的初始值分别为0.1,1.0,2.0,3.0.开源库Faiss计算6个数据集的KNN图(k=5),鉴于Faiss高效的特性,KNN图的计算开销可以忽略不计. 由于数据集的图片尺寸、学习难度等各有差异,再加上物理限制,第1阶段次数t、小批次n根据数据集的实际情况而定. 最后,所有数据集上的训练次数为800.
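上述训练配置对应的PyTorch示意写法如下(非原文代码;此处用torchvision的ResNet-18代替正文的特征提取网络,"学习率衰减到初始值的0.1倍"为本文对余弦衰减的一种假设性解释):

```python
import torch
import torchvision

# 特征提取骨干网络(输出维度仅为示意)
model = torchvision.models.resnet18(num_classes=128)

# SGD优化器:学习率0.4,权值衰减1E-4,动量系数0.9
optimizer = torch.optim.SGD(model.parameters(), lr=0.4,
                            momentum=0.9, weight_decay=1e-4)

# 余弦衰减的学习率调度,训练总轮数为800;eta_min按0.1倍初始学习率设置(假设)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=800,
                                                       eta_min=0.04)
```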
3.4 实验结果分析
在相同的6个数据集上,本文方法分别从ACC,NMI,ARI这3个方面与其他17个聚类方法进行对比,实验结果记录在表2~4中. 其他方法数据来源于GCC[21].
表 2 不同方法在6个数据集上的ACC比较
Table 2. ACC Comparison of Different Methods on Six Datasets
方法 CIFAR-10 CIFAR-100 STL-10 ImageNet-10 ImageNet-Dogs Tiny-ImageNet
k-means[6] 0.229 0.130 0.192 0.241 0.105 0.025
SC[5] 0.247 0.136 0.159 0.274 0.111 0.022
AC[28] 0.228 0.138 0.332 0.242 0.139 0.027
NMF[29] 0.190 0.118 0.180 0.230 0.118 0.029
AE[30] 0.314 0.165 0.303 0.317 0.185 0.041
DAE[31] 0.297 0.151 0.302 0.304 0.190 0.039
GAN[32] 0.315 0.151 0.298 0.346 0.174 0.041
DeCNN[33] 0.282 0.133 0.299 0.313 0.175 0.035
VAE[34] 0.291 0.152 0.282 0.334 0.179 0.036
JULE[35] 0.272 0.137 0.277 0.300 0.138 0.033
DEC[8] 0.301 0.185 0.359 0.381 0.195 0.037
DAC[25] 0.522 0.238 0.470 0.527 0.275 0.066
DCCM[13] 0.623 0.327 0.482 0.710 0.383 0.108
IIC[14] 0.617 0.257 0.610
PICA[15] 0.696 0.337 0.713 0.870 0.352 0.098
CC[22] 0.766 0.426 0.747 0.895 0.342 0.140
GCC[21] 0.856 0.472 0.788 0.901 0.526 0.138
SCCFA(本文) 0.882 0.506 0.802 0.963 0.543 0.132
注:最好的结果用粗体数字标注.
表 3 不同方法在6个数据集上的NMI比较
Table 3. NMI Comparison of Different Methods on Six Datasets
方法 CIFAR-10 CIFAR-100 STL-10 ImageNet-10 ImageNet-Dogs Tiny-ImageNet
k-means[6] 0.087 0.084 0.125 0.119 0.055 0.065
SC[5] 0.103 0.090 0.098 0.151 0.038 0.063
AC[28] 0.105 0.098 0.239 0.138 0.037 0.069
NMF[29] 0.081 0.079 0.096 0.132 0.044 0.072
AE[30] 0.239 0.100 0.250 0.210 0.104 0.131
DAE[31] 0.251 0.111 0.224 0.206 0.104 0.127
GAN[32] 0.265 0.120 0.210 0.225 0.121 0.135
DeCNN[33] 0.240 0.092 0.227 0.186 0.098 0.111
VAE[34] 0.245 0.108 0.200 0.193 0.107 0.113
JULE[35] 0.192 0.103 0.182 0.175 0.054 0.102
DEC[8] 0.257 0.136 0.276 0.282 0.122 0.115
DAC[25] 0.396 0.185 0.366 0.394 0.219 0.190
DCCM[13] 0.496 0.285 0.376 0.608 0.321 0.224
PICA[15] 0.591 0.310 0.611 0.802 0.352 0.277
CC[22] 0.681 0.424 0.674 0.862 0.401 0.340
GCC[21] 0.764 0.472 0.684 0.842 0.490 0.347
SCCFA(本文) 0.808 0.511 0.733 0.910 0.525 0.343
注:最好的结果用粗体数字标注.
表 4 不同方法在6个数据集上的ARI比较
Table 4. ARI Comparison of Different Methods on Six Datasets
方法 CIFAR-10 CIFAR-100 STL-10 ImageNet-10 ImageNet-Dogs Tiny-ImageNet
k-means[6] 0.049 0.028 0.061 0.057 0.020 0.005
SC[5] 0.085 0.022 0.048 0.076 0.013 0.004
AC[28] 0.065 0.034 0.140 0.067 0.021 0.005
NMF[29] 0.034 0.026 0.046 0.065 0.016 0.005
AE[30] 0.169 0.048 0.161 0.152 0.073 0.007
DAE[31] 0.163 0.046 0.152 0.138 0.078 0.007
GAN[32] 0.176 0.045 0.139 0.157 0.078 0.007
DeCNN[33] 0.174 0.038 0.162 0.142 0.073 0.006
VAE[34] 0.167 0.040 0.146 0.168 0.079 0.006
JULE[35] 0.138 0.033 0.164 0.138 0.028 0.006
DEC[8] 0.161 0.050 0.186 0.203 0.079 0.007
DAC[25] 0.306 0.088 0.257 0.302 0.111 0.017
DCCM[13] 0.408 0.173 0.262 0.555 0.182 0.038
PICA[15] 0.512 0.171 0.531 0.761 0.201 0.040
CC[22] 0.606 0.282 0.606 0.825 0.225 0.071
GCC[21] 0.728 0.305 0.631 0.822 0.362 0.075
SCCFA(本文) 0.777 0.347 0.680 0.920 0.368 0.062
注:最好的结果用粗体数字标注.
综合表2~4可知,传统聚类算法在高维数据集上的性能远低于深度聚类方法. 此外,基于互信息的方法比基于重构的方法取得了更优的聚类性能,比如IIC[14]和AE(autoencoder)[30]. 对比聚类方法学习到强区分性的特征,并通过正负样本强化了特征学习和聚类分析间相互促进的关系,进一步提升了聚类效果,如CC[22]比IIC有接近10%的提升. 同时,GCC与CC的对比说明对比学习应用在下游任务时,样本对构建策略应与下游任务实际结合. 更重要的是,相较于已有的对比聚类方法,如PICA[15],GCC,SCCFA在除了Tiny-ImageNet数据集外的其他5个基准数据集上都取得了明显的性能提升,在ImageNet-10数据集上ACC和ARI的提升分别有6个百分点和10个百分点,且所有数据集上的训练次数缩减了20个百分点. 由此可见,联合数据增强技术和KNN算法有效提升了模型的学习能力和学习效率. 在Tiny-ImageNet数据集上只取得次优结果的可能原因有:
1)Tiny-ImageNet数据集有200个类别,相对其他数据集拥有更多的语义类别,此时,假负样本对模型性能的影响更大[36],干扰模型学习不同类簇之间的差异性,然而本文方法未对假负样本作进一步处理.
2)Tiny-ImageNet数据集的正样本对准确率相对其他数据集较低,说明假正样本较多,假正样本会干扰模型对同类样本间一致性的学习.
3.5 联合数据增强影响分析
关于联合数据增强的消融实验结果如表5所示. “弱 vs 弱”实验作为对照实验,用以说明弱数据增强尽管保留了原始样本的大部分特征信息,但也继承了语义信息与非语义信息相融交织的特性,在单一的视图模式下,该特性阻碍了模型对语义信息的学习. 从“弱 vs 强”实验可知,由于强数据增强对样本破坏大,生成与弱数据增强不同的视图模式,利用2类差异性的视图模式间的对比缓解“弱 vs 弱”存在的不足. “强 vs 强”实验说明在单一的强视图模式下,对比学习范式无法从破坏严重的样本中学习到有效的共有信息. 最后,“弱/强 vs 强/弱”实验是本文方法采用的联合数据增强策略,可以有效解决单个实验的不足,既利用“弱 vs 强”解决“弱 vs 弱”中语义学习能力不足的问题,又利用“弱 vs 弱”缓解“弱 vs 强”中非语义信息考虑不足的问题,改善模型的泛化性. “强 vs 强”的引入是为了和“弱 vs 弱”“弱 vs 强”相互配合,在保持模型能继续学习的前提下,从多种视图模式对比中捕获更多的语义信息,改善特征一致性.
表 5 强数据增强和弱数据增强间的不同联合
Table 5. Different Combinations Between Strong Data Augmentation and Weak Data Augmentation
策略 符号表示 ACC NMI ARI
弱 vs 弱 (w,w) 0.841 0.743 0.704
弱 vs 强 (w,s) 0.855 0.766 0.733
强 vs 强 (s,s) − − −
弱/强 vs 强/弱 (w,w)+(w,s)+(s,s) 0.882 0.808 0.777
注:最好的结果用粗体数字标注;“vs”的2个对象表示数据增强的类型;“−”表示模型无法学习,聚类失败.
3.6 类别信息影响分析
围绕类别信息展开的消融实验结果如表6所示. 第1行实验在特征学习和聚类分析2个模块上都没有引入类别信息,代表传统的正样本对构建策略,即将同一样本的不同数据增强视图作为正样本对. FFC没有引入类别信息时,仅学习语义等价的样本对间的共有信息. FCC没有引入类别信息时,仅具有将同一样本的不同数据增强视图划分为一类的能力. 当FFC和FCC都引入类别信息时,即在2个模块上将同一类的不同样本作为正样本对. 从ACC数据可以看出,这明显改善了模型的语义感知能力,同时,ACC的增长幅度反映了FFC和FCC间相互促进、相辅相成的关系.
表 6 全局类别信息的影响
Table 6. Effect of the Global Category Information
FFC FCC ACC NMI ARI
− − 0.852 0.768 0.726
√ − 0.863 0.785 0.744
− √ 0.861 0.777 0.735
√ √ 0.882 0.808 0.777
注:最好的结果用粗体数字标注;“√”表示引入类别信息,“−”表示没有引入类别信息.
4. 定性研究
4.1 案例分析
为了知晓本文方法在不同类别图像上的预测能力,本文引入聚类结果的混淆矩阵,也通过具体的案例以更加直观的方式分析本文方法在各个类别上的不足. 图4是CIFAR-10数据集、ImageNet-10数据集以及STL-10数据集上聚类结果的混淆矩阵,横坐标和纵坐标分别表示预测类别和真实类别,矩阵元素数字表示预测准确度. 图5是关于CIFAR-10数据集和ImageNet-10数据集上的正样本对案例图,其中“真正样本”表示与锚点样本同类的样本,“假正样本”表示与锚点样本异类的样本. 本文方法借助全局KNN图实现正样本配对,在标签信息缺失的情况下,正样本配对存在一定误差,而正样本对准确与否严重影响本文方法聚类性能的好坏. 综合图4和图5可知,在CIFAR-10数据集上,狗的预测准确率较低,而本文方法错误地将猫视作其邻居样本. 另外,本文方法在STL-10数据集上的差异性较大,猫的准确率只有37%,而飞机的准确率高达96%,两者间相差59个百分点. 最后,在ImageNet-10数据集上,尽管差异性不大,却错误地将轮船当作飞机的邻居样本、将企鹅作为足球的邻居样本. 从ImageNet-10数据集可见,本文方法利用联合数据增强技术尽管缓解了非语义信息的干扰,改善模型对语义信息的感知,仍无法避免将语义信息占比少且非语义信息极其相似的样本划分为一类的现象,这也是聚类任务中普遍存在的一个问题. 另外,从CIFAR-10和STL-10数据集中猫和狗2个类别的数据可见,仅从特征相似性匹配正样本可能导致将特征相似、语义不同的2个样本错误匹配,这在一定程度上限制了本文方法的性能.
4.2 特征可视化
为了了解本文方法的特征提取和聚类分析在训练过程中的演变过程,本文在CIFAR-10数据集上借助t-SNE[37]实现特征可视化,具体为随机从每个类别中选择500个样本,用颜色或数字表示类别信息及其对应的特征,详见图6. 从图6可见,类别1~9中每个类别的类内紧凑度以及类间的区分度在训练过程中不断改善. 具体而言,类簇2,3,5在所有类簇中最难以清楚划分. 从图6(d)可知,在训练结束时,类簇3,5的类内紧凑度很好,并且类簇2,3,5之间的区分性也很明显,说明本文方法能有效处理高维的数据集.
4.3 收敛性分析
为了分析本文方法结构的合理性,进行了收敛性分析. 在4个数据集上的损失值收敛过程如图7所示,其与图8相对应. 图8展示了这4个数据集上聚类准确率和正样本对准确率的变化过程. 在标签信息缺失情况下,由于借助KNN图寻找正样本存在一定偏误,本文用正样本对准确率描述该偏误,正样本对准确率是top-5准确率,即锚点样本的前5个伪正样本中真实正样本所占的比例. 从图7、图8可知,损失值在所有数据集上有效、平稳地收敛,聚类准确率和正样本对准确率有效地上升;此外,整体损失值、聚类准确率和正样本对准确率三者间具有相互影响的关系. 以上分析说明本文方法的结构设计合理,训练稳定;也说明了正样本对的准确率在语义对比学习框架中起着重要的作用.
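正样本对准确率(top-5)的计算可按如下方式实现(示意代码,非原文实现;输入为KNN图的近邻索引矩阵与真实标签):

```python
import numpy as np

def positive_pair_accuracy(knn_graph: np.ndarray, labels: np.ndarray) -> float:
    """正样本对准确率(top-5):锚点的前5个伪正样本中与锚点同类的比例."""
    nbr_labels = labels[knn_graph[:, :5]]          # (N, 5) 每个锚点前5个邻居的真实标签
    hits = (nbr_labels == labels[:, None]).sum()   # 与锚点标签一致的邻居数
    return hits / nbr_labels.size
```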
5. 结 论
本文提出了一种新颖的语义对比聚类方法,即联合数据增强的语义对比聚类(SCCFA). 与已有的方法相比,本文方法在全局KNN图的指导下,基于近邻样本属于同类样本的假设,寻找锚点样本的top-k近邻样本作为其正样本,改善了特征对比和类别对比的多样性和语义性. 另外,本文采用差异化的强和弱2类数据增强方法,在提升特征的多样性和泛化性的同时,借助强数据增强和弱数据增强生成的模式差异缓解非语义信息对模型的干扰,提升模型对语义信息的学习效率和学习能力. 最后,在6个基准数据集上与已有的多个聚类方法进行性能对比,证明了本文方法的有效性和先进性.
聚类作为计算机视觉研究中的关键技术,本文所提方法尽管解决了对比聚类中的多个不足,但基于KNN图寻找正样本仍存在一些误差,未来将在这方面进行更深入的研究.
作者贡献声明:王气洪负责算法模型的提出和实现、实验结果的整理与分析、论文的撰写和修改;贾洪杰提出研究方向,把握论文的创新性并指导和参与论文的修订;黄龙霞和毛启容讨论算法的改进,设计实验方案,并指导论文修改.
-
表 1 模型切割方法比较
Table 1 Comparison of Model Partition Methods
方法 模型切割执行者 切割的依据收集方式 切片依赖关系处理/服务发现方式 切片更新方式 优化目标 运行时涉及的切片数量
DeepThings[46-47] 网关(gateway) 周期性收集节点状态 网关统一调度 节点拥有完整模型 内存、通信 ≥2
ADCNN[55] 中心节点(central node) 基于历史任务执行时延估计 中心节点调度 节点拥有完整模型 时延、通信 ≥2
Neurosurgeon[34] 客户端 实时观测当前网络、能耗状态 IP绑定(固定) 节点拥有完整模型 能耗(时延) 2
MoDNN[48] 中心节点(group owner) 节点注册到中心节点时获取 中心节点调度 部署一次,无更新 时延 ≥2
DeepX[50] 中心节点(execution planner) 实时收集与线性回归预测 中心节点调度 每次运行推断重新生成执行计划 能耗、内存 ≥2
AOFL[51] 云端或中心节点 周期性收集节点状态 IP绑定(固定) 重新部署 时延、通信 ≥2
CRIME[52] 任意节点 节点实时交互 直接邻居集合 节点拥有完整模型 时延、能耗 ≥2
DeepSlicing[53] 主节点调度(master) 基于历史任务执行时延估计 中心节点调度 节点拥有完整模型 时延、内存 ≥2
Edgent[54] 主节点(边缘服务器) 观测的历史网络数据 IP绑定(固定) 重新部署 准确率、时延 2
文献[45] 中心节点 实时收集节点状态 IP绑定(固定) 节点拥有完整模型 内存 ≥2
Cogent[49] 中心节点(DDPG agent) 周期性收集节点状态 Kubernetes提供的静态虚拟IP绑定(固定) 重新部署 准确率、时延 2
文献[56−57] 边缘服务器(server) 根据模型及优化目标折中分析 IP绑定(固定) 重新部署 计算、通信时延 2
表 2 不同架构的比较
Table 2 Comparison of Different Architectures
序号 名称 关键结合技术 针对的问题 适用场景
1 基于模型切割的云边协同推理 模型切割、数据压缩、量化、矩阵分解/压缩、早期退出 边端设备能耗、算力有限、能耗与时延“折中” 有云端支撑、数据预处理、隐私、负载实时调整
2 基于模型切割的边边协同推理 模型切割、数据压缩、量化、矩阵分解/压缩 与云端链接不可靠、单节点资源受限、能耗与时延“折中” 无云端支撑、单节点资源不足且有邻居节点、通信代价低
3 基于模型选择的云边协同推理 数据压缩、模型压缩、知识蒸馏 边端设备推理精度不可靠 推理精度高可信、边端节点资源相对充足
4 基于多模型结果聚合的边边协同推理 数据/模型融合、数据压缩、异/同步通信 协同推理并行度低、推理精度不可靠 多场景协同推理、边端节点资源充足、对时延要求相对较低
-
[1] David S, David C, Nick J. Top 10 strategic technology trends for 2020[EB/OL]. (2019-10-20)[2022-02-05].https://www.gartner.com/smarterwithgartner/gartner-top-10-strategic-technology-trends-for-2020
[2] Carrie M, David R, Michael S. The growth in connected IoT devices is expected to generate 79.4ZB of data in 2025, according to a new IDC forecast[EB/OL]. (2019-06-18) [2022-02-15].https://www.businesswire.com/news/home/20190618005012/en/The-Growth-in-Connected-IoT-Devices-is-Expected-to-Generate-79.4ZB-of-Data-in-2025-According-to-a-New-IDC-Forecast
[3] Xiao Yinhao, Jia Yizhen, Liu Chunchi, et al. Edge computing security: State of the art and challenges[J]. Proceedings of the IEEE, 2019, 107(8): 1608−1631 doi: 10.1109/JPROC.2019.2918437
[4] Kevin M, Amir E. AWS customers rack up hefty bills for moving data[EB/OL]. (2019-10-21)[2022-02-15].https://www.theinformation.com/articles/aws-customers-rack-up-hefty-bills-for-moving-data
[5] Jin Hai, Jia Lin, Zhou Zhi. Boosting edge intelligence with collaborative cross-edge analytics[J]. IEEE Internet of Things Journal, 2020, 8(4): 2444−2458
[6] Xiang Chong, Wang Xinyu, Chen Qingrong, et al. No-jump-into-latency in China's Internet! toward last-mile hop count based IP geo-localization[C/OL] //Proc of the 19th Int Symp on Quality of Service. New York: ACM, 2019[2021-03-15].https://doi.org/10.1145/3326285.3329077
[7] Jiang Xiaolin, Shokri-Ghadikolaei H, Fodor G, et al. Low-latency networking: Where latency lurks and how to tame it[J]. Proceedings of the IEEE, 2018, 107(2): 280−306
[8] 施巍松,张星洲,王一帆,等. 边缘计算: 现状与展望[J]. 计算机研究与发展,2019,56(1):69−89 Shi Weisong, Zhang Xingzhou, Wang Yifan, et al. Edge computing: Status quo and prospect[J]. Journal of Computer Research and Development, 2019, 56(1): 69−89 (in Chinese)
[9] Zamora-Izquierdo MA, Santa J, Martínez JA, et al. Smart farming IoT platform based on edge and cloud computing[J]. Biosystems Engineering, 2019, 177(1): 4−17
[10] 肖文华,刘必欣,刘巍,等. 面向恶劣环境的边缘计算综述[J]. 指挥与控制学报,2019,5(3):181−190 Xiao Wenhua, Liu Bixin, Liu Wei, et al. A review of edge computing for harsh environments[J]. Journal of Command and Control, 2019, 5(3): 181−190 (in Chinese)
[11] Stojkoska BLR, Trivodaliev KV. A review of Internet of things for smart home: Challenges and solutions[J]. Journal of Cleaner Production, 2017, 140(3): 1454−1464
[12] Wan Shaohua, Gu Zonghua, Ni Qiang. Cognitive computing and wireless communications on the edge for healthcare service robots[J]. Computer Communications, 2020, 149(1): 99−106
[13] 吕华章,陈丹,范斌,等. 边缘计算标准化进展与案例分析[J]. 计算机研究与发展,2018,55(3):487−511 Lü Huazhang, Chen Dan, Fan Bin, et al. Standardization progress and case analysis of edge computing[J]. Journal of Computer Research and Development, 2018, 55(3): 487−511 (in Chinese)
[14] Qi Jianpeng. Awesome edge computing[EB/OL]. (2003-06-02) [2022-03-15]. https://github.com/qijianpeng/awesome-edge-computing#engine
[15] Cheol-Ho H, Blesson V. Resource management in fog/edge computing: A survey on architectures, infrastructure, and algorithms[J]. ACM Computing Surveys, 2019, 52(5): 1−37 doi: 10.1145/3342101
[16] 曾鹏,宋纯贺. 边缘计算[J]. 中国计算机学会通讯,2020,16(1):8−10 Zeng Peng, Song Chunhe. Edge computing[J]. Communications of China Computer Federation, 2020, 16(1): 8−10 (in Chinese)
[17] 高晗,田育龙,许封元,等. 深度学习模型压缩与加速综述[J]. 软件学报,2021,32(1):68−92 Gao Han, Tian Yulong, Xu Fengyuan, et al. Overview of deep learning model compression and acceleration[J]. Journal of Software, 2021, 32(1): 68−92 (in Chinese)
[18] Zhou Zhi, Chen Xu, Li En, et al. Edge intelligence: Paving the last mile of artificial intelligence with edge computing[J]. Proceedings of the IEEE, 2019, 107(8): 1738−1762 doi: 10.1109/JPROC.2019.2918951
[19] 李肯立,刘楚波. 边缘智能: 现状和展望[J]. 大数据,2019,5(3):69−75 Li Kenli, Liu Chubo. Edge intelligence: Status quo and prospect[J]. Big Data, 2019, 5(3): 69−75 (in Chinese)
[20] 谈海生,郭得科,张弛,等. 云边端协同智能边缘计算的发展与挑战[J]. 中国计算机学会通讯,2020,16(1):38−44 Tan Haisheng, Guo Deke, Zhang Chi, et al. Development and challenges of cloud-edge-device collaborative intelligent edge computing[J]. Communications of China Computer Federation, 2020, 16(1): 38−44 (in Chinese)
[21] 张星洲,鲁思迪,施巍松. 边缘智能中的协同计算技术研究[J]. 人工智能,2019,5(7):55−67 Zhang Xingzhou, Lu Sidi, Shi Weisong. Research on collaborative computing technology in edge intelligence[J]. Artificial Intelligence, 2019, 5(7): 55−67 (in Chinese)
[22] 王晓飞. 智慧边缘计算: 万物互联到万物赋能的桥梁[J]. 人民论坛·学术前沿,2020(9):6−17 Wang Xiaofei. Smart edge computing: The bridge from the Internet of everything to the empowerment of everything[J]. People’s Forum·Academic Frontiers, 2020(9): 6−17 (in Chinese)
[23] Fan Zhenyu, Wang Yang, Fan Wu, et al. Serving at the edge: An edge computing service architecture based on ICN[J]. ACM Transactions on Internet Technology, 2021, 22(1): 1−27
[24] Jennings A , Copenhagen R V , Rusmin T. Aspects of network edge intelligence[R/OL]. 2001 [2022-03-16]. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.20.6997&rep=rep1&type=pdf
[25] Romaniuk R S. Intelligence in optical networks[G] //Proceedings of SPIE 5125: Proc of the Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments. Bellingham, WA: SPIE, 2003: 17−31
[26] Okagawa T, Nishida K, Yabusaki M. A proposed mobility management for IP-based IMT network platform[J]. IEICE Transactions on Communications, 2005, 88(7): 2726−2734
[27] Liang Ye. Mobile intelligence sharing based on agents in mobile peer-to-peer environment[C] //Proc of the 3rd Int Symp on Intelligent Information Technology and Security Informatics. Piscataway, NJ: IEEE, 2010: 667−670
[28] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84−90
[29] Szegedy C, Liu Wei, Jia Yangqing, et al. Going deeper with convolutions[C/OL] //Proc of the 28th IEEE Conf on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2015 [2022-03-16]. https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Szegedy_Going_Deeper_With_2015_CVPR_paper.html
[30] Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size[EB/OL]. (2016-11-04) [2022-03-16]. https://arxiv.org/abs/1602.07360
[31] Cao Yu, Chen Songqing, Hou Peng, et al. FAST: A Fog computing assisted distributed analytics system to monitor fall for Stroke mitigation[C] //Proc of the 10th IEEE Int Conf on Networking, Architecture and Storage. Piscataway, NJ: IEEE, 2015: 2−11
[32] Teerapittayanon S, McDanel B, Kung H T. Distributed deep neural networks over the cloud, the edge and end devices[C] //Proc of the 37th IEEE Int Conf on Distributed Computing Systems. Piscataway, NJ: IEEE, 2017: 328−339
[33] Wang Xiaofei, Han Yiwen, Wang Chenyang, et al. In-edge AI: Intelligentizing mobile edge computing, caching and communication by federated learning[J]. IEEE Network, 2019, 33(5): 156−165 doi: 10.1109/MNET.2019.1800286
[34] Kang Yiping, Johann H, Gao Cao, et al. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge[J]. ACM SIGARCH Computer Architecture News, 2017, 45(1): 615−629 doi: 10.1145/3093337.3037698
[35] Li En, Zhou Zhi, and Chen Xu. Edge intelligence: On-demand deep learning model co-inference with device-edge synergy[C] //Proc of the 2018 Workshop on Mobile Edge Communications. New York, ACM, 2018: 31−36
[36] 李逸楷,张通,陈俊龙. 面向边缘计算应用的宽度孪生网络[J]. 自动化学报,2020,46(10):2060−2071 Li Yikai, Zhang Tong, Chen Junlong. Wide twin networks for edge computing applications[J]. Acta Automatica Sinica, 2020, 46(10): 2060−2071 (in Chinese)
[37] Al-Rakhami M, Alsahli M, Hassan M M, et al. Cost efficient edge intelligence framework using docker containers[C] //Proc of the 16th IEEE Int Conf on Dependable, Autonomic and Secure Computing. Piscataway, NJ: IEEE, 2018: 800−807
[38] Al-Rakhami M, Gumaei A, Alsahli M, et al. A lightweight and cost effective edge intelligence architecture based on containerization technology[J]. World Wide Web, 2020, 23(2): 1341−1360 doi: 10.1007/s11280-019-00692-y
[39] Verbraeken J, Wolting M, Katzy J, et al. A survey on distributed machine learning[J]. ACM Computing Surveys, 2020, 53(2): 1−33 doi: 10.1145/3389414
[40] 杨涛,柴天佑. 分布式协同优化的研究现状与展望[J]. 中国科学:技术科学,2020,50(11):1414−1425 doi: 10.1360/SST-2020-0040 Chai Tianyou, Yang Tao. Research status and prospects of distributed collaborative optimization[J]. Scientia Sinica Technologica, 2020, 50(11): 1414−1425 (in Chinese) doi: 10.1360/SST-2020-0040
[41] Merenda M, Porcaro C, Iero D. Edge machine learning for AI-enabled IoT devices: A review[J/OL]. Sensors, 2020, 20(9) [2022-03-18]. https://doi.org/10.3390/s20092533
[42] Véstias M P, Duarte R P, de Sousa J T, et al. Moving deep learning to the edge[J/OL]. Algorithms, 2020, 13(5) [2022-03-18]. https://doi.org/10.3390/a13050125
[43] Chen Jiasi, Ran Xukan. Deep learning with edge computing: A review[J]. Proceedings of the IEEE, 2019, 107(8): 1655−1674 doi: 10.1109/JPROC.2019.2921977
[44] 洪学海,汪洋. 边缘计算技术发展与对策研究[J]. 中国工程科学,2018,20(2):28−34 Hong Xuehai, Wang Yang. Research on the development and countermeasures of edge computing technology[J]. China Engineering Science, 2018, 20(2): 28−34 (in Chinese)
[45] Hadidi R, Cao Jiashen, Ryoo M S, et al. Toward collaborative inferencing of deep neural networks on Internet-of-things devices[J]. IEEE Internet of Things Journal, 2020, 7(6): 4950−4960 doi: 10.1109/JIOT.2020.2972000
[46] Zhao Zhuoran, Barijough K M, Gerstlauer A. Deepthings: Distributed adaptive deep learning inference on resource-constrained IoT edge clusters[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018, 37(11): 2348−2359 doi: 10.1109/TCAD.2018.2858384
[47] Pnevmatikatos D N, Pelcat M, Jung M. Embedded Computer Systems: Architectures, Modeling, and Simulation[M]. Berlin: Springer, 2019
[48] Mao Jiachen, Chen Xiang, Nixon K W, et al. MoDNN: Local distributed mobile computing system for deep neural network[C] //Proc of the 24th Design, Automation Test in Europe Conf Exhibition. Piscataway, NJ: IEEE, 2017: 1396−1401
[49] Shan Nanliang, Ye Zecong, Cui Xiaolong. Collaborative intelligence: Accelerating deep neural network inference via device-edge synergy[J/OL]. Security and Communication Networks, 2020 [2022-03-16]. https://doi.org/10.1155/2020/8831341
[50] Lane N D, Bhattacharya S, Georgiev P, et al. DeepX: A software accelerator for low-power deep learning inference on mobile devices[C/OL] //Proc of the 15th ACM/IEEE Int Conf on Information Processing in Sensor Networks (IPSN). 2016 [2022-04-06]. https://doi.org/10.1109/IPSN.2016.7460664
[51] Zhou Li, Samavatian M H, Bacha A, et al. Adaptive parallel execution of deep neural networks on heterogeneous edge devices[C] //Proc of the 4th ACM/IEEE Symp on Edge Computing. New York: ACM, 2019: 195−208
[52] Jahierpagliari D, Chiaro R, Macii E, et al. CRIME: Input-dependent collaborative inference for recurrent neural networks[J]. IEEE Transactions on Computers, 2020, 70(10): 1626−1639
[53] Zhang Shuai, Zhang Sheng, Qian Zhuzhong, et al. DeepSlicing: Collaborative and adaptive CNN inference with low latency[J]. IEEE Transactions on Parallel and Distributed Systems, 2021, 22(9): 2175−2187
[54] Li En, Zeng Liekang, Zhou Zhi, et al. Edge AI: On-demand accelerating deep neural network inference via edge computing[J]. IEEE Transactions on Wireless Communications, Institute of Electrical and Electronics Engineers, 2020, 19(1): 447−457
[55] Zhang Saiqian, Lin Jieyu, Zhang Qi. Adaptive distributed convolutional neural network inference at the network edge with ADCNN[C/OL] //Proc of the 49th Int Conf on Parallel Processing. 2020 [2022-03-18]. https://doi.org/10.1145/3404397.3404473
[56] Shao Jiawei, Zhang Jun. BottleNet++: An end-to-end approach for feature compression in device-edge co-inference systems[C/OL] //Proc of the IEEE Int Conf on Communications Workshops. Piscataway, NJ: IEEE, 2020 [2022-03-18]. https://doi.org/10.1109/ICCWorkshops49005.2020.9145068
[57] Shao Jiawei, Zhang Jun. Communication-computation trade-off in resource-constrained edge inference[J]. IEEE Communications Magazine, 2020, 58(12): 20−26 doi: 10.1109/MCOM.001.2000373
[58] Avasalcai C, Tsigkanos C, Dustdar S. Resource management for latency-sensitive IoT applications with satisfiability[J/OL]. IEEE Transactions on Services Computing, 2021 [2022-03-18]. https://doi.ieeecomputersociety.org/10.1109/TSC.2021.3074188
[59] Chen Min, Li Wei, Hao Yiyue, et al. Edge cognitive computing based smart healthcare system[J]. Future Generation Computer Systems, 2018, 86(9): 403−411
[60] Hu Diyi, Krishnamachari B. Fast and accurate streaming cnn inference via communication compression on the edge[C] //Proc of the 5th ACM/IEEE Int Conf on Internet of Things Design and Implementation. Piscataway, NJ: IEEE, 2020: 157−163
[61] Hsu K J, Choncholas J, Bhardwaj K, et al. DNS does not suffice for MEC-CDN[C] //Proc of the 19th ACM Workshop on Hot Topics in Networks. New York: ACM, 2020: 212−218
[62] Campolo C, Lia G, Amadeo M, et al. Towards named AI networking: Unveiling the potential of NDN for edge AI[G] //LNCS 12338: Proc of the 19th Int Conf on Ad-Hoc Networks and Wireless. Cham: Springer, 2020: 16−22
[63] Jiang A H, Wong D L K, Canel C, et al. Mainstream: Dynamic stem-sharing for multi-tenant video processing[C] //Proc of the 2018 USENIX Annual Technical Conf. New York: ACM, 2018: 29−42
[64] Mhamdi E, Guerraoui R, Rouault S. On the robustness of a neural network[C] //Proc of the 36th IEEE Symp on Reliable Distributed Systems. Piscataway, NJ: IEEE, 2017: 84−93
[65] Yousefpour A, Devic S, Nguyen B Q, et al. Guardians of the Deep Fog: Failure-resilient DNN inference from edge to cloud[C] //Proc of the 1st Int Workshop on Challenges in Artificial Intelligence and Machine Learning for Internet of Things. New York: ACM, 2019: 25−31
[66] Hu Chuang, Bao Wei, Wang Dan, et al. Dynamic adaptive DNN surgery for inference acceleration on the edge[C] //Proc of the 38th IEEE Conf on Computer Communications. Piscataway, NJ: IEEE, 2019: 1423−1431
[67] Song Han, Mao Huizi, Dally W J. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding[EB/OL]. (2016-02-15) [2022-03-18]. https://arxiv.org/abs/1510.00149
[68] Masana M, van de Weijer J, Herranz L, et al. Domain-adaptive deep network compression[C] //Proc of the IEEE Int Conf on Computer Vision. Piscataway, NJ: IEEE, 2017: 22−29
[69] Courbariaux M, Bengio Y, David J P. BinaryConnect: Training deep neural networks with binary weights during propagations[C] //Proc of the 28th Int Conf on Neural Information Processing Systems. Cambridge, MA: MIT Press, 2015: 3123−3131
[70] Gholami A, Kim S, Zhen Dong, et al. A survey of quantization methods for efficient neural network inference[J]. arXiv preprint, arXiv: 2103.13630, 2021
[71] Cao Qingqing, Irimiea A E, Abdelfattah M, et al. Are mobile DNN accelerators accelerating DNNs?[C] //Proc of the 5th Int Workshop on Embedded and Mobile Deep Learning. New York: ACM, 2021: 7−12
[72] Guo Kaiyuan, Song Han, Song Yao, et al. Software-hardware codesign for efficient neural network acceleration[J]. IEEE Micro, 2017, 37(2): 18−25 doi: 10.1109/MM.2017.39
[73] Guo Kaiyuan, Li Wenshuo, Zhong Kai, et al. Neural network accelerator comparison[EB/OL]. (2018-01-01) [2022-12-26]. https://nicsefc.ee.tsinghua.edu.cn/projects/neural-network-accelerator tsinghua.edu.cn/project.html
[74] Li Hao, Kadav A, Durdanovic I, et al. Pruning filters for efficient convnets[J]. arXiv preprint, arXiv: 1608.08710, 2017
[75] Luo Jianhao, Zhang Hao, Zhou Hongyu, et al. ThiNet: Pruning cnn filters for a thinner net[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(10): 2525−2538 doi: 10.1109/TPAMI.2018.2858232
[76] He Yihui, Zhang Xianyu, Sun Jian. Channel pruning for accelerating very deep neural networks[C] //Proc of the 16th IEEE Int Conf on Computer Vision. Piscataway, NJ: IEEE, 2017: 1398−1406
[77] Hu Hengyuan, Peng Rui, Tai Y W, et al. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures[J]. arXiv preprint, arXiv: 1607.03250, 2016
[78] Wen Wei, Wu Chunpeng, Wang Yandan, et al. Learning structured sparsity in deep neural networks[C] //Proc of the 30th Int Conf on Neural Information Processing Systems. New York: ACM, 2016: 2082−2090
[79] Chen Hanting, Wang Yunhe, Xu Chang, et al. Data-free learning of student networks[C] //Proc of the 17th IEEE/CVF Int Conf on Computer Vision. Piscataway, NJ: IEEE, 2019: 3513−3521
[80] Niu Wei, Ma Xiaolong, Lin Sheng, et al. PatDNN: Achieving real-time DNN execution on mobile devices with pattern-based weight pruning[C] //Proc of the 25th Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2020: 907−922
[81] Qin Haotong, Gong Ruihao, Liu Xianglong, et al. Binary neural networks: A survey[J]. Pattern Recognition, 2020, 105(9): 107281
[82] 卢冶,龚成,李涛. 深度神经网络压缩自动化的挑战与机遇[J]. 中国计算机学会通讯,2021,17(3):41−47 Lu Ye, Gong Cheng, Li Tao. Challenges and opportunities of deep neural network compression automation[J]. China Computer Society Communications, 2021, 17(3): 41−47 (in Chinese)
[83] Hubara I, Courbariaux M, Soudry D, et al. Binarized neural networks[C] //Proc of the 30th Int Conf on Neural Information Processing Systems. New York: ACM, 2016: 4114−4122
[84] Li Fengfu, Liu Bin. Ternary weight networks[J]. arXiv preprint, arXiv: 1605.04711, 2016
[85] Alemdar H, Leroy V, Prost-Boucle A, et al. Ternary neural networks for resource-efficient AI applications[C] //Proc of the 30th Int Joint Conf on Neural Networks. Piscataway, NJ: IEEE, 2017: 2547−2554
[86] Chen Yao, Zhang Kang, Gong Cheng, et al. T-DLA: An open-source deep learning accelerator for ternarized DNN models on embedded FPGA[C] //Proc of the 14th IEEE Computer Society Annual Symp on VLSI. Piscataway, NJ: IEEE, 2019: 13−18
[87] Zhou Shuchuang, Wu Yuxin, Ni Zekun, et al. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients[J]. arXiv preprint, arXiv: 1606.06160, 2018
[88] Wang Peisong, Hu Qinghao, Zhang Yifan, et al. Two-step quantization for low-bit neural networks[C] //Proc of the 31st IEEE Conf on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2018: 4376−4384
[89] Jung Sangli, Son Changyong, Lee Seohyung, et al. Learning to quantize deep networks by optimizing quantization intervals with task loss[C] //Proc of the 32nd IEEE/CVF Conf on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2019: 4345−4354
[90] Gong Cheng, Li Tao, Lu Ye, et al. µL2Q: An ultra-low loss quantization method for DNN compression[C/OL] //Proc of the Int Joint Conf on Neural Networks. Piscataway, NJ: IEEE, 2019 [2022-04-07]. https://doi.org/10.1109/IJCNN.2019.8851699
[91] 葛道辉,李洪升,张亮,等. 轻量级神经网络架构综述[J]. 软件学报,2020,31(9):2627−2653 doi: 10.13328/j.cnki.jos.005942 Ge Daohui, Li Hongsheng, Zhang Liang, et al. A review of lightweight neural network architecture[J]. Journal of Software, 2020, 31(9): 2627−2653 (in Chinese) doi: 10.13328/j.cnki.jos.005942
[92] Shi Lei, Feng Shi, Zhu Zhifang. Functional hashing for compressing neural networks[J]. arXiv preprint, arXiv: 1605.06560, 2016
[93] Wu Junru, Wang Yue, Wu Zhenyu, et al. Deep k-means: Re-training and parameter sharing with harder cluster assignments for compressing deep convolutions[C] //Proc of the 35th Int Conf on Machine Learning PMLR. New York: ACM, 2018: 5363−5372
[94] Xu Xiaowei, Lu Qing, Wang Tianchen, et al. Efficient hardware implementation of cellular neural networks with incremental quantization and early exit[J]. ACM Journal on Emerging Technologies in Computing Systems, 2018, 14(4): 1−20
[95] Li Yuhong, Hao Cong, Zhang Xiaofan, et al. EDD: Efficient differentiable DNN architecture and implementation co-search for embedded AI solutions[C/OL] //Proc of the 57th ACM/IEEE Design Automation Conf. New York: ACM, 2020 [2022-04-07]. https://doi.org/10.1109/DAC18072.2020.9218749
[96] Aimar A, Mostafa H, Calabrese E, et al. NullHop: A flexible convolutional neural network accelerator based on sparse representations of feature maps[J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(3): 644−656 doi: 10.1109/TNNLS.2018.2852335
[97] Sebastian A, Le Gallo M, Khaddam-Aljameh R, et al. Memory devices and applications for in-memory computing[J]. Nature Nanotechnology, 2020, 15(7): 529−544 doi: 10.1038/s41565-020-0655-z
[98] Song Zhuoran, Fu Bangqi, Wu Feiyang, et al. DRQ: Dynamic region-based quantization for deep neural network acceleration[C] //Proc of the 47th ACM/IEEE Annual Int Symp on Computer Architecture. New York: ACM, 2020: 1010−1021
[99] Yang Yixiong, Yuan Zhe, Su Fang, et al. Multi-channel precision-sparsity-adapted Inter-frame differential data Codec for video neural network processor[C] //Proc of the 33rd ACM/IEEE Int Symp on Low Power Electronics and Design. New York: ACM, 2020: 103−108
[100] Tang Yibin, Wang Ying, Li Huawei, et al. MV-Net: Toward real-time deep learning on mobile GPGPU systems[J]. ACM Journal on Emerging Technologies in Computing Systems, 2019, 15(4): 1−25
[101] Chen Shengbo, Shen Cong, Zhang Lanxue, et al. Dynamic aggregation for heterogeneous quantization in federated learning[J]. IEEE Transactions on Wireless Communications, 2021, 20(10): 6804−6819 doi: 10.1109/TWC.2021.3076613
[102] Teerapittayanon S, McDanel B, Kung H T. BranchyNet: Fast inference via early exiting from deep neural networks[C] //Proc of the 23rd Int Conf on Pattern Recognition. Piscataway, NJ: IEEE, 2016: 2464−2469
[103] Lo C, Su YY, Lee CY, et al. A dynamic deep neural network design for efficient workload allocation in edge computing[C] //Proc of the 35th 2017 IEEE Int Conf on Computer Design. Piscataway, NJ: IEEE, 2017: 273−280
[104] Wang Zizhao, Bao Wei, Yuan Dong, et al. SEE: Scheduling early exit for mobile DNN inference during service outage[C] //Proc of the 22nd Int ACM Conf on Modeling, Analysis and Simulation of Wireless and Mobile Systems. New York: ACM, 2019: 279−288
[105] Wang Zizhao, Bao Wei, Yuan Dong, et al. Accelerating on-device DNN inference during service outage through scheduling early exit[J]. Computer Communications, 2020, 162(10): 69−82
[106] Scarpiniti M, Baccarelli E, Momenzadeh A, et al. DeepFogSim: A toolbox for execution and performance evaluation of the inference phase of conditional deep neural networks with early exits atop distributed Fog platforms[J/OL]. Applied Sciences, 2021, 11(1)[2022-03-18]. https://doi.org/10.3390/app11010377
[107] Su Xiao. EasiEI simulator[CP/OL]. [2022-03-18]. https://gitlab.com/Mirrola/ns-3-dev/-/wikis/EasiEI-Simulator
[108] Park E, Kim D, Kim S, et al. Big/little deep neural network for ultra low power inference[C] //Proc of the Int Conf on Hardware/Software Codesign and System Synthesis. Piscataway, NJ: IEEE, 2015: 124−132
[109] Putra T A, Leu J S. Multilevel Neural network for reducing expected inference time[J]. IEEE Access, 2019, 7(11): 174129−174138
[110] Taylor B, Marco V S, Wolff W, et al. Adaptive deep learning model selection on embedded systems[J]. ACM SIGPLAN Notices, 2018, 53(6): 31−43 doi: 10.1145/3299710.3211336
[111] Shu Guansheng, Liu Weiqing, Zheng Xiaojie, et al. IF-CNN: Image-aware inference framework for cnn with the collaboration of mobile devices and cloud[J]. IEEE Access, 2018, 6(10): 68621−68633
[112] Stamoulis D, Chin T W, Prakash A K, et al. Designing adaptive neural networks for energy-constrained image classification[C] //Proc of the Int Conf on Computer-Aided Design. New York: ACM, 2018: 1−8
[113] Song Mingcong, Zhong Kan, Zhang Jiaqi, et al. In-Situ AI: Towards autonomous and incremental deep learning for IoT systems[C] //Proc of the 24th IEEE Int Symp on High Performance Computer Architecture. Piscataway, NJ: IEEE, 2018: 92−103
[114] Zhang Li, Han Shihao, Wei Jianyu, et al. nn-Meter: Towards accurate latency prediction of deep-learning model inference on diverse edge devices[C] //Proc of the 19th Annual Int Conf on Mobile Systems, Applications, and Services. New York: ACM, 2021: 81−93
[115] Yue Zhifeng, Zhu Zhixiang, Wang Chuang, et al. Research on big data processing model of edge-cloud collaboration in cyber-physical systems[C] //Proc of the 5th IEEE Int Conf on Big Data Analytics. Piscataway, NJ: IEEE, 2020: 140−144
[116] Wang Huitian, Cai Guangxing, Huang Zhaowu, et al. ADDA: Adaptive distributed DNN inference acceleration in edge computing environment[C] //Proc of the 25th Int Conf on Parallel and Distributed Systems. Piscataway, NJ: IEEE, 2019: 438−445
[117] Chen Liang, Qi Jiapeng, Su Xiao, et al. REMR: A reliability evaluation method for dynamic edge computing network under time constraints[J]. arXiv preprint, arXiv: 2112.01913, 2021
[118] Long Saiqin, Long Weifan, Li Zhetao, et al. A game-based approach for cost-aware task assignment with QoS constraint in collaborative edge and cloud environments[J]. IEEE Transactions on Parallel and Distributed Systems, 2021, 32(7): 1629−1640 doi: 10.1109/TPDS.2020.3041029
[119] Yang Bo, Cao Xuelin, Li Xiangfan, et al. Mobile-edge-computing-based hierarchical machine learning tasks distribution for IIoT[J]. IEEE Internet of Things Journal, 2020, 7(3): 2169−2180 doi: 10.1109/JIOT.2019.2959035
[120] Fang Yihao, Jin Ziyi, Zheng Rong. TeamNet: A collaborative inference framework on the edge[C] //Proc of the 39th IEEE Int Conf on Distributed Computing Systems. Piscataway, NJ: IEEE, 2019: 1487−1496
[121] Fang Yihao, Shalmani S M, Zheng Rong. CacheNet: A model caching framework for deep learning inference on the edge[J]. arXiv preprint, arXiv: 2007.01793, 2020
[122] 檀超,张静宣,王铁鑫,等. 复杂软件系统的不确定性[J]. 软件学报,2021,32(7):1926−1956 doi: 10.13328/j.cnki.jos.006267 Tan Chao, Zhang Jingxuan, Wang Tiexin, et al. Uncertainty in complex software systems[J]. Journal of Software, 2021, 32(7): 1926−1956 (in Chinese) doi: 10.13328/j.cnki.jos.006267
[123] 宋纯贺,曾鹏,于海斌. 工业互联网智能制造边缘计算: 现状与挑战[J]. 中兴通讯技术,2019,25(3):50−57 doi: 10.12142/ZTETJ.201903008 Song Chunhe, Zeng Peng, Yu Haibin. Industrial Internet intelligent manufacturing edge computing: Current situation and challenges[J]. ZTE Technology, 2019, 25(3): 50−57 (in Chinese) doi: 10.12142/ZTETJ.201903008
[124] Chen Chao, Zhang Daqing, Wang Yasha, et al. Enabling Smart Urban Services with GPS Trajectory Data[M]. Berlin: Springer, 2021
[125] 黄倩怡,李志洋,谢文涛,等. 智能家居中的边缘计算[J]. 计算机研究与发展,2020,57(9):1800−1809 doi: 10.7544/issn1000-1239.2020.20200253 Huang Qianyi, Li Zhiyang, Xie Wentao, et al. Edge computing in smart home[J]. Journal of Computer Research and Development, 2020, 57(9): 1800−1809 (in Chinese) doi: 10.7544/issn1000-1239.2020.20200253
[126] Li Xian, Bi Suzhi, Wang Hui. Optimizing resource allocation for joint AI model training and task inference in edge intelligence systems[J]. IEEE Wireless Communications Letters, 2021, 10(3): 532−536 doi: 10.1109/LWC.2020.3036852
[127] Trivedi A, Wang Lin, Bal H, et al. Sharing and caring of data at the edge[C/OL] //Proc of the 3rd USENIX Workshop on Hot Topics in Edge Computing. Berkeley, CA: USENIX Association, 2020 [2022-04-06]. https://www.usenix.org/conference/hotedge20/presentation/trivedi
[128] Richins D, Doshi D, Blackmore M, et al. AI tax: The hidden cost of AI data center applications[J]. ACM Transactions on Computer Systems, 2021, 37(1-4): 1-32