-
摘要:
数据中心的高投入和低资源利用率一直是云服务提供商关注的问题. 面对这个难题,直接的解决方案是在同等资源上混合部署更多的应用以提高资源使用效率. 然而,由于混部应用对共享资源的竞争导致了应用间的性能干扰,从而影响了应用的性能、服务质量(quality of service,QoS)和用户满意度,因此如何保障应用的性能已成为混部场景下的关键问题. 着重从应用和集群特征分析(基础)、干扰检测(前提)、单节点资源分配(微观层面策略)和集群作业调度(宏观层面策略)4个方面阐述多应用混部性能保障的相关背景、挑战和关键技术. 在不同的混部场景下,由于应用和集群特征等不同,性能保障工作所面临的挑战和问题复杂度也各异,例如单位资源上混合部署的应用数量会直接影响到搜索资源空间的时间开销,应用的运行方式会影响到共享资源的竞争强度. 因此,从问题复杂度角度出发,从应用和集群特征、资源干扰维度和混部应用个数3个维度对相关研究工作面临的挑战进行讨论和分析. 探讨了面向高密度混部场景应用性能保障方法的发展方向和挑战,认为全栈式的软硬件协同方法是保障高密度混部下应用性能的趋势,该方法有助于全面地提升应用性能的可靠性和数据中心的资源利用率.
Abstract: The huge investment cost and low resource utilization in the datacenter have long been a great concern to cloud providers. To address this issue, a straightforward way is co-locating more applications on the same hardware to improve resource efficiency. However, the shared resource contention caused by co-located applications leads to performance interference, affecting the applications' performance, quality of service (QoS) and user satisfaction. Therefore, how to guarantee the performance of co-located applications has become a key issue in the colocation scenario. We review the research on guaranteeing the performance of co-located applications, including the background of co-location, challenges, and key technologies. The related work is summarized from four aspects: application and cluster characterization (basis), interference detection (premise), server-level resource allocation (micro-level policy), and cluster-level job scheduling (macro-level policy). In addition, due to the diverse characteristics of co-located applications and clusters, research on guaranteeing performance faces different challenges and problem complexity in different co-location scenarios. For example, the number of co-located applications deployed on a unit of resources directly affects the time cost of searching the resource space, and the running mode of applications affects the intensity of contention for shared resources. Therefore, from the perspective of problem complexity, we discuss and analyze the challenges of the related work along three dimensions: cluster and application characteristics, resource interference dimensions, and the number of co-located applications. At the end of this paper, we discuss future research directions and the challenges in the high-deployment-density scenario. We conclude that the software/hardware co-designed full-stack approach is the trend for guaranteeing performance in high-deployment-density clusters, and this approach can help provide predictable performance and high resource efficiency in the datacenter.
-
游戏博弈作为现实世界的一种高度抽象,具有良定义、易检验算法性能等特点,成为目前智能决策研究的热点. 近些年,一系列人工智能方法在游戏中均取得了很好的效果,甚至战胜人类,彰显出人工智能在现实应用中极大的潜力,具有重大的意义. 而且,不断有新的算法在各类游戏博弈中取得重要进展. 如在围棋中,以AlphaGo[1]和AlphaZero[2]为代表的人工智能战胜了李世石、柯洁等人类顶尖高手;Libratus[3]和Pluribus[4]战胜德州扑克(Texas Hold'em)职业冠军;Suphx[5]在天凤麻将平台超越职业选手段位;AlphaStar[6]和OpenAI-Five[7]分别在星际争霸与DOTA2中战胜人类世界冠军等.
目前针对棋牌类游戏存在多种求解方式,例如:1)基于强化学习(reinforcement learning)的方法[1-2, 6-8]采用试错的方式学习智能体在自身观测状态下的最优策略,通过深度神经网络拟合状态动作值函数或状态动作概率分布的方法,根据获取到的经验更新相应神经网络,得到更好的策略;2)基于反事实遗憾最小化(counterfactual regret minimization,CFR)的方法[3-4, 9-10]采用类似在线学习的方式,在每一轮迭代中计算每种自身观测状态下所有动作的反事实遗憾,根据遗憾匹配(regret matching)等遗憾最小化方法生成新一轮策略,尝试降低新策略的遗憾,最终输出每一轮的平均策略;3)通过在线优化的方法,如一阶方法(first order method)[11],将中小规模二人零和博弈问题建模为一个凸优化问题进行求解.
国内扑克游戏,如掼蛋、斗地主等,作为一类非完美信息博弈,相较于目前已有较好算法的德州扑克等游戏博弈有较大差异. 国内扑克游戏具有信息集状态多、动作空间复杂、状态动作难以约简等特点[8],因此大部分现有方法难以应用. 例如用于求解德州扑克的蒙特卡洛反事实遗憾最小化[10,12](Monte Carlo counterfactual regret minimization,MCCFR)算法虽然能缓解在求解德州扑克问题时由于博弈树大小而引发的难以迭代遍历问题[13],但是斗地主或掼蛋这样无法简单进行状态动作空间约简的扑克游戏,其博弈树规模仍过于庞大,无法简单适用;经典的强化学习方法如DQN[14-15],A3C[16-17]等则由于较大的动作空间导致这些算法的网络结构难以较好地拟合扑克类的值函数[8],从而无法在掼蛋、斗地主等国内扑克类游戏中取得较好的效果.
深度蒙特卡洛[8](deep Monte Carlo,DMC)方法是目前针对国内扑克游戏设计人工智能算法所面临问题的主要解决途径之一. DMC方法采用蒙特卡洛采样评估状态动作值函数,其考虑到斗地主等扑克游戏动作不易约简且动作之间由于出牌相似而具有相似关系的特点,通过将动作进行编码与状态一同作为神经网络的输入,借此解决动作空间大且不易约简的问题. 同时DMC方法采用TorchBeast[18]训练框架,通过大量采样来降低训练方差,在斗地主游戏中取得了较好的效果. 但是单纯的DMC方法在面对以掼蛋为代表的更大规模扑克博弈时,依然面临一些问题:1)DMC方法需要大量的训练时间. 采用DMC方法的DouZero系统,在斗地主环境中对抗基于专家策略的监督学习方法时,需要10天时间才能达到50%的对抗胜率. 对于掼蛋这样更复杂的扑克博弈,其信息集数量、信息集大小、动作空间、每局历史信息长度均远超斗地主,因此需要更多的训练时间. 2)DMC方法在实际执行策略过程中总是选择状态动作值最大的第1个动作,因此在实际对局过程中更容易被对手利用. 同时由于DMC训练过程中的高方差,较小的值扰动也可能造成较大的策略差异,从而造成策略质量较大的变化.
为了有效解决训练时间问题,考虑到在常见的扑克博弈中,存在大量的已有知识或领域知识,因此如果能够将现有的先验知识融入算法的训练过程,将大大提升算法的训练效率. 为此文献[19]提出一种暖启动(warm start)方法,该方法针对反事实遗憾最小化算法进行暖启动,通过已有策略,赋予在每个信息集中的动作一个合适的反事实遗憾值,从而实现对于策略求解的加速计算. 然而,暖启动方法需要获取整个博弈信息,从而进行期望值与遗憾值的计算,因此对于大规模扑克博弈需要进行大幅度的状态与动作的约简,而这对于斗地主、掼蛋等国内主流扑克博弈难以实现,相关方法较难直接应用.
因此本文提出了一种软深度蒙特卡洛(soft deep Monte Carlo,SDMC)方法,对以掼蛋为代表的国内扑克类博弈进行求解. 首先针对DMC方法需要大量训练时间的问题,提出通过软启动(soft warm start)方式,结合已有策略知识,在训练过程中进行已有策略决策与SDMC策略模型决策的混合决策,辅助进行策略训练,提升策略收敛速度;然后在实际对战过程中依据策略模型状态动作值预测,通过软动作采样(soft action sample,SAS),缓解DMC方法仅选择最大值动作时,由于策略固定而易被对手利用等问题,增强策略鲁棒性. 最后,本文在掼蛋博弈中进行实验验证. 本文提出的SDMC方法在第2届“中国人工智能博弈算法大赛”取得冠军,在与DMC方法和第1届冠军等其他参赛算法进行对比实验证明了本文所提出的方法在掼蛋扑克博弈中的有效性.
1. 相关背景
1.1 掼蛋扑克的博弈问题建模
掼蛋扑克博弈由于具有无法观测对手手牌、各智能体独立决策的特性,经常被建模为一个部分可观测的马尔可夫决策过程(partially observable Markov decision process,POMDP). 在POMDP中,用 i 表示智能体的编号索引,状态 s 表示当前实际状态. 在每一个时间点 t ,每一个智能体 i 都会观测到一个观测状态 {o}_{t}^{\;i}=Z\left({s}_{t},i\right) ,其中函数 Z 为观测函数. 每当智能体 i 观测到观测状态 {o}_{t}^{\;i} 时,都可以选择一个动作 {a}^{i} . 因此,智能体 i 的策略 {\pi }_{i} 可以看作一个动作观测历史(action-observation history,AOH) {\tau }_{t}^{\;i}=\{{o}_{0}^{i},{a}_{0}^{i},{o}_{1}^{i},{a}_{1}^{i},… ,{o}_{t-1}^{i},{a}_{t-1}^{i},{o}_{t}^{i}\} 的函数. 每当所有智能体执行一个动作后,当前状态 {s}_{t} 会根据环境转移函数转换到新的状态 {s}_{t+1}\sim \mathcal{P}\left({s}_{t+1}|{s}_{t},\boldsymbol{a}\right) ,其中 \boldsymbol{a}=({a}^{0},{a}^{1},… ) 为所有智能体的联合动作,每个智能体 i 会收到环境的奖励 {r}_{t}^{\;i}={\mathcal{R}}^{i}({s}_{t},\boldsymbol{a}) . 因此当前状态下的轨迹可用 {\tau }_{t}=\left\{{s}_{0},{\boldsymbol{a}}_{0},{s}_{1},{\boldsymbol{a}}_{1},… ,{s}_{t-1},{\boldsymbol{a}}_{t-1},{s}_{t}\right\} 表示. 在POMDP中,智能体 i 的目标在于最大化自身奖励 {J}_{\pi }={E}_{\tau \sim P\left(\tau \right|\pi )}\left[R\left(\tau \right)\right] ,其中函数 R\left(\tau \right)=\displaystyle\sum_{t}{{\gamma }^{t}r}_{t} 为智能体收到的折扣累计奖赏, \gamma 为累计折扣因子.
掼蛋扑克博弈具有序贯决策特性,即每个AOH下最多有1个智能体进行决策,状态转移函数可约简为 {s}_{t+1}=\mathcal{P}\left({s}_{t+1}|{s}_{t},{a}^{i}\right) ,其中 {a}^{i} 为当前智能体的决策. 扑克博弈的关键信息通常由2部分组成:手牌、已打出牌等当前状态信息,以及所有参与博弈的智能体的历史动作信息,且可以通过当前观测状态与历史动作信息对牌局进行复盘,故掼蛋博弈的AOH可由 {\tau }_{t}^{\;i}=\left\langle{Z\left({s}_{t},i\right),{H}_{t}}\right\rangle 进行表示,其中 {h}_{t}^{\;i}\in {H}_{t} 为智能体 i 截至时刻 t 的动作历史 \left\{{a}_{0}^{i},{a}_{1}^{i},… ,{a}_{t}^{i}\right\} . 掼蛋扑克博弈不同于普通的POMDP,其奖励值往往仅存在于终止状态集合 {S}_{\mathrm{t}\mathrm{e}\mathrm{r}\mathrm{m}\mathrm{i}\mathrm{n}\mathrm{a}\mathrm{l}} ,即对于非终止状态 {s}_{t} ,所有智能体获取到的奖励为0: \forall i,{s}_{t}\notin {S}_{\mathrm{t}\mathrm{e}\mathrm{r}\mathrm{m}\mathrm{i}\mathrm{n}\mathrm{a}\mathrm{l}},{\mathcal{R}}^{i}\left({s}_{t},\boldsymbol{a}\right)=0 ,因此累计折扣因子通常可以设置为1.
1.2 相关研究进展
本节首先介绍现有扑克类博弈的求解方法,并分析各个方法的优缺点;其次着重介绍在斗地主中的最新方法,从而更好地引出本文提出的SDMC方法.
1.2.1 扑克类游戏求解方法
扑克类游戏作为一种天然的非完美信息博弈,已经具有悠久的研究历史.
针对竞争型的扑克类游戏,如德州扑克,通常采取求解纳什均衡的方式. 其中以CFR[9]为代表的方法采用自博弈的方式进行训练:在训练的每一轮中,个体与上一轮训练出的策略进行对抗,依靠遍历整棵博弈树的方式计算策略的遗憾,并通过最小化遗憾的方式最终求解博弈的纳什均衡. 但受制于CFR的遍历过程,随着博弈规模的增加,遍历整棵博弈树需要极大的时间与空间开销,因此MCCFR[10]方法采取采样的方式更新博弈策略的遗憾,降低算法开销. 虽然CFR类方法在以德州扑克为代表的博弈中取得惊人的效果,但其遍历与存储开销限制了其在大规模环境下的应用,往往需要结合博弈约简(abstraction)[20]方法降低博弈树规模,因此难以处理掼蛋、斗地主等难以约简的非完美信息博弈.
而对合作型的扑克类游戏如Hanabi,则有着更为多样的求解范式. 不同于竞争型博弈下致力于求解纳什均衡而将其看作一个优化问题,合作型博弈也可以建模为一类学习问题[21]. 其中贝叶斯动作编码器(Bayesian action decoder,BAD)[22-23]方法在Hanabi中取得了最为理想的成果. BAD方法使用深度强化学习的方式在公共信念中探索合适的策略,但此类方法仅面向纯合作类场景选取确定性动作,无法简单适用到如掼蛋这样兼具竞争与合作的混合场景中.
针对掼蛋、斗地主等牌类博弈状态、动作空间复杂不易求解等问题,You等人[24]提出通过一种组合Q网络(combinatorial Q-network)的方式,将决策过程分为组牌和出牌2个步骤,但组牌过程耗时巨大,不利于在大规模环境下的训练. DeltaDou方法[25]通过贝叶斯推断的方式推理对手的卡牌,之后采用类似于AlphaZero的方式进行蒙特卡洛树搜索,从而对策略进行训练,但仍需约20天的训练时间才能在斗地主中达到专家水平.
1.2.2 深度蒙特卡洛方法
DMC方法由Zha等人[8]提出,是应用在斗地主环境下DouZero AI系统中的核心算法. DMC方法考虑到斗地主等扑克游戏动作不易约简且动作之间存在相似关系的特点,通过将动作进行编码与状态一同作为神经网络的输入,解决现有其他方法在大规模具有相似动作的空间下不适用的问题. 具体来讲,DMC方法尝试训练神经网络 V ,使其输出值与实际的值函数尽可能地相近:
\begin{array}{c}{\theta }^{\;*}=\underset{\theta }{\mathrm{arg}\;{\mathrm{min}}}\left| \left|Q\left({\tau }_{t}^{\;i},{a}^{i}\right)-{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)\right| \right|, \end{array} (1) 其中 \theta 为神经网络 V 的参数,且
\begin{array}{c}Q\left({\tau }_{t}^{\;i},{a}^{i}\right)={E}_{\tau \sim P\left(\tau |\pi ,{\tau }_{t}^{\;i},{a}^{i}\right)}\left[R\left(\tau \right)\right]\end{array} (2) 为在当前动作观测历史 {\tau }_{t}^{\;i} 下选择动作 {a}^{i} 后,依据当前策略 \pi 所能获得的期望奖励值.
在训练过程中,DMC方法采用 \epsilon -贪心的策略选择方法,即给定神经网络 V 及其参数 \theta ,训练过程中选择的策略为
\begin{array}{c}{\pi }_{\epsilon }\left({\tau }_{t}^{\;i},{a}^{i}\right)=\left(1-\epsilon \right)\mathbb{I}\left({a}^{i}={a}^{*}\right)+\dfrac{\epsilon }{\left|A\right|}, \end{array} (3) 其中 \left|A\right| 为当前可选动作数量, {a}^{*}=\underset{{a}'}{\mathrm{arg}\;{\mathrm{max}}}\,{V}_{\theta }\left({\tau }_{t}^{\;i},{a}'\right) , \mathbb{I}\left(\cdot \right) 为指示函数(indicator function),仅在参数为真时结果为1,否则为0,argmax函数选择值最大的第1个动作.
在实际博弈过程中,DMC方法直接选择值函数最大的动作,即在动作观测历史 {\tau }_{t}^{\;i} 与可选动作集合 A 下,策略为
\begin{array}{c}\pi \left({\tau }_{t}^{\;i},{a}^{i}\right)=\mathbb{I}\left({a}^{i}=\underset{{a}'}{\mathrm{arg}\;{\mathrm{max}}}\,{V}_{\theta }\left({\tau }_{t}^{\;i},{a}'\right)\right). \end{array} (4)
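下面用一个示意性的Python片段说明式(3)(4)所述的动作选择方式(并非DouZero的实际实现,value_fn 等接口为作者为说明而假设的,其中 value_fn(aoh, a) 对应 {V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right) ):

```python
import random

def dmc_select_action(value_fn, aoh, actions, epsilon=0.0):
    """式(3)的epsilon-贪心:以epsilon的概率在可选动作中均匀随机探索,
    否则选择评估值最大的第1个动作;epsilon=0时即式(4)的贪心策略."""
    if epsilon > 0 and random.random() < epsilon:
        return random.choice(actions)
    best = actions[0]
    for a in actions[1:]:
        if value_fn(aoh, a) > value_fn(aoh, best):
            best = a  # 严格大于才更新,保证取到值最大的第1个动作
    return best
```

当多个动作评估值并列最大时,该写法与原文"选择值最大的第1个动作"一致,始终返回最先出现的最大值动作.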
2. 软深度蒙特卡洛方法
本节将介绍SDMC方法,SDMC方法包含软启动与软动作采样2个过程,解决现有方法在以掼蛋为例的扑克博弈中的问题. 同时,为了更好地进行深度学习训练,本文亦创新性地提出了一种针对深度学习的掼蛋扑克博弈编码方法.
2.1 软启动蒙特卡洛方法
DMC方法将动作观测历史与可选择动作进行结合,作为神经网络输入,通过蒙特卡洛采样方式对值函数进行拟合,采样策略根据神经网络预测值采用 \epsilon -贪心的策略.
DMC方法采用随机初始化的网络,并通过自博弈的方式不断自我对战产生样本、更新网络,因此在训练初期,DMC自博弈产生的博弈轨迹样本的值更多是面对使用随机策略的对手时的动作值,由此产生的策略往往并不具有实用价值,只是训练迭代过程中的中间策略,为后续更强策略的训练提供基础.
为了加快策略的训练过程并尽量降低因加速训练过程而产生的影响,本文提出了软启动DMC方法,通过软启动的方式进行训练,借助已有策略,尽量加速训练过程.
具体地,对于已有策略 {\pi }_{\mathrm{E}} ,软启动DMC借鉴了预训练方法的思想. 常见的预训练方式通过训练一个神经网络模型 {V}_{\theta } ,使其输出预测值与已有策略的值函数尽可能相近,即
\begin{array}{c}{\theta }^{\;*}=\underset{\theta }{\mathrm{arg}\;{\mathrm{min}}}\left| \left|{Q}_{\mathrm{E}}\left({\tau }_{t}^{\;i},{a}^{i}\right)-{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)\right| \right|,\end{array} (5) 其中 {Q}_{\mathrm{E}}\left({\tau }_{t}^{\;i},{a}^{i}\right) 为已有策略的值函数.
但是已有策略的值函数由于掼蛋扑克博弈规模过大通常难以直接获取,因此可以通过蒙特卡洛采样的方式进行自博弈模拟评估:
\begin{array}{c}{Q}_{\mathrm{E}}\left({\tau }_{t}^{\;i},{a}^{i}\right)={E}_{\tau \sim P\left(\tau |{\pi }_{\mathrm{E}},{\tau }_{t}^{\;i},{a}^{i}\right)}\left[R\left(\tau \right)\right]. \end{array} (6) 但由于已有策略具有先验知识,有较多的动作并不会主动选择,故直接自博弈评估时可能存在较多的动作没有给出评估值,从而出现如过估计[26]等问题. 而通过在已有策略 {\pi }_{\mathrm{E}} 中加入 \epsilon -贪心的方式可以部分缓解该问题,但仍面临若 \epsilon 设置较大,则自博弈的评估并非已有策略值;而若 \epsilon 设置较小,则由于探索样本比例较低,需要大量的训练才可进行较好的拟合,出现与最初降低训练时间需求的初衷相违背的问题.
考虑到由于已有策略并非一定最优,经过初始化过后仍需采取自博弈的方式进行训练,因此软启动考虑在训练过程中融入已有策略而非如文献[19]直接去拟合已有策略方式,如图1(a)所示. 这样既融入了已有策略对当前模型训练进行加速,同时又避免了普通的暖启动方法所面临的过估计等问题.
具体来讲,软启动结合已有策略与当前模型生成策略的自博弈评估值 {\widetilde{Q}}_{\mathrm{E}}\left({\tau }_{t}^{\;i},{a}^{i}\right) 代替式(5)中的已有策略评估值 {Q}_{\mathrm{E}}\left({\tau }_{t}^{\;i},{a}^{i}\right) ,即
\begin{array}{c}{\theta }^{\;*}=\underset{\theta }{\mathrm{arg}\;{\mathrm{min}}}\left| \left|{\widetilde{Q}}_{\mathrm{E}}\left({\tau }_{t}^{\;i},{a}^{i}\right)-{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)\right| \right|. \end{array} (7) 其中软启动所采用的评估值由策略 {\widetilde{\pi }}_{\mathrm{E}} 生成, {\widetilde{\pi }}_{\mathrm{E}} 结合了式(4)中当前模型生成的策略 \pi 与已有策略 {\pi }_{\mathrm{E}} :
\begin{array}{c}{\widetilde{\pi }}_{\mathrm{E}}\left({\tau }_{t}^{\;i},{a}^{i}\right)=\dfrac{\epsilon }{\left|A\right|}+\left(1-\omega -\epsilon \right)\pi \left({\tau }_{t}^{\;i},{a}^{i}\right)+\omega {\pi }_{\mathrm{E}}\left({\tau }_{t}^{\;i},{a}^{i}\right),\end{array} (8) 为混合了2种决策模型的 \epsilon -贪心策略,其中权重 \omega 为软启动参数,可随着训练的进行而衰减.
具体算法流程如算法1所示.
算法1. 软启动蒙特卡洛.
初始化:初始化经验缓存 {\left\{{B}_{i}\right\}}_{i=1}^{n} 与经验缓存 {\left\{{D}_{i}\right\}}_{i=1}^{n} 为空,其中 n 为智能体数量;随机初始化SDMC 神经网络 {\left\{{V}_{i}\right\}}_{i=1}^{n} 的参数 {\theta }_{i} .
① for episode=1 to max_episodes
② for t=0 to T
③ i\leftarrow 当前行动智能体编号;
④ 智能体 i 观测到动作观测历史 {\tau }_{t}^{\;i} ;
⑤ 由式(8)计算软启动策略 {\widetilde{\pi }}_{\mathrm{E}}^{i} ;
⑥ 选取动作 {a}^{i}\sim{\widetilde{\pi }}_{\mathrm{E}}^{i} ;
⑦ 存储样本 \left\{{\tau }_{t}^{\;i},{a}^{i}\right\} 至经验缓存 {B}_{i} ;
⑧ end for
⑨ 获得环境奖励 \boldsymbol{r}=({r}_{1},{r}_{2},\dots ,{r}_{n}) ;
⑩ for i=1 to n
⑪ for \left\{{\tau }_{t}^{\;i},{a}^{i}\right\} in {B}_{i}
⑫ 存储样本 \left\{{\tau }_{t}^{\;i},{a}^{i},{r}_{i}\right\} 至经验缓存 {D}_{i} ;
⑬ end for
⑭ 清空经验缓存 {B}_{i} ;
⑮ while {D}_{i}.length > batch\_size
⑯ 从 {D}_{i} 采样并更新神经网络 {V}_{i} ;
⑰ end while
⑱ end for
⑲ end for
⑳ 输出:神经网络模型 {\left\{{V}_{i}\right\}}_{i=1}^{n} .
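算法1第⑤⑥步按式(8)的混合策略采样动作,可用如下示意性Python片段表达(并非本文的实际实现,函数与参数名均为作者为说明而假设的,model_probs 与 expert_probs 分别对应 \pi 与 {\pi }_{\mathrm{E}} 在各可选动作上的概率):

```python
import random

def soft_start_sample(actions, model_probs, expert_probs, omega, epsilon):
    """按式(8)将均匀探索、当前模型策略与已有策略
    以 epsilon、1-omega-epsilon、omega 的权重混合后采样动作."""
    mixed = [epsilon / len(actions)
             + (1 - omega - epsilon) * p
             + omega * q
             for p, q in zip(model_probs, expert_probs)]
    # 混合概率各项之和为1,可直接作为采样权重
    return random.choices(actions, weights=mixed, k=1)[0]
```

其中权重 \omega 可随训练的进行而衰减,当 \omega 衰减至0时,式(8)退化为普通的 \epsilon -贪心策略.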
对于掼蛋这类大规模扑克博弈,可以通过TorchBeast[18]框架进行并行训练. 具体对于算法1来说,每一个actor执行算法1中 ①~⑭步,并在每次循环开始之前与learner同步模型;learner执行⑮~⑰步.
2.2 软动作采样
传统DMC方法在使用模型决策时一般选择值最大的动作,即对于动作观测历史 {\tau }_{t}^{\;i} 和可选动作集合 A ,由式(4)选择动作. 但仅选择最大值动作容易受到训练方差等微小扰动的干扰,导致策略大幅变化,而评估值较为接近的动作都有可能是最好的动作;且在实际使用过程中,完全确定性的策略较易被对手推测出自身手牌等信息. 因此本文采用带有软动作采样(soft action sample,SAS)的动作选择方式,流程如图1(b)所示:在保证所选动作在当前模型评估下评估值变化不大的前提下,舍弃可选动作集合中值较低的动作,保留评估值与最大值接近的动作,构造备选动作集合 \hat{A} ,并对备选动作集合 \hat{A} 中每一个动作的值进行softmax处理,按比例分配被选择的概率:
\begin{array}{c}P\left({a}^{i}|{\tau }_{t}^{\;i},\theta \right)={\mathrm{e}}^{{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)}/\displaystyle\sum\limits _{{a}'\in \hat{A}}{\mathrm{e}}^{{V}_{\theta }\left({\tau }_{t}^{\;i},{a}'\right)},\end{array} (9) 最终根据概率分布 P\left(a|{\tau }_{t}^{\;i},\theta \right) 选择动作. 其中备选集合 \hat{A} 的选择方式可通过设定最低阈值的形式,即对于阈值 {v}_{{\tau }_{t}^{\;i}} ,备选集合 \hat{A} 为
\begin{array}{c}\hat{A}=\left\{{a}^{i}|\forall {a}^{i}\in A,{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)\ge {v}_{{\tau }_{t}^{\;i}}\right\}. \end{array} (10) 对于阈值的选择需要保证其与最大值尽量接近,可根据当前所有动作的值的分布选择. 具体而言,可以通过设置较小的阈值权重 {\omega }'选择 {v}_{{\tau }_{t}^{\;i}} :
\begin{split} {v}_{{\tau }_{t}^{\;i}}=\; &\underset{{a}^{i}}{\mathrm{max}}{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)- {\omega }'\left(\underset{{a}^{i}}{\mathrm{max}}{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)-\underset{{a}^{i}}{\mathrm{min}}{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)\right)=\\ & \left(1-{\omega }'\right)\underset{{a}^{i}}{\mathrm{max}}{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right)+{\omega }'\underset{{a}^{i}}{\mathrm{min}}{V}_{\theta }\left({\tau }_{t}^{\;i},{a}^{i}\right). \\[-1pt] \end{split} (11) 原则上,阈值的选择应保证将所有最优动作筛选出来,并摒弃所有非最优动作. 若能精确估计所有动作的 Q 值,则权重 {\omega }'应设为0,即仅选择最优的动作. 但由于训练过程中的采样带来的方差扰动,使得最优动作的 Q 值并非准确值,故而需采用较小的权重. 随着训练过程的增加, Q 值愈发准确,可适当降低权重大小.
软动作采样算法流程如算法2所示.
算法2. 软动作采样.
输入:决策模型 {V}_{\theta } ,动作观测历史 {\tau }_{t}^{\;i} ,可选动作 集合 A ,阈值权重 {\omega }';
输出:模型最终选择动作 {a}^{i} .
① 由式(11)计算阈值 v ;
② \hat{A}\leftarrow \left\{{a}'|\forall {a}'\in A,{V}_{\theta }\left({\tau }_{t}^{\;i},{a}'\right)\ge v\right\} ;
③ \boldsymbol{P}\leftarrow \left(0,0,\dots ,0\right) ; /*初始化每个动作的概率为0*/
④ for {a}' in \hat{A}
⑤ 由式(9)计算每个动作的概率 \mathit{P}\left({a}'\right) ;
⑥ end for
⑦ {a}^{i}\sim\boldsymbol{P} ; /*根据概率分布 \boldsymbol{P} 采样一个动作 {a}^{i} */
⑧ 返回结果 {a}^{i} .
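算法2的核心计算(式(9)~(11))可用如下示意性Python片段表达(并非本文的实际实现,函数名为作者假设,输入为各可选动作的评估值列表):

```python
import math

def soft_action_sample_probs(values, omega=1/500):
    """依式(11)计算阈值,依式(10)构造备选动作集合,
    再依式(9)对备选动作的评估值做softmax,返回各动作被选中的概率."""
    v_max, v_min = max(values), min(values)
    threshold = (1 - omega) * v_max + omega * v_min      # 式(11)
    kept = [v if v >= threshold else None for v in values]  # 式(10)
    z = sum(math.exp(v) for v in kept if v is not None)
    # 备选集合之外的动作被选中的概率为0
    return [math.exp(v) / z if v is not None else 0.0 for v in kept]
```

实际对战时按返回的概率分布采样动作即可;当阈值权重取0时,该方式退化为仅在取得最大评估值的动作之间采样.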
2.3 掼蛋扑克博弈编码框架
对状态信息进行编码是深度强化学习在扑克环境中进行应用的重要组成部分,本节从DouZero[8]针对斗地主的编码方法出发,根据掼蛋扑克游戏与斗地主的不同,创新性地提出一种适用于掼蛋扑克游戏的编码框架.
对于掼蛋扑克牌局状态的编码应至少包含3部分:私有信息、当前可出牌与公共信息. 私有信息一般为自己的手牌;当前可出牌包含所有可能的出牌动作;公共信息包含出牌记录与牌局信息,其中牌局信息指掼蛋中的牌局等级、当前其余玩家手牌数量等当前对局的信息. 在此基础上可以添加额外的信息辅助深度网络进行训练.
状态的编码均采用1位有效编码的形式. 手牌信息根据游戏使用的牌数量构建不同大小的空矩阵,若使用 n 副牌则构建 n\times (4\times 13+2) 大小的矩阵,其中 4\times 13+2 表示1副牌,4为花色索引,13为点数索引,2表示大小王,当拥有某张手牌时,对应位置的矩阵数值置为1. 对于标准掼蛋规则,由于使用2副牌,因此 n=2 ,具体卡牌编码如图2“卡牌表示”部分所示.
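按上述编码方式,手牌的 n\times (4\times 13+2) 矩阵可用如下示意性Python片段构造(并非本文的实际实现,花色与点数的索引顺序、BJ/RJ 等牌面记号均为作者为说明而假设的):

```python
SUITS = ["S", "H", "C", "D"]  # 花色索引(假设的顺序)
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]

def encode_hand(cards, n_decks=2):
    """将手牌编码为 n_decks x (4*13+2) 的0/1矩阵:
    前52列按 花色*13+点数 索引,第52/53列分别为小王(BJ)与大王(RJ);
    同一张牌的第k份记在第k行."""
    mat = [[0] * (4 * 13 + 2) for _ in range(n_decks)]
    for card in cards:
        if card == "BJ":
            col = 52
        elif card == "RJ":
            col = 53
        else:
            col = SUITS.index(card[0]) * 13 + RANKS.index(card[1:])
        row = sum(m[col] for m in mat)  # 该牌已出现的份数
        mat[row][col] = 1
    return mat
```

标准掼蛋使用2副牌,故默认 n_decks=2,同一张牌最多出现2份.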
出牌动作的编码类似手牌编码方式,分别对出牌动作的类型、大小与所使用的牌进行编码. 对类型与大小的编码既可以区分相同的出牌具有不同出牌类型与大小的情况(如掼蛋中部分含逢人配的顺子等),也可以在编码出牌记录时区分无出牌记录的0填充编码与"过牌"(PASS)的编码,并为神经网络提供额外的辅助信息,如图2所示.
对于出牌记录的处理主要包括对出牌动作的编码与出牌动作结构的组织. 对于SDMC与DMC等方法,出牌记录信息会被输入到循环神经网络,因此可采取序列式结构对出牌进行组织,即从下家出牌到智能体自己出牌为止1轮的出牌编码进行拼接.
3. 实验与结果分析
本节对本文提出的SDMC方法进行实验分析,使用掼蛋扑克环境,衡量SDMC方法中软启动的加速训练效果,并分别与第1届、第2届“中国人工智能博弈算法大赛”的参赛算法对比,证明SDMC方法的有效性.
3.1 掼蛋扑克介绍
掼蛋是国内一种广泛流传的扑克类博弈. 掼蛋博弈使用2副扑克牌,共108张牌,采取2对2的模式对抗,其中每个博弈玩家与对家为1支队伍进行对抗. 本文采用第2届“中国人工智能博弈算法大赛”中的掼蛋规则,下面简要介绍.
1) 等级. 1局掼蛋博弈可以分为若干小局,每小局根据双方队伍等级中最高的决定当前牌局等级,并依据小局胜负情况更新双方的等级. 掼蛋对局中初始双方等级为2,之后依次为3,4,5,6,7,8,9,10,J,Q,K,A.
2) 升级. 每小局对战结束后,仅第1个出完牌的玩家(称为上游)所在队伍可以升级,升级数依据队友出完牌的顺序决定:若队友第2个出完牌(称为二游),则升3级;第3个出完牌(称为三游),则升2级;若队友最后1个出完牌(称为下游),则只升1级.
3) 获胜条件. 当一方队伍到达等级A,并且队伍中一人获得上游、另一人获得二游或者三游时,该队伍获胜.
4) 特殊牌. 掼蛋中和当前牌局等级相同的牌为级牌,其中红桃级牌称为逢人配,可当作任意牌与其他花色牌组合使用.
5) 牌型. 掼蛋中牌型如下:单张、对子、三连对、三同张、二连三、三带二、顺子、同花顺、炸弹、天王炸. 其中炸弹张数多者为大,张数相同则按照点数排序;同花顺大于任意不超过5张的炸弹;天王炸(2张大王和2张小王)为最大牌型.
6) 牌点大小. 掼蛋中牌点从大到小依次为:大王、小王、级牌、A、K、Q、J、10、9、8、7、6、5、4、3、2. A在搭配成三连对、二连三、顺子、同花顺时,可视作1.
7) 进贡、还贡. 从第2小局开始,由上一小局的下游向上游进贡,挑选1张除红桃级牌之外最大的牌给上游;上游选择1张不大于10的牌还贡给下游. 如果上一小局一方队伍获得三游和下游,则该队伍2人均向对方队伍分别进贡并接受还贡. 如进贡方有2张大王,则可以不进贡.
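以第6)条牌点大小为例,给定当前牌局等级,可用如下示意性Python片段生成点数排序并比较单张大小(并非本文的实现,仅覆盖普通单张的点数比较,未考虑逢人配与花色,函数名为作者假设):

```python
def point_order(level):
    """按3.1节第6)条,返回当前等级下从大到小的牌点序列:
    大王(RJ)、小王(BJ)、级牌,之后为A,K,...,2(去掉级牌本身)."""
    base = ["A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2"]
    return ["RJ", "BJ", level] + [r for r in base if r != level]

def beats(a, b, level):
    """单张点数比较:序列中位置越靠前的牌点越大."""
    order = point_order(level)
    return order.index(a) < order.index(b)
```

例如当牌局等级为2时,级牌2大于A;而等级为其他点数时,2仍为最小的普通牌点.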
掼蛋由于使用2副牌,因此具有更高的求解复杂度,对算法训练效率提出了更大挑战. 仅第1轮发牌后的信息集数量级即约为 {10}^{20} ,远超斗地主等扑克博弈(斗地主第1轮发牌后的数量级约为 {10}^{14} ,如考虑去除斗地主的花色影响则约为 {10}^{8} ). 同时考虑到各个玩家拥有更多的手牌所产生的指数级增长的可选动作,以及等级、逢人配等因素,掼蛋实际信息集数量与大小远超斗地主等扑克博弈.
3.2 实现细节
本节主要描述了基于SDMC的掼蛋智能体实现细节,包含整体架构、状态编码方式与奖励设计3个方面.
1) 整体架构细节
基于SDMC的掼蛋智能体主要由2部分组成:掼蛋贡、还牌规则决策模块与出牌的SDMC决策模块. 其中所使用的掼蛋贡、还牌规则决策模块采用第1届“中国人工智能博弈算法大赛”的冠军规则. 出牌的SDMC决策模块中,式(11)中 {\omega }'=\dfrac{1}{500} .
SDMC的网络结构与文献[8]相似,均采用LSTM网络处理历史动作,通过6层全连接网络输出动作评估值.
2) 状态编码方式
掼蛋编码方式采用2.3节编码框架,每一部分的编码大小如表1所示.
表 1 掼蛋环境状态编码
Table 1. State Representation of GuanDan Games
类型 | 含义 | one-hot编码长度
出牌动作 | 卡牌的矩阵表示 | 108
出牌动作 | 类型的矩阵表示 | 10
出牌动作 | 大小的矩阵表示 | 15
私有信息 | 手牌的卡牌矩阵表示 | 108
私有信息 | 手牌中逢人配数量 | 3
公共信息 | 当前等级 | 13
公共信息 | 其他玩家剩余手牌 | 108
公共信息 | 上家已出卡牌矩阵表示 | 108
公共信息 | 对家已出卡牌矩阵表示 | 108
公共信息 | 下家已出卡牌矩阵表示 | 108
公共信息 | 最近一次的出牌动作 | 133
公共信息 | 上家最近一次出牌动作 | 133
公共信息 | 对家最近一次出牌动作 | 133
公共信息 | 下家最近一次出牌动作 | 133
公共信息 | 上家剩余手牌数量 | 28
公共信息 | 对家剩余手牌数量 | 28
公共信息 | 下家剩余手牌数量 | 28
公共信息 | 最近4轮出牌的联合表示 | 532
3) 奖励设计
如3.1节介绍的,掼蛋胜负取决于队伍积分是否超过A,因此通过判断大局胜利的方式给予2队智能体奖励,可能会导致较差的小局内策略获得正奖励,因此在掼蛋环境训练过程中,当小局结束时,对双方队伍给定奖励,奖励分配方式如表2所示,1—1—2—2表示完牌顺序分别为队伍1、队伍1、队伍2、队伍2的选手.
表 2 掼蛋环境训练中的奖励设计
Table 2. Reward Functions Designed in GuanDan Games
完牌顺序 | 队伍1获得奖励 | 队伍2获得奖励
1—1—2—2 | +3 | −3
1—2—1—2 | +2 | −2
1—2—2—1 | +1 | −1
2—1—1—2 | −1 | +1
2—1—2—1 | −2 | +2
2—2—1—1 | −3 | +3
设计的奖励方式基本与掼蛋晋级方式相同,但当掼蛋遇到等级A时有所不同:由于当一方队伍到达等级A时,想要获胜至少要有一个队友第1个完牌并且另一个队友不能最后一个完牌,因此以队伍1达到等级A为例,完牌顺序1—1—2—2与完牌顺序1—2—1—2的实际收益相同,剩余4种完牌顺序的实际收益也相同. 考虑到SDMC方法会尽量选择高评估值的动作,因此不对奖励函数进行修正在很大程度上并不会影响训练策略的正确性,故智能体在训练过程中并未对等级A进行特殊处理.
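表2的奖励分配可用如下示意性Python函数表示(并非本文的实际实现,函数名为作者假设,完牌顺序以各玩家所属队伍编号的元组表示):

```python
def small_round_reward(finish_order):
    """按表2依据完牌顺序返回(队伍1奖励, 队伍2奖励).
    finish_order 形如 (1, 1, 2, 2),表示4名玩家完牌时所属的队伍."""
    table = {
        (1, 1, 2, 2): 3, (1, 2, 1, 2): 2, (1, 2, 2, 1): 1,
        (2, 1, 1, 2): -1, (2, 1, 2, 1): -2, (2, 2, 1, 1): -3,
    }
    r1 = table[tuple(finish_order)]
    return r1, -r1  # 两队奖励零和
```

由于每小局必有一方先出完牌,表中6种完牌顺序即覆盖所有情况,两队奖励互为相反数.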
3.3 实验结果
本节通过与不同算法的对比验证SDMC的效果. 具体而言,分别与第1届“中国人工智能博弈算法大赛”的冠军(1st Champion)与前2届16强(1st Top 16和2nd Top 16)进行比较. 实验中选取2种算法作为2支队伍,采取2种评估指标,分别评估对战双方团队的胜利与净胜小分情况.
我们首先验证了经过30天训练的SDMC与其他方法的最终胜率与净胜小分,每场对战均进行500局,并重复验证5次,最终胜率与标准差如表3所示. 表3中的数据表示算法1对阵算法2时的胜率,如SDMC对抗2nd Top 16的胜率为97.5%;算法的排名按照击败(胜率>50%)其他算法的顺序进行排序.
表 3 不同算法对抗胜率
Table 3. Winning Percentage of Different Algorithms Against Each Other
单位:%
算法1\算法2 | SDMC(本文) | SDMC-无SAS | 1st Champion | 2nd Top 16 | 1st Top 16
SDMC(本文) | \ | 51.0±1.0 | 92.1±1.0 | 97.5±0.7 | 100.0±0.0
SDMC-无SAS | 49.0±1.0 | \ | 91.8±0.9 | 97.2±0.9 | 100.0±0.0
1st Champion | 7.9±1.0 | 8.2±0.9 | \ | 62.2±1.0 | 97.3±0.6
2nd Top 16 | 2.5±0.7 | 2.8±0.9 | 37.8±1.0 | \ | 94.9±0.8
1st Top 16 | 0.0±0.0 | 0.0±0.0 | 2.7±0.6 | 5.1±0.8 | \
注:"\"表示2种相同算法之间不对抗,没有对抗胜率.
可以看到,SDMC对战第1届比赛冠军以及其他参赛算法的胜率均大于90%,对抗2nd Top 16和1st Top 16时胜率甚至分别达到97.5%和100%,因此可以认为SDMC的效果显著优于其他算法. 同时我们也对比了不使用SAS的SDMC(SDMC-无SAS)与使用SAS的SDMC的效果:对于测试的对手,使用SAS能够在一定程度上提升SDMC的效果.
同时对于净胜小分,我们详细列出了各种算法之间对战过程中双方小分的获得情况与净胜分,如表4所示. 可以看到SDMC在对战其他参赛方法时具有很高的3分得分率,即在掼蛋中有很高的双上(队伍分别以上游和二游完牌)概率:在对战1st Champion时达到44.3%,对战2nd Top 16时达到48.4%,对战1st Top 16时达到68.4%,因此对战过程中的净胜分非常高,对战1st Champion时500局净胜小分达到4955.6分. 同时观察到,在SDMC与SDMC-无SAS的直接对抗中,无论是3分、2分、1分的占比还是净胜分,均是SDMC更胜一筹;且在对战除2nd Top 16外的其他对手时,SDMC的净胜分也均高于SDMC-无SAS.
表 4 不同算法对抗时得分
Table 4. Score of Different Algorithms Against Each Other
算法 | 对战算法 | 3分占比/% | 2分占比/% | 1分占比/% | 净胜小分(500局)
SDMC-无SAS | SDMC(本文) | 23.0±0.7 | 9.8±0.3 | 13.7±0.6 | −95.6±233.9
1st Champion | SDMC(本文) | 12.3±0.2 | 6.0±0.4 | 13.9±0.6 | −4955.6±126.5
2nd Top 16 | SDMC(本文) | 8.5±0.7 | 5.7±0.6 | 12.8±0.4 | −5869.4±96.7
1st Top 16 | SDMC(本文) | 2.0±0.2 | 1.7±0.2 | 7.5±0.2 | −7415.2±47.1
SDMC(本文) | SDMC-无SAS | 23.4±0.6 | 11.3±0.4 | 15.6±0.8 | 95.6±233.9
1st Champion | SDMC-无SAS | 12.4±0.4 | 6.2±0.4 | 13.8±0.3 | −4907.0±80.0
2nd Top 16 | SDMC-无SAS | 8.8±0.4 | 5.7±0.2 | 12.2±0.4 | −5885.4±93.8
1st Top 16 | SDMC-无SAS | 1.9±0.2 | 1.8±0.2 | 7.4±0.3 | −7387.4±14.9
SDMC(本文) | 1st Champion | 44.3±1.1 | 9.6±0.2 | 13.8±0.7 | 4955.6±126.5
SDMC-无SAS | 1st Champion | 44.2±0.8 | 9.8±0.3 | 13.7±0.6 | 4907.0±80.0
2nd Top 16 | 1st Champion | 23.3±0.7 | 7.2±0.3 | 16.3±0.6 | −1213.0±128.4
1st Top 16 | 1st Champion | 8.7±0.4 | 3.8±0.4 | 14.4±0.7 | −5997.4±140.3
SDMC(本文) | 2nd Top 16 | 48.4±0.4 | 12.0±0.3 | 12.6±0.3 | 5869.4±96.7
SDMC-无SAS | 2nd Top 16 | 48.8±0.7 | 12.0±0.2 | 12.6±0.4 | 5885.4±93.8
1st Champion | 2nd Top 16 | 30.2±0.4 | 9.2±0.4 | 13.7±0.4 | 1213.0±128.4
1st Top 16 | 2nd Top 16 | 10.1±0.3 | 5.1±0.3 | 15.7±0.5 | −5367.6±63.4
SDMC(本文) | 1st Top 16 | 68.4±0.5 | 13.5±0.6 | 7.0±0.1 | 7415.2±47.1
SDMC-无SAS | 1st Top 16 | 68.1±0.5 | 13.9±0.5 | 7.0±0.3 | 7387.4±14.9
1st Champion | 1st Top 16 | 50.5±1.2 | 11.5±0.4 | 11.0±0.4 | 5997.4±140.3
2nd Top 16 | 1st Top 16 | 42.6±0.7 | 14.2±0.4 | 12.4±0.3 | 5367.6±63.4
最后为了验证本文提出的软启动方法的实验效果,我们比较了DMC方法、已有策略预训练的方法(基于策略启动的DMC)以及本文提出的采用软启动的SDMC方法对抗1st Top 16时的胜率与平均每小局净胜分曲线,如图3所示,其纵坐标分别为对抗的胜率与平均每小局净胜分,横坐标为训练所用的时间步. 对于每种方法,每次测试记录200局与1st Top 16的胜率与每小局净胜分结果,共测试3次,图3绘制了测试的平均值与标准差. 其中基于策略启动的DMC采取与DMC和SDMC相同的神经网络架构与训练方式,不同之处在于其每次自博弈生成轨迹时依据已有策略进行 \epsilon -贪心采样,并以此训练模型. 基于策略启动的DMC所采用的已有策略与SDMC相同,均为1st Champion.
从图3(a)中可以看到SDMC相较于DMC在训练初始阶段取得了较高的胜率提升速度,如在胜率达到60%时,SDMC需要约 2.7\times {10}^{8} 个时间步,而DMC则需要约 4\times {10}^{8} 个时间步,对于60%胜率,SDMC仅需DMC的68%训练开销. 同样地,在图3(b)中对于净胜小分,SDMC仅需 2.5\times {10}^{8} 个时间步即可达到净胜分大于0,而DMC需要约 3.8\times {10}^{8} 个时间步,SDMC降低了约 35\% 的时间需求.
同时可以发现,相比于SDMC和DMC,基于策略启动的DMC训练效果不佳,原因可能正如2.1节中讨论的:仅通过已有策略生成数据,在掼蛋这样动作空间过于庞大的环境中,训练过程无法很好地拟合未探索动作的值,从而存在过估计问题. 而DMC和SDMC由于存在通过神经网络进行决策的步骤,当出现高评估值的动作时会倾向于选择该动作,从而可以有效地验证并修正其评估值,得到更准确的评估.
4. 总 结
本文提出了一种针对掼蛋扑克博弈的软深度蒙特卡洛SDMC方法. SDMC方法在学习过程中不仅采用了软启动方法,结合已有策略,加速模型训练过程,同时采取软动作采样,在实际对战过程中,保证选择的策略在当前模型下的评估值变化不大的情况下对动作进行采样,降低训练过程中方差带来的影响,并增加被对手利用的难度. 在掼蛋环境下的实验表明,本文所提方法SDMC相较于现有方法有着更高的对战胜率与净胜得分. 之后,拟从软动作采样的角度出发,依据现有模型的动作评估值,结合子博弈求解方法提升在实战环境下的策略强度,致力于得到在团体对战情况下的团队最大最小均衡等博弈论角度下的最优策略,最终实现在掼蛋等扑克博弈环境下战胜人类的职业选手.
作者贡献声明:葛振兴提出选题、研究内容,设计技术方案,撰写论文;向帅设计技术方案,采集和整理实验数据,修订论文;田品卓设计技术方案,提出指导性建议;高阳提出指导性建议,指导论文写作.
-
表 1 资源隔离工具汇总
Table 1 Summary of Resource Isolation Tools
资源隔离机制 | 资源隔离工具
CPU隔离 | Linux CPUset Cgroups, taskset
内存隔离 | Linux Memory Cgroups
网络隔离 | Linux qdisc
磁盘带宽隔离 | Linux blkio Cgroups
CPU功耗 | DVFS, ACPI
末级缓存隔离 | CAT
内存带宽隔离 | MBA
表 2 特征分析研究工作总结
Table 2 Summary of Feature Analysis Research Work
表 3 干扰检测研究工作总结
Table 3 Summary of Interference Detection Research Work
表 4 单节点资源分配研究工作总结
Table 4 Summary of Node Resource Allocation Research Work
研究工作 | 场景 | 搜索方法 | 资源维度(CPU/内存/网络带宽/磁盘I/O/功耗/LLC/内存带宽)
Heracles[14] | 单LS+多BE | 梯度下降 | √ √ √ √
PARTIES[9] | 多LS+多BE | 梯度下降 | √ √ √ √ √ √
HCloud[83] | 公有云 | 基于规则 | √ √
PerfIso[69] | 单LS+多BE | 基于规则 | √ √
DRF[95] | 多应用 | 基于规则 | √ √
CoPart[34] | 单LS+多BE | 启发式算法 | √ √
Avalon[99] | 单LS+多BE | 启发式算法 | √ √
PRCP[100] | 无服务计算链 | 启发式算法 | √
Whirlpool[96] | 多BE | 启发式算法 | √
KPart[98] | 多BE | 启发式算法 | √
UCP[97] | 多BE | 启发式算法 | √
CLITE[61] | 多LS+多BE | 贝叶斯优化 | √ √ √
SATORI[101] | 多BE | 贝叶斯优化 | √ √ √
DRLPart[102] | 多BE | 深度强化学习 | √ √ √
Hipster[103] | 单LS+多BE | 启发式+强化学习 | √ √
CuttleSys[104] | 单LS+多BE | 数据挖掘 | √ √
Sinan[71] | 微服务链 | 机器学习 | √
FIRM[60] | 无服务计算链 | 机器学习 | √ √ √ √ √
注:"√"表示具备该维度.
表 5 集群作业调度研究工作总结
Table 5 Summary of Cluster Job Scheduling Research Work
-
[1] Dean J, Barroso L A. The tail at scale[J]. Communications of the ACM, 2013, 56(2): 74−80 doi: 10.1145/2408776.2408794
[2] Barroso L A, Hölzle U. The datacenter as a computer: An introduction to the design of warehouse-scale machines[J]. Synthesis Lectures on Computer Architecture, 2009, 4(1): 1−108
[3] Delimitrou C, Kozyrakis C. Quasar: Resource-efficient and QoS-aware cluster management[J]. ACM SIGPLAN Notices, 2014, 49(4): 127−144 doi: 10.1145/2644865.2541941
[4] Verma A, Pedrosa L, Korupolu M, et al. Large-scale cluster management at Google with Borg [C/OL] //Proc of the 10th European Conf on Computer Systems. New York: ACM, 2015[2023-01-11].https://dl.acm.org/doi/10.1145/2741948.2741964
[5] Guo Jing, Chang Zihao, Wang Sa, et al. Who limits the resource efficiency of my datacenter: An analysis of Alibaba datacenter traces [C/OL] //Proc of the 27th IEEE/ACM Int Symp on Quality of Service. Piscataway, NJ: IEEE, 2019[2023-01-11].https://ieeexplore.ieee.org/document/9068614
[6] Reiss C, Tumanov A, Ganger G R, et al. Heterogeneity and dynamicity of clouds at scale: Google trace analysis [C/OL] //Proc of the 3rd ACM Symp on Cloud Computing. New York: ACM, 2012[2023-01-11].https://dl.acm.org/doi/10.1145/2391229.2391236
[7] Tirmazi M, Barker A, Deng N, et al. Borg: The next generation [C/OL] //Proc of the 15th European Conf on Computer Systems. New York: ACM, 2020[2023-01-11]. doi: 10.1145/3342195.3387517
[8] Nathuji R, Kansal A, Ghaffarkhah A. Q-clouds: Managing performance interference effects for QoS-aware clouds [C] //Proc of the 5th European Conf on Computer Systems. New York: ACM, 2010: 237−250
[9] Chen Shuang, Delimitrou C, Martínez J F. PARTIES: QoS-aware resource partitioning for multiple interactive services [C] //Proc of the 24th Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2019: 107−120
[10] Zhuravlev S, Blagodurov S, Fedorova A. Addressing shared resource contention in multicore processors via scheduling[J]. ACM SIGPLAN Notices, 2010, 45(3): 129−142 doi: 10.1145/1735971.1736036
[11] 徐志伟,李春典. 低熵云计算系统[J]. 中国科学:信息科学,2017,47:1149−1163 Xu Zhiwei, Li Chundian. Low-entropy cloud computing systems[J]. SCIENTIA SINICA Informationis, 2017, 47: 1149−1163 (in Chinese)
[12] Chandra D, Guo Fei, Kim S, et al. Predicting inter-thread cache contention on a chip multi-processor architecture [C] // Proc of the 11th Int Symp on High-Performance Computer Architecture. Piscataway, NJ: IEEE, 2005: 340−351
[13] Tang Lingjia, Mars J, Vachharajani N, et al. The impact of memory subsystem resource sharing on datacenter applications [C] // Proc of the 38th Annual Int Symp on Computer Architecture (ISCA). New York: ACM, 2011: 283−294
[14] Lo D, Cheng Liqun, Govindaraju R, et al. Heracles: Improving resource efficiency at scale [C] //Proc of the 42nd Annual Int Symp on Computer Architecture. New York: ACM, 2015: 450−462
[15] Linden G. Marissa Mayer at Web 2.0 [EB/OL]. 2006 [2022-03-29]. http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html
[16] Schurman E, Brutlag J. The user and business impact of server delays, additional bytes, and http chunking in web search [EB/OL]. 2009[2023-01-11]. https://vdocuments.mx/the-user-and-business-impact-of-server-delays-additional-bytes-and-http-chunking.html
[17] Einav Y. Amazon found every 100ms of latency cost them 1% in sales [EB/OL]. 2019[2022-03-29].https://www.gigaspaces.com/blog/amazon-found-every-100ms-of-latency-cost-them-1-in-sales
[18] Haque M E, He Yuxiong, Elnikety S, et al. Exploiting heterogeneity for tail latency and energy efficiency [C] //Proc of the 50th Annual IEEE/ACM Int Symp on Microarchitecture. New York: ACM, 2017: 625−638
[19] 张鲁飞,陈左宁. 虚拟集群上面向功耗的形式化的VM调度策略[J]. 计算机科学,2014,41(8):38−41 Zhang Lufei, Chen Zuoning. Power-efficient formal scheduling policy of VMs in virtualized clusters[J]. Computer Science, 2014, 41(8): 38−41 (in Chinese)
[20] Leverich J, Kozyrakis C. Reconciling high server utilization and sub-millisecond quality-of-service [C/OL] //Proc of the 9th European Conf on Computer Systems. New York: ACM, 2014[2023-01-11].https://dl.acm.org/doi/10.1145/2592798.2592821
[21] Lin Jiang, Lu Qingda, Ding Xiaoning, et al. Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems [C] //Proc of the 14th Int Symp on High Performance Computer Architecture. Piscataway, NJ: IEEE, 2008: 367−378
[22] Sherwood T, Calder B, Emer J. Reducing cache misses using hardware and software page placement [C] //Proc of the 13th Int Conf on Supercomputing. New York: ACM, 1999: 155−164
[23] Ye Ying, West R, Cheng Zhuoqun, et al. Coloris: A dynamic cache partitioning system using page coloring [C] // Proc of the 23rd Int Conf on Parallel Architecture and Compilation Techniques (PACT). Piscataway, NJ: IEEE, 2014: 381−392
[24] 邱杰凡,华宗汉,范菁,等. 内存体系划分技术的研究与发展[J]. 软件学报,2022,33(2):751−769 doi: 10.13328/j.cnki.jos.006370 Qiu Jiefan, Hua Zonghan, Fan Jing, et al. Evolution of memory partitioning technologies: Case study through page coloring[J]. Journal of Software, 2022, 33(2): 751−769 (in Chinese) doi: 10.13328/j.cnki.jos.006370
[25] Albonesi D H. Selective cache ways: On-demand cache resource allocation [C] // Proc of the 32nd Annual ACM/IEEE Int Symp on Microarchitecture. Piscataway, NJ: IEEE, 1999: 248−259
[26] Balasubramonian R, Albonesi D, Buyuktosunoglu A, et al. Memory hierarchy reconfiguration for energy and performance in general-purpose processor architectures [C] //Proc of the 33rd Annual ACM/IEEE Int Symp on Microarchitecture. Piscataway, NJ: IEEE, 2000: 245−257
[27] Chiou D, Jain P, Rudolph L, et al. Application-specific memory management for embedded systems using software-controlled caches [C] //Proc of the 37th Annual Design Automation Conf. New York: ACM, 2000: 416−419
[28] Ranganathan P, Adve S, Jouppi N P. Reconfigurable caches and their application to media processing[J]. ACM SIGARCH Computer Architecture News, 2000, 28(2): 214−224 doi: 10.1145/342001.339685
[29] Liu Fang, Jiang Xiaowei, Solihin Y. Understanding how off-chip memory bandwidth partitioning in chip multiprocessors affects system performance [C/OL] //Proc of the 16th Int Symp on High-Performance Computer Architecture. Piscataway, NJ: IEEE, 2010[2023-01-11].https://ieeexplore.ieee.org/document/5416655/
[30] Yun H, Yao Gang, Pellizzoni R, et al. Memguard: Memory bandwidth reservation system for efficient performance isolation in multi-core platforms [C] //Proc of the 19th Real-Time and Embedded Technology and Applications Symp. Piscataway, NJ: IEEE, 2013: 55−64
[31] Iyer R, Zhao Li, Guo Fei, et al. QoS policies and architecture for cache/memory in CMP platforms[J]. ACM SIGMETRICS Performance Evaluation Review, 2007, 35(1): 25−36 doi: 10.1145/1269899.1254886
[32] Herdrich A, Illikkal R, Iyer R, et al. Rate-based QoS techniques for cache/memory in CMP platforms [C] //Proc of the 23rd Int Conf on Supercomputing. New York: ACM, 2009: 479−488
[33] Shahrad M, Balkind J, Wentzlaff D. Architectural implications of function-as-a-service computing [C] //Proc of the 52nd Annual IEEE/ACM Int Symp on Microarchitecture. New York: ACM, 2019: 1063−1075
[34] Park J, Park S, Baek W. CoPart: Coordinated partitioning of last-level cache and memory bandwidth for fairness-aware workload consolidation on commodity servers [C/OL] //Proc of the 14th EuroSys Conf. New York: ACM, 2019[2023-01-11].https://dl.acm.org/doi/10.1145/3302424.3303963
[35] Bashir N, Deng Nan, Rzadca K, et al. Take it to the limit: Peak prediction-driven resource overcommitment in datacenters [C] //Proc of the 16th European Conf on Computer Systems. New York: ACM, 2021: 556−573
[36] Lagar-Cavilla A, Ahn J, Souhlal S, et al. Software-defined far memory in warehouse-scale computers [C] //Proc of the 24th Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2019: 317−330
[37] Waldspurger C A. Memory resource management in VMware ESX server[J]. ACM SIGOPS Operating Systems Review, 2002, 36(SI): 181−194 doi: 10.1145/844128.844146
[38] Gu Juncheng, Lee Y, Zhang Yiwen, et al. Efficient memory disaggregation with infiniswap [C] //Proc of the 14th USENIX Symp on Networked Systems Design and Implementation. Berkeley, CA: USENIX Association, 2017: 649−667
[39] Liang Shuang, Noronha R, Panda D K. Swapping to remote memory over infiniband: An approach using a high performance network block device [C/OL] //Proc of the 7th IEEE Int Conf on Cluster Computing. Piscataway, NJ: IEEE, 2005[2023-01-11].https://ieeexplore.ieee.org/document/4154093
[40] Brown M A. Traffic control howto [EB/OL]. 2006 [2022-03-29]. http://linux-ip.net/articles/Traffic-Control-HOWTO/
[41] Hong Chiyao, Caesar M, Godfrey P B. Finishing flows quickly with preemptive scheduling[J]. ACM SIGCOMM Computer Communication Review, 2012, 42(4): 127−138 doi: 10.1145/2377677.2377710
[42] Hu Shuihai, Bai Wei, Chen Kai, et al. Providing bandwidth guarantees, work conservation and low latency simultaneously in the cloud[J]. IEEE Transactions on Cloud Computing, 2018, 9(2): 763−776
[43] Grosvenor M P, Schwarzkopf M, Gog I, et al. Queues don’t matter when you can JUMP Them! [C/OL] //Proc of the 12th USENIX Symp on Networked Systems Design and Implementation. Berkeley, CA: USENIX Association, 2015[2023-01-11].https://dl.acm.org/doi/abs/10.5555/2789770.2789771
[44] Perry J, Ousterhout A, Balakrishnan H, et al. Fastpass: A centralized "zero-queue" datacenter network [C] //Proc of the 28th ACM Conf on SIGCOMM. New York: ACM, 2014: 307−318
[45] Nagaraj K, Bharadia D, Mao Hongzi, et al. Numfabric: Fast and flexible bandwidth allocation in datacenters [C] //Proc of the 30th ACM SIGCOMM Conf. New York: ACM, 2016: 188−201
[46] Wang Shuai, Gao Kaihui, Qian Kun, et al. Predictable vFabric on informative data plane [C] //Proc of the 36th ACM SIGCOMM Conf. New York: ACM, 2022: 615−632
[47] Ma Jiuyue, Sui Xiufeng, Sun Ninghui, et al. Supporting differentiated services in computers via programmable architecture for resourcing-on-demand (PARD) [C] //Proc of the 20th Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2015: 131−143
[48] Nanda S, Chiueh T. A survey on virtualization technologies [EB/OL]. 2011 [2023-01-11].https://rtcl.eecs.umich.edu/papers/publications/2011/TR179.pdf
[49] Jonas E, Schleier-Smith J, Sreekanti V, et al. Cloud programming simplified: A Berkeley view on serverless computing [J]. arXiv preprint, arXiv: 1902. 03383, 2019
[50] Armbrust M, Fox A, Griffith R, et al. Above the clouds: A Berkeley view of cloud computing, UCB/EECS-2009-28[R]. Berkeley, CA: University of California, 2009
[51] Google Inc. Borg cluster workload traces [EB/OL]. 2019 [2022-03-20]. https://github.com/google/cluster-data
[52] Alibaba Group. Alibaba cluster trace program [EB/OL]. 2021 [2022-03-29]. https://github.com/alibaba/clusterdata
[53] Microsoft. Azure public dataset [EB/OL]. 2020 [2022-03-29]. https://github.com/Azure/AzurePublicDataset
[54] Chen Tianshi, Guo Qi, Temam O, et al. Statistical performance comparisons of computers[J]. IEEE Transactions on Computers, 2014, 64(5): 1442−1455
[55] Krushevskaja D, Sandler M. Understanding latency variations of black box services [C] //Proc of the 22nd Int Conf on World Wide Web. New York: ACM, 2013: 703−714
[56] Ravindranath L, Padhye J, Mahajan R, et al. Timecard: Controlling user-perceived delays in server-based mobile applications [C] //Proc of the 24th ACM Symp on Operating Systems Principles. New York: ACM, 2013: 85−100
[57] Ravindranath L, Padhye J, Agarwal S, et al. AppInsight: Mobile App performance monitoring in the wild [C] //Proc of the 10th USENIX Symp on Operating Systems Design and Implementation. Berkeley, CA: USENIX Association, 2012: 107−120
[58] Amvrosiadis G, Park J W, Ganger G R, et al. On the diversity of cluster workloads and its impact on research results [C] //Proc of the 23rd USENIX Annual Technical Conf. Berkeley, CA: USENIX Association, 2018: 533−546
[59] Delimitrou C, Kozyrakis C. iBench: Quantifying interference for datacenter applications [C] //Proc of the 9th IEEE Int Symp on Workload Characterization. Piscataway, NJ: IEEE, 2013: 23−33
[60] Qiu Haoran, Banerjee S S, Jha S, et al. FIRM: An intelligent fine-grained resource management framework for SLO-oriented microservices [C] //Proc of the 14th USENIX Symp on Operating Systems Design and Implementation. Berkeley, CA: USENIX Association, 2020: 805−825
[61] Patel T, Tiwari D. CLITE: Efficient and QoS-aware co-location of multiple latency-critical jobs for warehouse scale computers [C] // Proc of the 26th IEEE Int Symp on High Performance Computer Architecture. Piscataway, NJ: IEEE, 2020: 193−206
[62] Mars J, Tang Lingjia, Skadron K, et al. Increasing utilization in modern warehouse-scale computers using bubble-up[J]. IEEE Micro, 2012, 32(3): 88−99 doi: 10.1109/MM.2012.22
[63] Yang Hailong, Breslow A, Mars J, et al. Bubble-flux: Precise online QoS management for increased utilization in warehouse scale computers[J]. ACM SIGARCH Computer Architecture News, 2013, 41(3): 607−618 doi: 10.1145/2508148.2485974
[64] Chen Quan, Yang Hailong, Guo Minyi, et al. Prophet: Precise QoS prediction on non-preemptive accelerators to improve utilization in warehouse-scale computers [C] //Proc of the 22nd Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2017: 17−32
[65] Zhang Yunqi, Prekas G, Fumarola G M, et al. History-based harvesting of spare cycles and storage in large-scale datacenters [C] //Proc of the 12th USENIX Symp on Operating Systems Design and Implementation. Berkeley, CA: USENIX Association, 2016: 755−770
[66] Yang Yanan, Zhao Laiping, Li Yiming, et al. INFless: A native serverless system for low-latency, high-throughput inference [C] //Proc of the 27th ACM Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2022: 768−781
[67] Shahrad M, Fonseca R, Goiri Í, et al. Serverless in the wild: Characterizing and optimizing the serverless workload at a large cloud provider [C] //Proc of the 25th USENIX Annual Technical Conf. Berkeley, CA: USENIX Association, 2020: 205−218
[68] Cortez E, Bonde A, Muzio A, et al. Resource central: Understanding and predicting workloads for improved resource management in large cloud platforms [C] //Proc of the 26th Symp on Operating Systems Principles. New York: ACM, 2017: 153−167
[69] Iorgulescu C, Azimi R, Kwon Y, et al. PerfIso: Performance isolation for commercial latency-sensitive services [C] //Proc of the 23rd USENIX Annual Technical Conf. Berkeley, CA: USENIX Association, 2018: 519−532
[70] Luo Shutian, Xu Huanle, Lu Chengzhi, et al. Characterizing microservice dependency and performance: Alibaba trace analysis [C] //Proc of the 12th ACM Symp on Cloud Computing. New York: ACM, 2021: 412−426
[71] Zhang Yanqi, Hua Weizhe, Zhou Zhuangzhuang, et al. Sinan: ML-based and QoS-aware resource management for cloud microservices [C] //Proc of the 26th ACM Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2021: 167−181
[72] Gan Yu, Zhang Yanqi, Cheng Dailun, et al. An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems [C] //Proc of the 24th Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2019: 3−18
[73] Gan Yu, Zhang Yanqi, Hu K, et al. Seer: Leveraging big data to navigate the complexity of performance debugging in cloud microservices [C] //Proc of the 24th Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2019: 19−33
[74] Tian Huangshi, Zheng Yunchuan, Wang Wei. Characterizing and synthesizing task dependencies of data-parallel jobs in Alibaba cloud [C] //Proc of the 10th ACM Symp on Cloud Computing. New York: ACM, 2019: 139−151
[75] Sriraman A, Dhanotia A, Wenisch T F. SoftSKU: Optimizing server architectures for microservice diversity @scale [C] //Proc of the 46th Int Symp on Computer Architecture. New York: ACM, 2019: 513−526
[76] Delimitrou C, Kozyrakis C. Paragon: QoS-aware scheduling for heterogeneous datacenters[J]. ACM SIGPLAN Notices, 2013, 48(4): 77−88 doi: 10.1145/2499368.2451125
[77] Shan Yizhou, Huang Yutong, Chen Yilun, et al. LegoOS: A disseminated, distributed OS for hardware resource disaggregation [C] //Proc of the 13th USENIX Symp on Operating Systems Design and Implementation. Berkeley, CA: USENIX Association, 2018: 69−87
[78] Lu Chengzhi, Ye Kejiang, Xu Guoyao, et al. Imbalance in the cloud: An analysis on Alibaba cluster trace [C] //Proc of the 5th IEEE Int Conf on Big Data. Piscataway, NJ: IEEE, 2017: 2884−2892
[79] 王康瑾,贾统,李影. 在离线混部作业调度与资源管理技术研究综述[J]. 软件学报,2020,31(10):3100−3119
Wang Kangjin, Jia Tong, Li Ying. State-of-the-art survey of scheduling and resource management technology for colocation jobs[J]. Journal of Software, 2020, 31(10): 3100−3119 (in Chinese)
[80] Liu Qixiao, Yu Zhibin. The elasticity and plasticity in semi-containerized co-locating cloud workload: A view from Alibaba trace [C] //Proc of the 9th ACM Symp on Cloud Computing. New York: ACM, 2018: 347−360
[81] Zhao Laiping, Yang Yanan, Zhang Kaixuan, et al. Rhythm: Component-distinguishable workload deployment in datacenters [C/OL] //Proc of the 15th European Conf on Computer Systems. New York: ACM, 2020 [2023-01-11]. https://dl.acm.org/doi/abs/10.1145/3342195.3387534
[82] Wang Liang, Li Mengyuan, Zhang Yinqian, et al. Peeking behind the curtains of serverless platforms [C] //Proc of the 23rd USENIX Annual Technical Conf. Berkeley, CA: USENIX Association, 2018: 133−146
[83] Delimitrou C, Kozyrakis C. HCloud: Resource-efficient provisioning in shared cloud systems [C] //Proc of the 21st Int Conf on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2016: 473−488
[84] Zhang Xiao, Tune E, Hagmann R, et al. CPI2: CPU performance isolation for shared compute clusters [C] //Proc of the 8th ACM European Conf on Computer Systems. New York: ACM, 2013: 379−391
[85] Ousterhout A, Fried J, Behrens J, et al. Shenango: Achieving high CPU efficiency for latency-sensitive datacenter workloads [C] //Proc of the 16th USENIX Symp on Networked Systems Design and Implementation. Berkeley, CA: USENIX Association, 2019: 361−378
[86] Fried J, Ruan Zhenyuan, Ousterhout A, et al. Caladan: Mitigating interference at microsecond timescales [C] //Proc of the 14th USENIX Symp on Operating Systems Design and Implementation. Berkeley, CA: USENIX Association, 2020: 281−297
[87] Chen Quan, Xue Shuai, Zhao Shang, et al. Alita: Comprehensive performance isolation through bias resource management for public clouds [C/OL] //Proc of the 33rd Int Conf for High Performance Computing, Networking, Storage and Analysis. Piscataway, NJ: IEEE, 2020 [2023-01-11]. https://ieeexplore.ieee.org/document/9355282
[88] Novaković D, Vasić N, Novaković S, et al. DeepDive: Transparently identifying and managing performance interference in virtualized environments [C] //Proc of the 18th USENIX Annual Technical Conf. Berkeley, CA: USENIX Association, 2013: 219−230
[89] Li Yusen, Shan Chuxu, Chen Ruobing, et al. GAugur: Quantifying performance interference of colocated games for improving resource utilization in cloud gaming [C] //Proc of the 28th Int Symp on High-Performance Parallel and Distributed Computing. New York: ACM, 2019: 231−242
[90] Li Zijun, Chen Quan, Xue Shuai, et al. Amoeba: QoS-awareness and reduced resource usage of microservices with serverless computing [C] //Proc of the 34th IEEE Int Parallel and Distributed Processing Symp. Piscataway, NJ: IEEE, 2020: 399−408
[91] Zhao Jiacheng, Cui Huimin, Xue Jingling, et al. Predicting cross-core performance interference on multicore processors with regression analysis[J]. IEEE Transactions on Parallel and Distributed Systems, 2015, 27(5): 1443−1456
[92] Wang Sa, Zhu Yanhai, Chen Shanpei, et al. A case for adaptive resource management in Alibaba datacenter using neural networks[J]. Journal of Computer Science and Technology, 2020, 35(1): 209−220 doi: 10.1007/s11390-020-9732-x
[93] 李杰,张静,李伟东,等. 一种基于共享公平和时变资源需求的公平分配策略[J]. 计算机研究与发展,2019,56(7):1534−1544
Li Jie, Zhang Jing, Li Weidong, et al. A fair distribution strategy based on shared fair and time-varying resource demand[J]. Journal of Computer Research and Development, 2019, 56(7): 1534−1544 (in Chinese)
[94] 王金海,黄传河,王晶,等. 异构云计算体系结构及其多资源联合公平分配策略[J]. 计算机研究与发展,2015,52(6):1288−1302
Wang Jinhai, Huang Chuanhe, Wang Jing, et al. A heterogeneous cloud computing architecture and multi-resource-joint fairness allocation strategy[J]. Journal of Computer Research and Development, 2015, 52(6): 1288−1302 (in Chinese)
[95] Ghodsi A, Zaharia M, Hindman B, et al. Dominant resource fairness: Fair allocation of multiple resource types [C] //Proc of the 8th USENIX Symp on Networked Systems Design and Implementation. Berkeley, CA: USENIX Association, 2011: 323–336
[96] Mukkara A, Beckmann N, Sanchez D. Whirlpool: Improving dynamic cache management with static data classification[J]. ACM SIGARCH Computer Architecture News, 2016, 44(2): 113−127 doi: 10.1145/2980024.2872363
[97] Qureshi M K, Patt Y N. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches [C] //Proc of the 39th Annual IEEE/ACM Int Symp on Microarchitecture. Piscataway, NJ: IEEE, 2006: 423−432
[98] El-Sayed N, Mukkara A, Tsai P A, et al. KPart: A hybrid cache partitioning-sharing technique for commodity multicores [C] //Proc of the 24th IEEE Int Symp on High Performance Computer Architecture. Piscataway, NJ: IEEE, 2018: 104−117
[99] Chen Quan, Wang Zhenning, Leng Jingwen, et al. Avalon: Towards QoS awareness and improved utilization through multi-resource management in datacenters [C] //Proc of the 33rd ACM Int Conf on Supercomputing. New York: ACM, 2019: 272−283
[100] Lin Changyuan, Khazaei H. Modeling and optimization of performance and cost of serverless applications[J]. IEEE Transactions on Parallel and Distributed Systems, 2020, 32(3): 615−632
[101] Roy R B, Patel T, Tiwari D. SATORI: Efficient and fair resource partitioning by sacrificing short-term benefits for long-term gains [C] //Proc of the 48th ACM/IEEE Annual Int Symp on Computer Architecture. Piscataway, NJ: IEEE, 2021: 292−305
[102] Chen Ruobing, Wu Jinping, Shi Haosen, et al. DRLPart: A deep reinforcement learning framework for optimally efficient and robust resource partitioning on commodity servers [C] //Proc of the 30th Int Symp on High-Performance Parallel and Distributed Computing. New York: ACM, 2021: 175−188
[103] Nishtala R, Carpenter P, Petrucci V, et al. Hipster: Hybrid task manager for latency-critical cloud workloads [C] //Proc of the 23rd IEEE Int Symp on High Performance Computer Architecture. Piscataway, NJ: IEEE, 2017: 409−420
[104] Kulkarni N, Gonzalez-Pumariega G, Khurana A, et al. CuttleSys: Data-driven resource management for interactive services on reconfigurable multicores [C] //Proc of the 53rd Annual IEEE/ACM Int Symp on Microarchitecture. Piscataway, NJ: IEEE, 2020: 650−664
[105] Zhou Hao, Chen Ming, Lin Qian, et al. Overload control for scaling Wechat microservices [C] //Proc of the 9th ACM Symp on Cloud Computing. New York: ACM, 2018: 149−161
[106] Grandl R, Chowdhury M, Akella A, et al. Altruistic scheduling in multi-resource clusters [C] //Proc of the 12th USENIX Symp on Operating Systems Design and Implementation. Berkeley, CA: USENIX Association, 2016: 65−80
[107] 李青,李勇,涂碧波,等. QoS保证的数据中心动态资源供应方法[J]. 计算机学报,2014,37(12):2395−2407
Li Qing, Li Yong, Tu Bibo, et al. QoS-guaranteed dynamic resource provision in Internet data centers[J]. Chinese Journal of Computers, 2014, 37(12): 2395−2407 (in Chinese)
[108] Romero F, Delimitrou C. Mage: Online and interference-aware scheduling for multi-scale heterogeneous systems [C/OL] //Proc of the 27th Int Conf on Parallel Architectures and Compilation Techniques. New York: ACM, 2018 [2023-01-11]. https://dl.acm.org/doi/10.1145/3243176.3243183
[109] Zhao Laiping, Yang Yanan, Li Yiming, et al. Understanding, predicting and scheduling serverless workloads under partial interference [C/OL] //Proc of the 34th Int Conf for High Performance Computing, Networking, Storage and Analysis. New York: ACM, 2021 [2023-01-11]. https://ieeexplore.ieee.org/document/9910093
[110] Xu Ran, Mitra S, Rahman J, et al. Pythia: Improving datacenter utilization via precise contention prediction for multiple co-located workloads [C] //Proc of the 19th Int Middleware Conf. New York: ACM, 2018: 146−160
[111] Agache A, Brooker M, Iordache A, et al. Firecracker: Lightweight virtualization for serverless applications [C] //Proc of the 17th USENIX Symp on Networked Systems Design and Implementation. Berkeley, CA: USENIX Association, 2020: 419−434
[112] Ferdman M, Adileh A, Kocberber O, et al. Clearing the clouds: A study of emerging scale-out workloads on modern hardware[J]. ACM SIGPLAN Notices, 2012, 47(4): 37−48 doi: 10.1145/2248487.2150982
[113] Kaffes K, Yadwadkar N J, Kozyrakis C. Centralized core-granular scheduling for serverless functions [C] // Proc of the 10th ACM Symp on Cloud Computing. New York: ACM, 2019: 158−164