ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2017, Vol. 54 ›› Issue (7): 1617-1628. doi: 10.7544/issn1000-1239.2017.20160247

• Network Technology •



  1. 1(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190);2(University of Chinese Academy of Sciences, Beijing 100049)
  • Published: 2017-07-01

A Distributed Deadline Propagation Approach to Reduce Long-Tail in Datacenters

Ren Rui¹,², Ma Jiuyue¹,², Sui Xiufeng¹, Bao Yungang¹

  1. 1(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190);2(University of Chinese Academy of Sciences, Beijing 100049)
  • Online: 2017-07-01

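Abstract (translated from the Chinese original): This paper proposes a distributed deadline propagation approach for reducing long-tail latency in datacenter environments. The approach lets the current node perceive a request's global response-time deadline and propagates the request's deadline information along its entire processing path; nodes can use this information to schedule requests or to speed up request execution, thereby reducing long-tail latency. For two kinds of datacenter applications, those following the partition/aggregate pattern and those following the sequential/dependent pattern, the paper proposes a stage-service model and a parallel-unit model, and implements a distributed deadline propagation framework based on these two models. Finally, a deadline-aware scheduling algorithm is implemented on the framework and given a simple experimental validation; the preliminary results show that distributed deadline propagation can reduce long-tail latency to some extent.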

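Keywords (translated from the Chinese original): deadline propagation, long-tail latency, datacenter, partition/aggregate pattern, sequential/dependent pattern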

Abstract: Long-tail latency is inevitable and may be amplified in highly modular datacenter applications such as Bing, Facebook, and Amazon's retail platform, owing to resource sharing, queuing, background maintenance activities, and other factors. Tolerating latency variability in shared environments is therefore crucial in datacenters. This paper proposes a distributed deadline propagation (D²P) approach that reduces long-tail latency for datacenter applications. The key idea of D²P is inspired by the traffic light system in Manhattan, New York City, where a driver can enjoy a chain of green lights after a single stop at a red light: D²P allows local nodes to perceive global deadline information and propagates that information among distributed nodes. Local nodes can use the information to schedule requests and adjust processing speed so as to reduce long-tail latency. We then propose a stage-service model and a parallel-unit model to describe the sequential/dependent and partition/aggregate patterns, and implement a distributed deadline propagation framework. Finally, on top of this framework, we apply a D²P-enabled deadline-aware scheduling algorithm in our experiments; the preliminary results show that D²P has the potential to reduce long-tail latency in datacenters when local nodes leverage the propagated deadline information.

Key words: deadline propagation, long-tail latency, datacenter, partition/aggregate pattern, sequential/dependent pattern
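
The idea summarized in the abstract, that a request carries its global deadline along the processing path and each local node uses it for scheduling, can be illustrated with a minimal sketch. The abstract does not specify the scheduling policy, so earliest-deadline-first ordering is an assumption here, and the names `Request`, `DeadlineAwareQueue`, and `fan_out` are hypothetical and do not come from the paper.

```python
import heapq
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Request:
    """A request tagged with its global (end-to-end) deadline."""
    req_id: str
    deadline: float  # absolute time by which the whole request should finish

    def slack(self, now: float) -> float:
        """Time left before the global deadline, as seen by a local node."""
        return self.deadline - now


class DeadlineAwareQueue:
    """Local node queue; assumed policy: earliest global deadline first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, Request]] = []
        self._seq = 0  # tie-breaker: FIFO order among equal deadlines

    def push(self, req: Request) -> None:
        heapq.heappush(self._heap, (req.deadline, self._seq, req))
        self._seq += 1

    def pop(self) -> Request:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def fan_out(req: Request, n_parts: int) -> List[Request]:
    """Partition step: each sub-request inherits the parent's global deadline."""
    return [Request(f"{req.req_id}.{i}", req.deadline) for i in range(n_parts)]


if __name__ == "__main__":
    q = DeadlineAwareQueue()
    q.push(Request("a", deadline=120.0))
    q.push(Request("b", deadline=80.0))
    for sub in fan_out(Request("c", deadline=100.0), 2):
        q.push(sub)
    print([q.pop().req_id for _ in range(len(q))])  # ['b', 'c.0', 'c.1', 'a']
```

Because every sub-request keeps the parent's absolute deadline, a downstream node can order its queue by global urgency without any extra coordination; a real system would also need synchronized clocks or a relative time budget, which this sketch leaves out.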