Abstract:
Long-tail latency is inevitable and may be amplified in highly modular datacenter applications such as Bing, Facebook, and Amazon's retail platform, due to resource sharing, queuing, background maintenance activities, and similar factors. Tolerating latency variability in shared environments is therefore crucial in datacenters. This paper proposes a distributed deadline propagation (D²P) approach for datacenter applications to reduce long-tail latency. The key idea of D²P is inspired by the traffic light system in Manhattan, New York City, where one can enjoy a chain of green lights after stopping at one red light: D²P allows local nodes to perceive global deadline information and propagates that information among distributed nodes. Local nodes can then use the information to schedule requests and adjust processing speed to reduce long-tail latency. We propose a stage-service model and a parallel-unit model to describe the sequential/dependent pattern and the partition/aggregate pattern, respectively, and implement a distributed deadline propagation framework. Finally, building on this framework, we use a D²P-enabled deadline-aware scheduling algorithm in our experiments. The preliminary results show that D²P has the potential to reduce long-tail latency in datacenters by letting local nodes leverage the propagated deadline information.
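
The propagation idea can be illustrated with a minimal sketch. It is not the framework described in this paper: the class names `Request` and `DeadlineAwareStage`, the millisecond integer clock, the fixed per-request service time, and the choice of earliest-deadline-first as the local scheduling policy are all assumptions made for illustration. Each request carries its absolute end-to-end deadline, each stage serves queued requests in deadline order, and the deadline is forwarded unchanged to the next stage so that downstream nodes see the same global information.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    # Hypothetical request record; ordering uses (deadline_ms, seq) only.
    deadline_ms: int                  # absolute end-to-end deadline, carried with the request
    seq: int                          # arrival order, used only to break ties
    name: str = field(compare=False)


class DeadlineAwareStage:
    """One node in a sequential pipeline (illustrative, not the paper's API)."""

    def __init__(self, service_ms, downstream=None):
        self.service_ms = service_ms  # assumed fixed service time per request
        self.downstream = downstream  # next stage, or None for the last stage
        self._heap = []
        self._seq = itertools.count()

    def submit(self, name, deadline_ms):
        heapq.heappush(self._heap, Request(deadline_ms, next(self._seq), name))

    def drain(self, now_ms):
        """Serve every queued request, earliest global deadline first.

        Returns a list of (name, finish_ms, slack_ms) tuples, where
        slack_ms is the time left before the global deadline.
        """
        served = []
        while self._heap:
            req = heapq.heappop(self._heap)
            now_ms += self.service_ms
            served.append((req.name, now_ms, req.deadline_ms - now_ms))
            if self.downstream is not None:
                # Propagate the global deadline unchanged to the next node.
                self.downstream.submit(req.name, req.deadline_ms)
        return served


backend = DeadlineAwareStage(service_ms=20)
frontend = DeadlineAwareStage(service_ms=10, downstream=backend)
frontend.submit("a", 300)
frontend.submit("b", 100)
frontend.submit("c", 200)
front = frontend.drain(now_ms=0)
back = backend.drain(now_ms=30)
```

In this toy run both stages serve the requests in the order b, c, a, because the backend receives the same deadlines the frontend saw. A real deployment would also need synchronized clocks or relative time budgets, which this sketch does not model.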