ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2021, Vol. 58, Issue (3): 497-512. doi: 10.7544/issn1000-1239.2021.20200501

Scalability for Monolithic Schedulers of Cluster Resource Management Framework

Mao Anqi, Tang Xiaochun, Ding Zhao, Li Zhanhuai   

  1. (School of Computer Science, Northwestern Polytechnical University, Xi'an 710129) (Key Laboratory of Big Data Storage and Management (Northwestern Polytechnical University), Ministry of Industry and Information Technology, Xi'an 710129)
  • Online: 2021-03-01
  • Supported by: 
    This work was supported by the National Key Research and Development Program of China (2018YFB1003400).

Abstract: Monolithic cluster resource management systems are widely used in practice because they keep the global resource state consistent and can apply multiple scheduling models. However, a monolithic resource manager does not perform as expected in large clusters, because a single node maintains the global resource state. When that node receives and processes large volumes of periodic heartbeat messages, its load rises sharply and it becomes a scalability bottleneck. To address this problem, this paper proposes the idea of “no change, no update” to replace the resource manager's periodic update mechanism. The paper makes three contributions. First, we introduce a differential heartbeat processing model on the computing nodes: when a node's resource status has not changed, it sends no message to the resource manager, which reduces both the size and the number of messages. Second, we propose a ring monitoring model among the computing nodes, which transfers the periodic monitoring load from the resource manager to the computing nodes. Finally, we implement both models on YARN. Experiments show that, for a cluster of 10 000 nodes with a heartbeat interval of 3 s, YARN with our models improves heartbeat processing efficiency and resource update efficiency by about 40%. In addition, the improved YARN can manage a cluster more than 1.88 times the size of the one the original YARN can manage.
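
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's YARN implementation: the abstract does not give the message format or how the ring is built, so `ResourceState`, `DifferentialReporter`, `ring_successor`, and the rule that each node monitors the next node in a fixed ordering are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ResourceState:
    """Hypothetical snapshot of one computing node's free resources."""
    free_vcores: int
    free_memory_mb: int


class DifferentialReporter:
    """Emits a message only when the state differs from the last one sent."""

    def __init__(self) -> None:
        self._last_sent: Optional[ResourceState] = None

    def next_message(self, current: ResourceState) -> Optional[ResourceState]:
        # "No change, no update": suppress the heartbeat if nothing changed.
        if current == self._last_sent:
            return None
        self._last_sent = current
        return current


def ring_successor(node_ids: Sequence[str], node_id: str) -> str:
    """Return the node that `node_id` monitors, assuming a fixed ring order."""
    i = node_ids.index(node_id)
    # The last node wraps around to the first, closing the ring.
    return node_ids[(i + 1) % len(node_ids)]
```

In this sketch, a node whose state is unchanged between two polls produces no message at all, and liveness checks run between neighbouring nodes in the ring, so neither case places periodic load on the resource manager.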

Key words: monolithic schedulers, scalability, heartbeat message, differential, ring monitoring