ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2018, Vol. 55, Issue (2): 438-446. doi: 10.7544/issn1000-1239.2018.20160796

Proxy Based Metadata Optimization and Implementation in Parallel Filesystem

Yi Jianliang1, Chen Zhiguang1, Xiao Nong1,2, Lu Yutong3   

  1 (College of Computer, National University of Defense Technology, Changsha 410073); 2 (State Key Laboratory of High Performance Computing (National University of Defense Technology), Changsha 410073); 3 (School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510275)
  • Online: 2018-02-01

Abstract: In high-performance computing environments, a parallel file system serves a very large number of clients. These clients often issue many concurrent I/O requests within the same period, placing the metadata server under heavy pressure. Moreover, concurrent read and write requests from these clients often target the same directory, which makes it difficult to distribute the workload across multiple servers. We therefore add a proxy server between the clients and the metadata servers and propose corresponding optimizations to reduce the metadata server workload. This paper implements two proxy-based optimizations. First, because high-performance computing programs often access files concurrently, we consider merging the numerous requests into one large request and then sending it to the metadata server. Second, concurrent I/O from high-performance computing programs often targets the same directory. Traditional metadata load-balancing mechanisms commonly use subtree partitioning to dispatch the workload across multiple servers, and this method cannot balance load when all operations relate to the same directory. The paper achieves fine-grained load balancing by scheduling operations on the same directory across multiple metadata servers.
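
The two optimizations can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the proxy assigns operations to servers, so hashing the full file path is an assumption made here for illustration, and the names `MetaRequest`, `pick_server`, and `merge_requests` are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaRequest:
    """One metadata operation issued by a client (hypothetical shape)."""
    op: str    # e.g. "create" or "stat"
    path: str  # full path of the target file


def pick_server(path: str, num_servers: int) -> int:
    """Map a full file path to a metadata server index.

    Assumption: hashing the whole path (not only the parent directory)
    stands in for the paper's fine-grained scheduling policy, so files
    in the same directory can land on different servers.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def merge_requests(requests, num_servers):
    """Group pending requests into one batch per metadata server.

    The proxy would then send each batch as a single large request
    instead of forwarding every client request individually.
    """
    batches = defaultdict(list)
    for req in requests:
        batches[pick_server(req.path, num_servers)].append(req)
    return dict(batches)


def demo():
    # 1000 clients each create one file in the same directory.
    pending = [
        MetaRequest("create", f"/scratch/job42/rank_{i}.out")
        for i in range(1000)
    ]
    return merge_requests(pending, num_servers=4)


if __name__ == "__main__":
    for server, batch in sorted(demo().items()):
        print(f"server {server}: {len(batch)} requests in one batch")
```

In this sketch the proxy forwards at most four batched messages instead of 1000 individual ones, and the requests are spread across servers even though they share one parent directory, which subtree partitioning alone would send to a single server.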

Key words: proxy server, high concurrency, high performance computing, parallel file system, load balancing
