Abstract:
In high-performance computing environments, a parallel file system serves a very large number of clients. These clients often issue many concurrent I/O requests within the same short period, which puts the metadata server under heavy pressure. Moreover, the concurrent read and write requests from these clients often target the same directory, which makes it difficult to distribute the workload across multiple servers. We therefore add a proxy server between the clients and the metadata servers and propose corresponding optimizations to reduce the metadata server workload. In this paper, we implement two proxy-based optimizations. First, because high-performance computing programs often access files concurrently, the proxy merges numerous small requests into one large request and then sends it to the metadata server. Second, concurrent I/O from high-performance computing programs often targets the same directory. Traditional metadata load-balancing mechanisms commonly use subtree partitioning to distribute the workload across multiple servers, but this method cannot balance load when all operations target the same directory. We achieve fine-grained load balancing by scheduling operations on the same directory across multiple metadata servers.
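
The two optimizations can be illustrated with a minimal sketch. It is not the implementation described in the paper: the function names `pick_server` and `merge_requests`, the request tuple layout, and the hash-based placement policy are all assumptions made for illustration, since the abstract does not state how operations are assigned to servers. The sketch only shows how a proxy could spread entries of one directory over several metadata servers and forward one merged batch per server instead of one message per request.

```python
import hashlib
from collections import defaultdict


def pick_server(directory: str, name: str, num_servers: int) -> int:
    """Map one directory entry to a metadata server index.

    Hashing the full path (not just the directory) is an assumed policy:
    it lets files in the same directory land on different servers.
    """
    key = f"{directory.rstrip('/')}/{name}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def merge_requests(requests, num_servers):
    """Group pending (op, directory, name) requests into one batch per server."""
    batches = defaultdict(list)
    for op, directory, name in requests:
        server = pick_server(directory, name, num_servers)
        batches[server].append((op, directory, name))
    return dict(batches)


# Hypothetical workload: 1000 ranks each create a file in the same directory.
pending = [("create", "/scratch/job42", f"rank{i}.out") for i in range(1000)]
batches = merge_requests(pending, num_servers=4)

for server, batch in sorted(batches.items()):
    print(f"server {server}: {len(batch)} operations in one merged request")
```

With subtree partitioning, all 1000 operations above would go to the single server that owns `/scratch/job42`. In this sketch they are split at the granularity of individual entries, and each server receives a single merged request.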