    Citation: Yi Jianliang, Chen Zhiguang, Xiao Nong, Lu Yutong. Proxy Based Metadata Optimization and Implementation in Parallel Filesystem[J]. Journal of Computer Research and Development, 2018, 55(2): 438-446. DOI: 10.7544/issn1000-1239.2018.20160796

    Proxy Based Metadata Optimization and Implementation in Parallel Filesystem

    • Abstract: In high-performance computing environments, a parallel file system may serve on the order of a million clients. These clients often issue large numbers of concurrent I/O requests within the same period, which places the metadata server under heavy pressure. Moreover, these concurrent read and write requests often target the same directory, which makes it difficult to distribute the metadata load across multiple servers. To address this, we add a proxy layer between the clients and the metadata servers of the parallel file system and propose corresponding optimizations to reduce the load on the metadata servers. Two optimizations are implemented on the metadata proxy: 1) Because high-performance computing programs often access large numbers of files concurrently, metadata aggregation merges many requests into a single request sent to the metadata server, which reduces its load. 2) Concurrent I/O from high-performance computing programs often targets the same directory, whereas traditional metadata load-balancing mechanisms generally use subtree partitioning to distribute load across multiple metadata servers and therefore cannot balance metadata operations on a single directory. The proxy schedules metadata operations on the same directory across multiple metadata servers, achieving fine-grained load balancing.
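
    The two proxy-side optimizations described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `MetadataProxy`, its methods, the `batch_size` threshold, and the choice of hashing the full file path with CRC32 to pick a server are all assumptions made for illustration, and the `sent` list merely stands in for network messages to the metadata servers.

```python
import zlib
from collections import defaultdict


class MetadataProxy:
    """Hypothetical proxy sitting between clients and metadata servers."""

    def __init__(self, num_servers, batch_size):
        self.num_servers = num_servers
        self.batch_size = batch_size          # assumed aggregation threshold
        self.pending = defaultdict(list)      # server index -> queued requests
        self.sent = []                        # (server, batch) pairs; stands in for network sends

    def pick_server(self, path):
        # Assumption: hash the full path, not the parent directory, so that
        # files inside one directory spread across several metadata servers.
        return zlib.crc32(path.encode("utf-8")) % self.num_servers

    def submit(self, op, path):
        # Queue the request and send one aggregated message once the batch is full.
        server = self.pick_server(path)
        self.pending[server].append((op, path))
        if len(self.pending[server]) >= self.batch_size:
            self.flush(server)

    def flush(self, server):
        batch = self.pending.pop(server, [])
        if batch:
            self.sent.append((server, batch))

    def flush_all(self):
        # Send whatever is still queued, e.g. on a timer or at shutdown.
        for server in list(self.pending):
            self.flush(server)


# Many clients create files in the same directory at roughly the same time.
proxy = MetadataProxy(num_servers=4, batch_size=8)
for rank in range(1000):
    proxy.submit("create", f"/job/output/rank{rank:04d}.dat")
proxy.flush_all()

print(len(proxy.sent), "aggregated messages instead of 1000 individual requests")
```

    In this sketch, subtree partitioning would correspond to hashing only the parent directory `/job/output`, which would send every request to one server; hashing the full path is one simple way to obtain per-file granularity.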

       
