ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2014, Vol. 51 ›› Issue (8): 1663-1670. doi: 10.7544/issn1000-1239.2014.20121094

• System Architecture •

MDDS:一种面向高性能计算的并行文件系统元数据性能提升方法

陈 起,陈左宁,蒋金虎   

  1. (Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214083) (chenqi.jn@gmail.com)
  • Published: 2014-08-15

MDDS: A Method to Improve the Metadata Performance of Parallel File System for HPC

Chen Qi, Chen Zuoning, Jiang Jinhu   

  1. (Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214083)
  • Online: 2014-08-15

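摘要 (Abstract): As computing power grows and application problems increase in scale and complexity, high performance computers place ever higher demands on parallel file system performance. In scenarios with massive numbers of small files and large-scale concurrent I/O operations, file system metadata throughput becomes the key factor limiting performance. This paper designs and implements the metadata delegation service (MDDS), which lowers the coupling between metadata services to keep the metadata cluster highly available, and manages the metadata delegate namespace as directory subtrees to avoid the complexity and inefficiency of the distributed atomic operations that cross-node directories would introduce. For the I/O-forwarding architecture used in high performance computing, two job-scheduling strategies based on metadata delegates are proposed, single-job exclusive use of a single metadata delegate and multi-job sharing of multiple metadata delegates, to balance load both between and within jobs. MDDS is evaluated on 116 storage servers. The experimental results show that metadata delegates deliver quasi-linear metadata performance and scale better than the Lustre CMD scheme in large-scale environments; the two scheduling strategies effectively spread job metadata load and ease the metadata bottleneck in high performance computing.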

关键词 (Key words): high performance computing, parallel file system, metadata delegation, I/O forwarding, load balancing

Abstract: As the computing power of supercomputers grows and the size and complexity of application problems increase, higher I/O subsystem performance is required. The throughput of a single metadata server limits the performance of a parallel file system in scenarios with highly concurrent access and high-frequency file creation and deletion. Targeting the typical parallel I/O scenario in high performance computing, MDDS (metadata delegation service) is implemented in the Lustre file system. It uses loose coupling between metadata services to keep the cluster highly available, organizes the MDDS namespace by directory subtree to avoid the complexity and inefficiency of the distributed atomic operations introduced by cross-node operations, and uses a metadata migration mechanism to avoid moving object data between data servers. For the I/O-forwarding framework, two job-scheduling strategies are proposed, one in which a single job is scheduled on a single MDDS and one in which multiple jobs share multiple MDDSs, to balance metadata request load both within and between jobs. The performance of MDDS is evaluated on 116 storage servers. The initial experimental results show that MDDS achieves quasi-linear scalability of metadata performance and scales better than Lustre CMD (clustered metadata design) in larger-scale clusters. The two job-scheduling strategies distribute applications' metadata access load effectively and alleviate the performance bottleneck in accessing file metadata in HPC.

Key words: high performance computing, parallel file system, metadata delegation service, I/O-forwarding framework, load balancing
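
The directory-subtree partitioning and the two scheduling strategies summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `SubtreeRouter`, the functions `schedule_exclusive` and `schedule_shared`, the round-robin placement, and all delegate and path names are hypothetical and chosen only for illustration. The sketch assumes that each directory subtree is owned by exactly one metadata delegate, so no operation spans two nodes.

```python
from itertools import cycle
from posixpath import normpath


class SubtreeRouter:
    """Hypothetical router: each registered subtree root is owned by one delegate."""

    def __init__(self, delegates):
        self.owner = {}                       # subtree root -> delegate name
        self._ring = cycle(list(delegates))   # round-robin placement (assumption)

    def assign(self, subtree_root, delegate=None):
        """Bind a subtree root to a delegate, round-robin if none is given."""
        root = normpath(subtree_root)
        self.owner[root] = delegate if delegate is not None else next(self._ring)
        return self.owner[root]

    def route(self, path):
        """Return the delegate owning the longest registered prefix of path."""
        path = normpath(path)
        if not path.startswith("/"):
            raise ValueError("absolute path required")
        probe = path
        while True:
            if probe in self.owner:
                return self.owner[probe]
            if probe == "/":
                raise KeyError(path)
            # Strip the last component and retry with the parent directory.
            probe = probe.rsplit("/", 1)[0] or "/"


def schedule_exclusive(jobs, delegates):
    """Single job on a single delegate: job i gets delegate i to itself."""
    if len(jobs) > len(delegates):
        raise ValueError("not enough delegates for exclusive scheduling")
    return dict(zip(jobs, delegates))


def schedule_shared(job_dirs, delegates):
    """Multiple jobs share multiple delegates.

    job_dirs maps each job to its working directories; the directories are
    spread round-robin over all delegates (an assumed placement policy).
    """
    ring = cycle(delegates)
    return {d: next(ring) for dirs in job_dirs.values() for d in dirs}
```

Because every lookup resolves to a single owning delegate, a create or unlink under `/jobs/a` touches only that delegate in this model, which is the property the abstract credits with avoiding distributed atomic operations. The shared strategy trades that isolation between jobs for balance within a job by spreading one job's directories over several delegates.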
