Proxy Based Metadata Optimization and Implementation in Parallel Filesystem
Abstract: In high-performance computing (HPC) environments, a parallel file system may serve clients numbering in the millions. These clients often issue large numbers of concurrent I/O requests within the same time window, placing the metadata servers under heavy load. Moreover, these concurrent read and write requests often target the same directory, which makes it difficult to distribute the metadata load across multiple servers. To address this, we add a proxy layer between the clients and the metadata servers of the parallel file system and propose corresponding optimizations that reduce the metadata server load. Two optimizations are implemented on the metadata proxy. First, because HPC programs often access large numbers of files concurrently, the proxy aggregates metadata requests, merging many requests into a single request that is sent to the metadata server. Second, the concurrent I/O of HPC programs often targets the same directory, whereas traditional metadata load-balancing mechanisms generally use subtree partitioning to distribute metadata load across multiple metadata servers and therefore cannot balance metadata operations on a single directory. The proxy schedules metadata operations on the same directory to multiple metadata servers, achieving fine-grained load balancing.
Keywords:
- proxy server
- high concurrency
- high performance computing
- parallel file system
- load balancing
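
The two proxy-side optimizations described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the proxy chooses a metadata server or how merged requests are formatted, so everything here is an assumption made for illustration: the names `MetaRequest`, `pick_server`, and `aggregate` are hypothetical, and hashing the full path is just one possible way to spread operations on a single directory across servers.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaRequest:
    """A single client metadata operation (hypothetical format)."""
    op: str    # e.g. "create" or "stat"
    path: str  # full path of the target file


def pick_server(path: str, num_servers: int) -> int:
    """Map a file to a metadata server index.

    Assumption: hashing the full path (not the parent directory) lets files
    in the same directory land on different servers, unlike subtree
    partitioning, which keeps a whole directory on one server.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def aggregate(requests, num_servers: int) -> dict:
    """Merge pending requests into one batch per metadata server.

    Each batch stands for a single merged request that the proxy would
    forward, so a server sees one message instead of many.
    """
    batches = defaultdict(list)
    for req in requests:
        batches[pick_server(req.path, num_servers)].append(req)
    return dict(batches)


if __name__ == "__main__":
    # 1000 concurrent creates that all target the same directory.
    pending = [MetaRequest("create", f"/scratch/job/out.{i}") for i in range(1000)]
    for server, batch in sorted(aggregate(pending, 4).items()):
        print(f"server {server}: 1 merged request carrying {len(batch)} operations")
```

In this sketch the proxy forwards at most one merged request per server for each batch window, and operations on a single directory are split among all servers. A real proxy would also have to keep directory-level state (for example, listings and link counts) consistent across servers, which the sketch does not address.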