ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2015, Vol. 52 ›› Issue (11): 2568-2576. doi: 10.7544/issn1000-1239.2015.20148038

• System Architecture •

网络分簇BWRAID:更快的扩展、恢复和读写性能

孙振元1,2,3,许鲁1,2,3,刘振军1,2,3,董欢庆1,2,刘昌1,2   

  1(中国科学院计算技术研究所 北京 100190); 2(中科蓝鲸信息技术有限公司 北京 100089); 3(中国科学院大学 北京 100049) (sunzhenyuan@nrchpc.ac.cn)
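  • Published: 2015-11-01
  • Funding: National High Technology Research and Development Program of China (863 Program) (2013AA013205); National Basic Research Program of China (973 Program) (2011CB302304); National Key Technology R&D Program of China (2011BAH04B04); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06010401); Key Deployment Project of the Chinese Academy of Sciences (KGZD-EW-103-5(7))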

Network Declustering BWRAID: Faster Scalability, Recovery and IO Performance

Sun Zhenyuan1,2,3, Xu Lu1,2,3, Liu Zhenjun1,2,3, Dong Huanqing1,2, Liu Chang1,2   

  1(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190); 2(Zhongke Blue Whale Information Technologies Co., Ltd, Beijing 100089); 3(University of Chinese Academy of Sciences, Beijing 100049)
  • Online: 2015-11-01

摘要: 存储区域网(storage area network, SAN)是重要的网络存储方法.使用商用硬件BWRAID在SAN上实现了分布式RAID.初始版本的BWRAID使用全对称结构,然而其存在3个问题:1)扩展时要读取数据重新计算校验, IO负载高、扩展时间长;2)将数据集中恢复到单个存储节点,没有分布的并发恢复;3)数据布局不合理,导致内部RAID4有大量同步更新.为解决上述问题,提出了“网络分簇BWRAID”.新系统采用“分簇RAID”(declustering RAID)的非对称结构,分簇对象是相等大小的小虚拟盘而不是数据块;在扩展时,它在节点之间仅迁移虚拟卷,不需计算校验.由于一个恢复需要的节点数量小于节点总数,多个恢复就能并行.为优化IO使用新的数据布局,按内部RAID4条带组织用户的存储空间,并给出了搜索虚拟盘的算法,用于在系统分配、扩展、恢复时,搜索合适的虚拟盘.实验表明网络分簇BWRAID更好:在系统扩展时无需重新计算校验,加速扩展5~8倍;并行恢复成倍加速;新数据布局提高了IO性能.

关键词: 网络RAID, 分簇RAID, 数据布局, 可扩展性, 恢复, 虚拟存储

Abstract: A storage area network (SAN) is an important approach to networked storage. BWRAID is a distributed RAID built on a SAN from commodity hardware. The original version of BWRAID has a symmetric architecture, which causes three problems. First, expansion reads existing data to recalculate parity, which generates a heavy IO load and takes a long time. Second, recovery rebuilds data onto a single storage node (SN) instead of running in parallel across multiple nodes. Third, the data layout is poorly suited to IO: it forces the internal RAID4 to perform many costly read-modify-write updates, even on sequential writes. To solve these problems, we propose network declustering BWRAID. It has an asymmetric architecture like a declustering RAID, but the unit of declustering is a small, equal-size virtual disk rather than a data block. Expansion only moves virtual disks between nodes and requires no parity calculation. Because each recovery involves fewer nodes than the total number of nodes in the system, multiple recoveries can run in parallel. To optimize IO, the new data layout organizes the user's storage space by internal RAID4 stripes. We also provide algorithms that search for suitable virtual disks during system allocation, expansion, and recovery. Experiments show that network declustering BWRAID outperforms the original: expansion is five to eight times faster because parity is not recalculated, parallel recovery is several times faster, and the new data layout improves IO performance.
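
The layout problem described in the abstract comes from how RAID4 maintains parity in general. The following is a minimal Python sketch of that generic behaviour, not BWRAID's implementation; the class `Raid4Stripe`, its methods, and the read counter are hypothetical names introduced only for illustration. It shows that overwriting a single block requires reading the old data and old parity (read-modify-write), while writing a whole stripe computes parity from the new data alone.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Raid4Stripe:
    """Hypothetical model of one RAID4 stripe: data blocks plus one parity block."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = xor_blocks(*self.data)
        self.reads = 0  # block reads that writes had to issue first

    def write_full_stripe(self, new_blocks):
        # Every data block is replaced, so parity depends only on the new
        # data and nothing has to be read back from the disks.
        assert len(new_blocks) == len(self.data)
        self.data = list(new_blocks)
        self.parity = xor_blocks(*self.data)

    def write_one_block(self, index, new_block):
        # Read-modify-write: fetch the old data block and the old parity,
        # then apply new_parity = old_parity XOR old_data XOR new_data.
        old_block, old_parity = self.data[index], self.parity
        self.reads += 2
        self.parity = xor_blocks(old_parity, old_block, new_block)
        self.data[index] = new_block

    def rebuild(self, lost_index):
        # A lost data block is the XOR of the parity and the surviving blocks.
        survivors = [b for i, b in enumerate(self.data) if i != lost_index]
        return xor_blocks(self.parity, *survivors)


stripe = Raid4Stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
stripe.write_one_block(1, b"\xaa\xbb")   # partial write: costs two reads
reads_after_partial = stripe.reads
stripe.write_full_stripe([b"\x00\x01", b"\x02\x03", b"\x04\x05"])
reads_after_full = stripe.reads          # unchanged: no reads were needed
```

Under this model, a layout that maps sequential user writes onto whole stripes avoids the two extra reads per block, which is consistent with the motivation the abstract gives for organizing user space by internal RAID4 stripes.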

Key words: network RAID, declustering RAID, data layout, scalability, recovery, virtual storage

CLC Number: