Abstract:
Storage area networks (SANs) are important for network storage. We built BWRAID, a distributed RAID in a SAN, from commodity hardware. The original version of BWRAID has a symmetric architecture and suffers from three problems. First, when the system expands, it reads data to recalculate parity, which consumes considerable I/O and time. Second, it recovers data onto a single storage node (SN), although recovery would be more efficient if it ran in parallel on multiple nodes. Third, its data layout is unfavorable for I/O: the internal RAID4 performs many costly read-modify-write updates even on sequential writes. To solve these problems, we propose network declustering BWRAID. It has an asymmetric architecture like a declustered RAID, but it is declustered by equal-size virtual disks instead of blocks. It expands by moving virtual disks, without recalculating parity. It runs multiple recoveries in parallel, because each recovery involves fewer nodes than the total number of nodes in the system. To optimize I/O, we change the data layout so that the user I/O space is expressed in terms of internal RAID4 stripes. We also provide algorithms that search for suitable virtual disks for system allocation, expansion, and recovery. Experiments show that network declustering BWRAID outperforms the original version: expansion without parity calculation is five to eight times faster, parallel recovery is several times faster, and the new data layout improves I/O performance.