ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2016, Vol. 53, Issue (9): 1918-1929. doi: 10.7544/issn1000-1239.2016.20150075


Concurrent Node Reconstruction for Erasure-Coded Storage Clusters

Huang Jianzhong, Cao Qiang, Huang Siti, Xie Changsheng   

  1. (Wuhan National Laboratory for Optoelectronics (Huazhong University of Science and Technology), Wuhan 430074)
  • Online: 2016-09-01

Abstract: A key design goal of erasure-coded storage clusters is to minimize the network traffic incurred by reconstruction I/Os, because less network traffic shortens reconstruction time, which in turn improves system reliability. An interleaved reconstruction scheme (IRS) is proposed for concurrently recovering two or more failed nodes. An analysis of the I/O flows of the centralized reconstruction scheme (CRec) and the decentralized reconstruction scheme (DRec) reveals that the reconstruction bottleneck lies in the manager node for CRec and in the replacement nodes for DRec. IRS improves on CRec and DRec in two respects: 1) the replacement nodes act as rebuilding nodes and handle reconstruction I/Os in parallel, thereby bypassing the storage manager used in CRec; 2) all replacement nodes collaboratively rebuild all failed blocks, exploiting structural properties of erasure codes so that each surviving block is transferred only once during reconstruction, which yields high reconstruction I/O parallelism. The three reconstruction schemes (CRec, DRec, and IRS) are implemented in a (k+r, k) Reed-Solomon-coded storage cluster on which real-world I/O traces are replayed. Experimental results show that, in an erasure-coded storage cluster with parameters k=9 and r=3, IRS outperforms both CRec and DRec in reconstruction time by factors of at least 1.63 and 2.14 for double-node and triple-node on-line reconstruction, respectively.
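
To make the traffic argument in the abstract concrete, the following is a minimal back-of-the-envelope sketch that counts block transfers per stripe when f nodes have failed. It is not the authors' implementation. The abstract states only where the bottlenecks lie and that IRS transfers each surviving block once; the per-scheme flow details below (the manager forwarding f rebuilt blocks in CRec, each replacement node fetching k blocks on its own in DRec, and a round-robin assignment of stripes to replacement nodes in the interleaved case) are assumptions made for illustration, as are all function and field names.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class StripeTraffic:
    surviving_transfers: int  # surviving blocks sent over the network
    rebuilt_transfers: int    # rebuilt blocks forwarded after decoding
    busiest_node: Fraction    # average blocks in + out per stripe at the
                              # most loaded manager/replacement node


def crec(k: int, f: int) -> StripeTraffic:
    # Assumption: the manager fetches k surviving blocks, decodes, and
    # ships one rebuilt block to each of the f replacement nodes.
    return StripeTraffic(k, f, Fraction(k + f))


def drec(k: int, f: int) -> StripeTraffic:
    # Assumption: each replacement node independently fetches its own
    # k surviving blocks, so every surviving block travels f times.
    return StripeTraffic(f * k, 0, Fraction(k))


def irs_sketch(k: int, f: int) -> StripeTraffic:
    # Assumption (hypothetical reading of "interleaved"): stripes are
    # assigned round-robin to the f replacement nodes. The assigned node
    # fetches k surviving blocks once, rebuilds all f lost blocks, keeps
    # its own, and forwards the other f - 1 to its peers.
    # Over f consecutive stripes, one replacement node handles:
    #   k in + (f - 1) out as rebuilder, plus 1 in for each of the
    #   other f - 1 stripes.
    blocks_over_f_stripes = (k + (f - 1)) + (f - 1)
    return StripeTraffic(k, f - 1, Fraction(blocks_over_f_stripes, f))


if __name__ == "__main__":
    k = 9  # data blocks per stripe, as in the reported experiments
    for f in (2, 3):  # double-node and triple-node failures
        for name, model in (("CRec", crec), ("DRec", drec),
                            ("IRS sketch", irs_sketch)):
            print(f"f={f} {name}: {model(k, f)}")
```

Under these assumptions, with k=9 and three failed nodes the DRec model moves 27 surviving blocks per stripe while the interleaved sketch moves 9, and the per-node load drops because the rebuilding work rotates among the replacement nodes. These counts only illustrate the direction of the effect; they are not the reported speedups, which come from replaying real-world traces.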

Key words: erasure codes, clustered storage, storage reliability, node reconstruction, interleaved reconstruction
