Abstract:
A key design goal of erasure-coded storage clusters is to minimize network traffic incurred by reconstruction I/Os, because reducing network traffic helps to shorten reconstruction time, which in turn leads to high system reliability. An interleaved reconstruction scheme (IRS) is proposed to address the issue of concurrently recovering two and more failed nodes. With analyzing the I/O flows of centralized reconstruction scheme (CRec) and decentralized reconstruction scheme (DRec), it is revealed that reconstruction performance bottleneck lies in the manager node for CRec and replacement nodes for DRec. IRS improves CRec and DRec from two aspects: 1) acting as rebuilding nodes, replacement nodes deal with reconstruction I/Os in a parallel manner, thereby bypassing the storage manager in CRec; 2) all replacement nodes collaboratively rebuild all failed blocks, exploiting structural properties of erasure codes to transfer each surviving block only once during the reconstruction process, and achieving high reconstruction I/O parallelism. The three reconstruction schemes (i.e., CRec, DRec, and IRS) are implemented under (k+r, k) Reed-Solomon-coded storage clusters where real-world I/O traces are replayed. Experimental results show that, under an erasure-coded storage cluster with parameters k=9 and r=3, IRS outperforms both CRec and DRec schemes in terms of reconstruction time by a factor of at least 1.63 and 2.14 for double-node and triple-node on-line reconstructions, respectively.