
    Concurrent Node Reconstruction for Erasure-Coded Storage Clusters

    Abstract: A key design goal of erasure-coded storage clusters is to reduce the network traffic caused by reconstruction I/O, because less traffic shortens reconstruction time, which in turn improves system reliability. An interleaved reconstruction scheme (IRS) is proposed for concurrently recovering two or more failed nodes, in which all replacement nodes rebuild all failed blocks collaboratively and in parallel. An analysis of the I/O flows of the existing centralized reconstruction scheme (CRec) and decentralized reconstruction scheme (DRec) shows that the reconstruction bottleneck is the storage manager in CRec and the replacement nodes in DRec. IRS improves on CRec and DRec in two respects: 1) replacement nodes act as rebuilding nodes and process reconstruction I/O in parallel, which removes the CRec storage manager as a bottleneck; 2) all replacement nodes collaboratively rebuild all failed blocks, exploiting the structural properties of erasure codes so that each required surviving block is transferred only once during reconstruction, which yields high reconstruction I/O parallelism. The three schemes (CRec, DRec, and IRS) are implemented on (k+r, k) Reed-Solomon-coded storage clusters and compared by replaying real-world I/O traces. Experimental results show that, with coding parameters k=9 and r=3, IRS outperforms both CRec and DRec in reconstruction time by factors of at least 1.63 and 2.14 for double-node and triple-node on-line reconstruction, respectively.
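The traffic argument in the abstract can be illustrated with a toy counting model. The sketch below is not the paper's analysis: the function name `surviving_block_transfers` and the per-scheme counts are assumptions made for illustration. It assumes that, per stripe, the CRec manager fetches k surviving blocks once, that each DRec replacement node fetches its own k surviving blocks independently, and that IRS sends each required surviving block once, as the abstract states. It counts only surviving-block transfers and ignores rebuilt-block forwarding and any exchange between replacement nodes, so it does not reproduce the measured 1.63 and 2.14 factors.

```python
def surviving_block_transfers(k: int, r: int, failed: int) -> dict:
    """Per-stripe surviving-block transfers under a toy model (hypothetical).

    k      -- number of data blocks per stripe
    r      -- number of parity blocks per stripe
    failed -- number of nodes being rebuilt concurrently
    """
    if not 1 <= failed <= r:
        raise ValueError("failed must be between 1 and r")
    return {
        # Assumption: the manager pulls k surviving blocks once per stripe.
        "CRec": k,
        # Assumption: every replacement node pulls its own k blocks.
        "DRec": k * failed,
        # From the abstract: each required surviving block is sent once.
        "IRS": k,
    }


if __name__ == "__main__":
    # Coding parameters used in the abstract's experiments.
    for failed in (2, 3):
        print(failed, surviving_block_transfers(k=9, r=3, failed=failed))
```

Under these assumptions CRec and IRS move the same number of surviving blocks, so the difference between them would come from where the work lands (a single manager versus parallel replacement nodes), while DRec's traffic grows with the number of failed nodes.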

       
