Tang Yingjie, Wang Fang, Xie Yanwen. An Efficient Failure Reconstruction Based on In-Network Computing for Erasure-Coded Storage Systems[J]. Journal of Computer Research and Development, 2019, 56(4): 767-778. DOI: 10.7544/issn1000-1239.2019.20170834
(Wuhan National Laboratory for Optoelectronics (Huazhong University of Science and Technology), Wuhan 430074) (Key Laboratory of Information Storage System (Huazhong University of Science and Technology), Ministry of Education, Wuhan 430074) (Shenzhen Huazhong University of Science and Technology Research Institute, Shenzhen, Guangdong 518000)
As distributed storage systems continue to grow in scale, they face a constant risk of data loss, whether the underlying devices are disks or solid-state drives. Traditional storage systems keep three replicas of each data block to ensure high reliability. Many distributed storage systems are now shifting to erasure codes, which offer higher reliability at lower storage overhead. Erasure codes have a clear drawback, however: reconstructing an unavailable block requires reading from multiple disks, which generates a large amount of network traffic and disk I/O and therefore a high recovery overhead. This paper presents INP (in-network pipeline), an efficient failure reconstruction scheme based on in-network computing that uses SDN (software-defined networking) technology to reduce recovery overhead without sacrificing other performance. INP uses the global network topology information held by the SDN controller to build a reconstruction tree and transmits data along it. The switches perform part of the computation, which reduces network traffic, eliminates the network bottleneck, and improves recovery performance. We evaluate the recovery efficiency of INP under different network bandwidths. Compared with a common erasure-coded system, INP greatly reduces network traffic, and at a certain bandwidth its degraded read time is the same as that of a normal read.
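The idea of letting switches perform part of the reconstruction can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest possible erasure code (a single XOR parity block over three data blocks), whereas a Reed-Solomon code would also multiply each block by a coefficient in a Galois field. The tree layout and the names `client`, `core`, `tor1`, `tor2`, `n0`, `n2`, and `np` are hypothetical. Each switch combines the blocks arriving from its children and forwards a single partial result upward.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def aggregate(node, tree, blocks, link_load):
    """Return the partial result that `node` forwards to its parent.

    tree:      maps an inner node (switch or client) to its children
    blocks:    maps a storage node to the surviving block it holds
    link_load: counts blocks sent over each (child, parent) link
    """
    if node in blocks:  # leaf: a storage node sends its own block
        return blocks[node]
    partials = []
    for child in tree[node]:
        partials.append(aggregate(child, tree, blocks, link_load))
        link_load[(child, node)] = link_load.get((child, node), 0) + 1
    # Inner node: combine the children's blocks into one partial result.
    return xor_blocks(partials)

# Hypothetical stripe: three data blocks plus one XOR parity block.
d0, d1, d2 = b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([d0, d1, d2])

# Assume d1 is unavailable; the remaining blocks sit on three storage nodes.
blocks = {"n0": d0, "n2": d2, "np": parity}

# Hypothetical reconstruction tree rooted at the reading client.
tree = {
    "client": ["core"],
    "core": ["tor1", "tor2"],
    "tor1": ["n0", "n2"],
    "tor2": ["np"],
}

link_load = {}
recovered = aggregate("client", tree, blocks, link_load)
```

In this sketch every link in the tree carries exactly one block, so the client's link receives one block rather than the three it would receive if it fetched every surviving block itself and decoded locally.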