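• China's Outstanding Science and Technology Journal
  • CCF-recommended Class A Chinese journal
  • T1-class high-quality science and technology journal in the computing field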
Bao Han, Wang Yijie. A Fast Construction Method of the Erasure Code with Small Cross-Cloud Data Center Repair Traffic[J]. Journal of Computer Research and Development, 2023, 60(10): 2418-2439. DOI: 10.7544/issn1000-1239.202220580

A Fast Construction Method of the Erasure Code with Small Cross-Cloud Data Center Repair Traffic

Funds: This work was supported by the National Key Research and Development Program of China (2016YFB1000101), the National Natural Science Foundation of China (61379052), the Science Foundation of Ministry of Education of China (2018A02002), and the Natural Science Foundation for Distinguished Young Scholars of Hunan Province (14JJ1026).
  • Author Bio:

    Bao Han: born in 1992. PhD. His main research interests include cloud storage and erasure coding

    Wang Yijie: born in 1971. PhD, professor, PhD supervisor. Distinguished member of CCF. Her main research interests include distributed storage, big data analysis, and cloud computing

  • Received Date: June 15, 2022
  • Revised Date: September 15, 2022
  • Available Online: April 17, 2023
  • Abstract: Compared with cross-cloud data center replication, cross-cloud data center erasure coding is more reliable and more space-efficient. However, existing cross-cloud data center erasure codes cannot simultaneously achieve low cross-cloud data center repair traffic, high adaptability to encoding parameters, and high construction efficiency, so they are rarely used in production. We propose FMEL, a fast construction method for erasure codes with small cross-cloud data center repair traffic, which quickly obtains such a code under different encoding parameters. Specifically, FMEL converts erasure code repair group distribution schemes and the corresponding encoding parameters into fixed-length feature vectors, and verifies whether each distribution scheme matches the encoding parameters by classifying its feature vector with a support vector machine: a feature vector classified as positive indicates that the corresponding distribution scheme passes verification. FMEL then uses a parallel search algorithm to pick, from all distribution schemes that pass verification, the one with the smallest cross-cloud data center repair traffic, and converts it into the generator matrix of the erasure code. Experiments in a cross-cloud data center environment show that, under most encoding parameters, FMEL constructs an optimal code that achieves the lower bound of cross-cloud data center repair traffic. Meanwhile, FMEL's erasure code construction time is 89% less than the optimal-code construction time of existing work. Compared with several popular erasure codes, the erasure code constructed by FMEL reduces cross-cloud data center repair traffic by 42.9% to 56.0%.
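
    The pipeline summarized in the abstract (feature vector, verification, parallel search) can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the authors' implementation. The scheme representation (per-group block counts per data center), the padding limits `MAX_GROUPS` and `MAX_DCS`, the rule-based `stand_in_classifier` used in place of a trained support vector machine, and the traffic model in `cross_dc_traffic` are all hypothetical. The final conversion to a generator matrix is omitted.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed padding limits so that every scheme maps to a vector of equal length.
MAX_GROUPS = 4
MAX_DCS = 3

def to_feature_vector(scheme, params):
    """Flatten encoding parameters plus a scheme into a fixed-length vector.

    scheme: list of repair groups, each a tuple of block counts per data center.
    params: (k, r) = data blocks and parity blocks (hypothetical parameter set).
    """
    vec = list(params)
    for g in range(MAX_GROUPS):
        row = scheme[g] if g < len(scheme) else ()
        vec.extend(row[d] if d < len(row) else 0 for d in range(MAX_DCS))
    return vec

def stand_in_classifier(vec):
    """Placeholder for the trained SVM: accept a vector only if the scheme
    places exactly k + r blocks. A real classifier would be learned from data."""
    k, r = vec[0], vec[1]
    return sum(vec[2:]) == k + r

def cross_dc_traffic(scheme):
    """Toy cost model (an assumption): repairing a block reads every other
    block of its group, and reads from other data centers count as traffic.
    Returns the average number of cross-data-center reads per block."""
    total_blocks = sum(sum(group) for group in scheme)
    cross_reads = sum(c * (sum(group) - c) for group in scheme for c in group)
    return cross_reads / total_blocks

def evaluate(scheme, params):
    """Verify one scheme; return (traffic, scheme) or None if it is rejected."""
    if not stand_in_classifier(to_feature_vector(scheme, params)):
        return None
    return cross_dc_traffic(scheme), scheme

def pick_best(schemes, params, workers=4):
    """Evaluate all schemes in parallel and keep the one with least traffic."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda s: evaluate(s, params), schemes))
    passed = [r for r in results if r is not None]
    return min(passed, key=lambda r: r[0]) if passed else None

params = (4, 2)
schemes = [
    [(2, 1, 0), (1, 2, 0)],  # each group is split across two data centers
    [(3, 0, 0), (0, 3, 0)],  # each group sits inside one data center
    [(3, 0, 0), (0, 2, 0)],  # only 5 blocks placed, so verification rejects it
]
best = pick_best(schemes, params)
print(best)
```

    In this toy model the second scheme wins because every repair stays inside one data center. A realistic verifier would also reject schemes that violate the fault-tolerance requirements of the encoding parameters, which is the role the abstract assigns to the support vector machine.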
