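• Outstanding Chinese Sci-Tech Journal
  • CCF-recommended Class A Chinese-language journal
  • T1 high-quality sci-tech journal in the computing field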
Zhang Zilin, Liu Duo, Tan Yujuan, Wu Yu, Luo Longpan, Wang Weilüe, Qiao Lei. An Erasure-Coded Data Update Method for Distributed Storage Clusters[J]. Journal of Computer Research and Development, 2022, 59(11): 2451-2466. DOI: 10.7544/issn1000-1239.20210211

An Erasure-Coded Data Update Method for Distributed Storage Clusters

Funds: This work was supported by the National Natural Science Foundation of China (62072059) and the Funds for Chongqing Distinguished Young Scholars (cstc2020jcyj-jqX0012).
  • Published Date: October 31, 2022
  • Erasure coding is widely deployed in distributed storage clusters to provide data reliability, but disk I/O overhead becomes a performance bottleneck under update-intensive workloads. On the one hand, traditional data update strategies must read the original data chunk before writing the new data. Under intensive updates, this frequent write-after-read seriously degrades the write performance of the storage cluster. On the other hand, updating a parity chunk involves reading increments that are randomly distributed in the log file and merging them with the data file, which introduces additional disk seek overhead. This paper proposes a data update method named PARD (parity logging with reserved space and data delta) to solve these problems. The main idea of PARD is to exploit the linearity of erasure coding to reduce write-after-read, and to exploit disk characteristics to reduce seek overhead. PARD comprises three key design features: 1) It adopts in-place data updates and log-based parity updates. 2) It exploits the linearity of erasure coding to construct the log from data increments. For a series of write requests to the same data chunk, only the first update needs to read the original data chunk; subsequent updates are pure writes, which markedly reduces write-after-read. 3) It reserves space for the log at the end of the data file, in keeping with disk characteristics, to reduce the seek overhead of reading and writing the log. Experiments show that with a chunk size of 4 MB, PARD improves update throughput by at least 30.4%, 47.0%, and 82.0% compared with PLR, PARIX, and FO, respectively.
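
The linearity argument in design feature 2 can be illustrated with a small sketch. It is not PARD's implementation: the `DeltaLog` class, the in-memory chunk list, the GF(2^8) reducing polynomial 0x11d, and the choice to cache the original chunk contents are all assumptions made for illustration. What it shows is only the general property that, for a parity of the form P = c_0·D_0 + c_1·D_1 + ... over a finite field, a single delta (new data XOR original data) per chunk is enough to bring the parity up to date, so repeated updates to the same chunk need to read the original only once.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by 0x11d (an assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def scale(coeff: int, chunk: bytes) -> bytes:
    """Multiply every byte of a chunk by one coding coefficient."""
    return bytes(gf_mul(coeff, x) for x in chunk)


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR, which is addition in GF(2^8)."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(coeffs, chunks) -> bytes:
    """Compute one parity chunk as a linear combination of the data chunks."""
    parity = bytes(len(chunks[0]))
    for coeff, chunk in zip(coeffs, chunks):
        parity = xor(parity, scale(coeff, chunk))
    return parity


class DeltaLog:
    """Hypothetical per-chunk log: one cached original and one delta per chunk."""

    def __init__(self):
        self.original = {}  # chunk index -> contents before the first update
        self.delta = {}     # chunk index -> latest data XOR original
        self.reads = 0      # how many times an old chunk had to be read

    def update(self, chunks, idx: int, new_data: bytes) -> None:
        if idx not in self.original:
            # Only the first update to a chunk reads the old contents.
            self.original[idx] = chunks[idx]
            self.reads += 1
        # Later updates overwrite the delta without touching the data file.
        self.delta[idx] = xor(self.original[idx], new_data)
        chunks[idx] = new_data  # in-place data update

    def merge(self, coeffs, parity: bytes) -> bytes:
        """Fold all logged deltas into the parity chunk and clear the log."""
        for idx, delta in self.delta.items():
            parity = xor(parity, scale(coeffs[idx], delta))
        self.original.clear()
        self.delta.clear()
        return parity
```

In this sketch, applying several updates to the same chunk through `DeltaLog.update` and then calling `merge` yields the same parity as re-encoding from scratch with `encode_parity`, while `reads` stays at one for that chunk. How PARD actually lays out the log in the reserved space at the end of the data file is not described in the abstract and is not modeled here.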
