An Erasure-Coded Data Update Method for Distributed Storage Clusters

Zhang Zilin; Liu Duo; Tan Yujuan; Wu Yu; Luo Longpan; Wang Weilüe; Qiao Lei

doi:10.7544/issn1000-1239.20210211

CSTR: 32373.14.issn1000-1239.20210211

Citation: Zhang Zilin, Liu Duo, Tan Yujuan, Wu Yu, Luo Longpan, Wang Weilüe, Qiao Lei. An Erasure-Coded Data Update Method for Distributed Storage Clusters[J]. Journal of Computer Research and Development, 2022, 59(11): 2451-2466. DOI: 10.7544/issn1000-1239.20210211

¹(College of Computer Science, Chongqing University, Chongqing 400044)
²(Beijing Institute of Control Engineering, Beijing 100080)

Funds: This work was supported by the National Natural Science Foundation of China (62072059) and the Funds for Chongqing Distinguished Young Scholars (cstc2020jcyj-jqX0012).

Published Date: October 31, 2022

Abstract

Erasure coding is widely deployed in distributed storage clusters to provide data reliability, but disk I/O overhead becomes a performance bottleneck under update-intensive workloads. First, traditional update strategies must read the original data chunk before writing the new data; when updates are intensive, this frequent write-after-read pattern seriously degrades the write performance of the cluster. Second, updating a parity chunk requires reading increments (deltas) that are randomly distributed in the log file and merging them with the data file, which adds disk seek overhead. This paper proposes a data update method named PARD (parity logging with reserved space and data delta) to address both problems. Its main idea is to exploit the linearity of erasure-code computation to reduce write-after-read operations, and to exploit disk characteristics to reduce disk seek overhead. PARD has three key design features: 1) It adopts in-place updates for data chunks and log-based updates for parity chunks. 2) It uses the linearity of erasure coding to construct the log from data increments: for a series of write requests to the same data chunk, only the first update needs to read the original data chunk, and subsequent updates are pure writes, which markedly reduces write-after-read. 3) In line with disk characteristics, it reserves space for the log at the end of the data file to reduce the disk seek overhead of reading and writing the log. Experiments show that with a chunk size of 4 MB, PARD improves update throughput by at least 30.4%, 47.0%, and 82.0% compared with PLR, PARIX, and FO, respectively.
Keywords: erasure codes; storage cluster; data update; delta; reserved space
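
The linearity argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single XOR parity chunk as a stand-in for a general linear erasure code, and the names `UpdatableChunk`, `take_delta`, and `run_demo` are hypothetical. Because the parity is a linear function of the data, the parity change caused by a series of updates to one chunk depends only on the chunk's original and latest contents, so the original needs to be read only on the first update.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bytewise XOR, which is addition over GF(2^8)."""
    if len(a) != len(b):
        raise ValueError("chunks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(chunks):
    """Single XOR parity: a stand-in for a general linear erasure code."""
    return reduce(xor_bytes, chunks)


class UpdatableChunk:
    """Hypothetical data chunk that is updated in place."""

    def __init__(self, content):
        self.content = content
        self.original = None        # pre-update contents, kept until merged
        self.reads_of_old_data = 0  # number of write-after-read operations

    def write(self, new_content):
        if self.original is None:
            # First update since the last parity merge: read old data once.
            self.original = self.content
            self.reads_of_old_data += 1
        # Every later update is a pure write.
        self.content = new_content

    def take_delta(self):
        """Return (latest XOR original) and reset the pending state."""
        if self.original is None:
            return bytes(len(self.content))  # all zeros: nothing pending
        delta = xor_bytes(self.original, self.content)
        self.original = None
        return delta


def run_demo():
    chunks = [UpdatableChunk(b"AAAA"), UpdatableChunk(b"BBBB"),
              UpdatableChunk(b"CCCC")]
    parity = encode_parity([c.content for c in chunks])
    # Three successive updates to the same data chunk.
    for new_content in (b"xxxx", b"yyyy", b"zzzz"):
        chunks[0].write(new_content)
    # Apply one merged delta to the parity instead of three separate ones.
    parity = xor_bytes(parity, chunks[0].take_delta())
    recomputed = encode_parity([c.content for c in chunks])
    return parity == recomputed, chunks[0].reads_of_old_data
```

In this sketch, `run_demo()` applies a single merged delta after three writes and compares the result with a parity recomputed from scratch. With a Reed-Solomon style code the delta would additionally be multiplied by the encoding coefficient of the updated chunk; that step is omitted here.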

Cited By

Cited by periodicals (2)

1.	Li Guang. Quality Management Strategies for Construction Projects Based on Blockchain Technology. 中国建筑装饰装修 (China Building Decoration and Renovation). 2025(02): 75-77.
2.	Jing He, Xiaofeng Ma, Dawei Zhang, Feng Peng. Supervised and revocable decentralized identity privacy protection scheme. Security and Safety. 2024(04): 113-135.