• 中国精品科技期刊
  • CCF推荐A类中文期刊
  • 计算领域高质量科技期刊T1类
Advanced Search
Xie Wenbing, Guan Ruixue, Zhang Yiming, Li Jiamei, Wang Jun. Efficient Optimization of Erasure Coding for Storage Library[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440091
Citation: Xie Wenbing, Guan Ruixue, Zhang Yiming, Li Jiamei, Wang Jun. Efficient Optimization of Erasure Coding for Storage Library[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440091

Efficient Optimization of Erasure Coding for Storage Library

More Information
  • Author Bio:

    Xie Wenbing: born in 1989. PhD candidate. His main research interests include dynamic binary translation, runtime libraries optimization, program analysis, compiler optimization

    Guan Ruixue: born in 1993. Master. Her main research interests include dynamic binary instrumentation and runtime libraries optimization

    Zhang Yiming: born in 1993. Master. Her main research interests include dynamic binary instrumentation, debug tools, runtime libraries optimization

    Li Jiamei: born in 1996. Master. Her main research interests include program analysis, runtime libraries optimization

    Wang Jun: born in 1980. Master. His main research interests include program analysis, autonomous and controllable basic software ecology

  • Received Date: February 02, 2024
  • Revised Date: October 10, 2024
  • Accepted Date: October 14, 2024
  • Available Online: October 21, 2024
  • In the information stage, the importance of data storage lies in ensuring the reliability, consistency, security, and real-time accessibility of information. Erasure codes (EC) play a crucial role in data storage systems due to their ability to minimize storage overhead and handle multiple component failures. However, the process of encoding and decoding EC involves intensive computation, impacting storage system efficiency. This paper focuses on optimizing EC, with a special emphasis on the Galois field (GF) multiplication within multi-layer loops, a time-consuming aspect of EC. We first evaluate the pros and cons of three methods for GF multiplication calculation: the log table searching method, the complete multiplication table searching method, and the shift decomposition method. Subsequently, a 4 b splitting (SP) method is proposed to reduce memory access overhead during table searching in GF(28). We delve into the SP’s analysis and leverage the 64 b modern processor architecture and vector instruction set characteristics to introduce data-level parallelism in multi-layer loops. This involves amplifying data access granularity and implementing single instruction multiple data (SIMD) vectorization. Based on the open-source Intel storage acceleration library (ISA-L), all optimization methods are implemented and tested on the Sunway processor and the x86 processor. The experimental results show the effectiveness of proposed optimization in improving EC performance across different data scalability scenarios. When compared to the original ISA-L, our optimizations exhibit an average performance speedup of 3.28x on the Sunway, 2.36x on the x86.

  • [1]
    Plank J S. A tutorial on reed-solomon coding for fault-tolerance in RAID-like systems[J]. Software Practice & Experience, 2015, 27(9): 995−1012
    [2]
    Zhou Tianli, Tian Chao. Fast erasure coding for data storage: A comprehensive study of the acceleration techniques[J]. ACM Transactions on Storage, 2020, 16(1): 1−24
    [3]
    Chen P M, Lee E K, Gibson G A, et al. RAID: High-performance, reliable secondary storage[J]. ACM Computing Surveys, 1997, 26(2): 145−185
    [4]
    Wilcox-O’Hearn Z, Warner B. Tahoe: The least-authority filesystem[C]//Proc of the 4th ACM Int Workshop on Storage Security and Survivability. New York: ACM, 2008: 21−26
    [5]
    Ishengoma F. HDFS+: Erasure-coding based Hadoop distributed file system[J]. International Journal of Scientific & Technology Research, 2013, 2(9): 190−197
    [6]
    Hu Yuchong, Cheng Liangfeng, Yao Qiaori, et al. Exploiting combined locality for wide-stripe erasure coding in distributed storage[C]//Proc of the 19th USENIX Conf on File and Storage Technologies (FAST21). New York: ACM. USENIX Association, 2021: 233−248
    [7]
    Bowers K D, Juels A, Oprea A. HAIL: A high-availability and integrity layer for cloud storage[C]//Proc of the 16th ACM Conf on Computer and Communications Security. New York: ACM, 2009: 187−198
    [8]
    Shi Haiyang, Lu Xiaoyi, Shankar et al. UMR-EC: A unified and multi-rail erasure coding library for high-performance distributed storage systems[C]//Proc of the 28th Int Symp on High-Performance Parallel and Distributed Computing (HPDC ’19). New York: ACM, 2019: 219−230
    [9]
    Shi Haiyang, Lu Xiaoyi, Panda D K. EC-Bench: Benchmarking onload and offload erasure coders on modern hardware architectures[C]//Proc of the 2018 Benchmarking, Measuring, and Optimizing: First BenchCouncil Int Symp. New York: ACM, 2019: 215−230
    [10]
    Detchart J, Lacan J. Improving the coding speed of erasure codes with polynomial ring transforms[C/OL]//Proc of the 2017 IEEE Global Communications Conf. Piscataway, NJ: IEEE, 2017[2024-09-27]. https://dl.acm.org/doi/10.1109/GLOCOM.2017.8255009
    [11]
    Uezato Y Y. Accelerating XOR-based erasure coding using program optimization techniques[C/OL]//Proc of the Int Conf for High Performance Computing, Networking, Storage and Analysis(SC21). New York: ACM, 2021[2024-09-27]. https://doi.org/10.1145/3458817.3476204
    [12]
    Blomer J, Kalfane M, Karp R, et al. An XOR-based erasure-resilient coding scheme[R]. Berkeley, CA: Technical Report at International Computer Science Institute, 1995
    [13]
    Luo Jianqiang, Shrestha, Mochan, XuLihao et al. Efficient encoding schedules for XOR-based erasure codes[J]. IEEE Computer Society, 2014, 63(9): 2259−2272
    [14]
    Feng Kai, Ma Wentao, Huang Wei et al. Speeding up galois field arithmetic on Intel MIC architecture[G]//LNTCS 2172: Proc of the 10th IFIP Int Conf on Network and Parallel Computing. Berlin: Springer, 2013: 143−154
    [15]
    Günther S M, Riemensberger M, Utschick W. Efficient GF arithmetic for linear network coding using hardware SIMD extensions[C/OL]//Proc of the 2014 Int Symp on Network Coding. Piscataway, NJ: IEEE, 2014[2024-10-01]. https://ieeexplore.ieee.org/document/6892123
    [16]
    Shin S R, Choo S Y, Park J S. Accelerating random network coding using 512-bit SIMD instructions[C]//Proc of the 2019 Int Conf on Information and Communication Technology Convergence (ICTC). Piscataway, NJ: IEEE, 2019: 1099−1103
    [17]
    Intel. Optimizing storage solutions using the Intel® intelligent storage acceleration library[EB/OL]. (2014-09-24)[2024-05-30]. https://www.intel.com/content/www/us/en/developer/articles/technical/optimizing-storage-solutions-using-the-intel-intelligent-storage-acceleration-library.html
    [18]
    Macwilliams F J, Sloane N J A. The Theory of Error-correcting Codes. Parts I, II[M]. Amsterdam: North-Holland, 1977
    [19]
    Plank J S. Jerasure: A library in C/C++ facilitating erasure coding for storage applications[R]. Knoxville: University of Tennessee, 1995
    [20]
    Ueno R, Homma N, Sugawara Y. Highly efficient GF(2^8) nversion circuit based on redundant GF arithmetic and its application to AES design[G]//LNCS 9293: Proc of the 2015 Cryptographic Hardware and Embedded Systems(CHES’15). Berlin: Springer, 2015: 63−80
    [21]
    Bakhvalov D. 现代CPU性能分析与优化[M]. 朱金鹏,李成栋,译. 北京:机械工业出版社,2023

    Bakhvalov D. Performance Analysis and Tuning on Modern CPUs[M]. Translated by Zhu Jinpeng, Li Chengdong. Beijing: China Machine Press, 2023 (in Chinese)
    [22]
    Robey R, Zamora Y. Parallel and High Performance Computing[M]. Greenwich: Manning, 2021
  • Related Articles

    [1]Liu Le, Guo Shengnan, Jin Xiyuan, Zhao Miaomiao, Chen Ran, Lin Youfang, Wan Huaiyu. Spatial-Temporal Traffic Data Imputation Method with Uncertainty Modeling[J]. Journal of Computer Research and Development, 2025, 62(2): 346-363. DOI: 10.7544/issn1000-1239.202330455
    [2]Xu Xiao, Ding Shifei, Sun Tongfeng, Liao Hongmei. Large-Scale Density Peaks Clustering Algorithm Based on Grid Screening[J]. Journal of Computer Research and Development, 2018, 55(11): 2419-2429. DOI: 10.7544/issn1000-1239.2018.20170227
    [3]Wang Haiyan, Xiao Yikang. Dynamic Group Discovery Based on Density Peaks Clustering[J]. Journal of Computer Research and Development, 2018, 55(2): 391-399. DOI: 10.7544/issn1000-1239.2018.20160928
    [4]Ren Lifang, Wang Wenjian, Xu Hang. Uncertainty-Aware Adaptive Service Composition in Cloud Computing[J]. Journal of Computer Research and Development, 2016, 53(12): 2867-2881. DOI: 10.7544/issn1000-1239.2016.20150078
    [5]Xu Zhengguo, Zheng Hui, He Liang, Yao Jiaqi. Self-Adaptive Clustering Based on Local Density by Descending Search[J]. Journal of Computer Research and Development, 2016, 53(8): 1719-1728. DOI: 10.7544/issn1000-1239.2016.20160136
    [6]Xu Min, Deng Zhaohong, Wang Shitong, Shi Yingzhong. MMCKDE: m-Mixed Clustering Kernel Density Estimation over Data Streams[J]. Journal of Computer Research and Development, 2014, 51(10): 2277-2294. DOI: 10.7544/issn1000-1239.2014.20130718
    [7]Qi Yafei, Wang Yijie, and Li Xiaoyong. A Skyline Query Method over Gaussian Model Uncertain Data Streams[J]. Journal of Computer Research and Development, 2012, 49(7): 1467-1473.
    [8]Pan Weimin and He Jun. Neuro-Fuzzy System Modeling with Density-Based Clustering[J]. Journal of Computer Research and Development, 2010, 47(11): 1986-1992.
    [9]Chen Jianmei, Lu Hu, Song Yuqing, Song Shunlin, Xu Jing, Xie Conghua, Ni Weiwei. A Possibility Fuzzy Clustering Algorithm Based on the Uncertainty Membership[J]. Journal of Computer Research and Development, 2008, 45(9): 1486-1492.
    [10]Ma Liang, Chen Qunxiu, and Cai Lianhong. An Improved Model for Adaptive Text Information Filtering[J]. Journal of Computer Research and Development, 2005, 42(1): 79-84.
  • Cited by

    Periodical cited type(0)

    Other cited types(1)

Catalog

    Article views (29) PDF downloads (11) Cited by(1)

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return