Wei Zheng, Dou Yu, Gao Yanzhen, Ma Jie, Sun Ninghui, Xing Jing. A Consistent Hash Data Placement Algorithm Based on Stripe[J]. Journal of Computer Research and Development, 2021, 58(4): 888-903. DOI: 10.7544/issn1000-1239.2021.20190732

A Consistent Hash Data Placement Algorithm Based on Stripe

Funds: This work was supported by the National Key Research and Development Program of China (2018YFC0809300), the National Natural Science Foundation of China (61502454), and the Distributed Full Flash Project of ECR Team of Lenovo Research Institute.
  • Published Date: March 31, 2021
  • As the carrier of data, distributed storage systems are widely used in big data applications. Erasure codes are widely adopted by storage systems because they combine high space efficiency with reliable data storage. In EB-scale erasure-coded distributed storage systems, metadata management is costly, and the efficiency of querying metadata such as location information affects I/O latency and throughput. Centralized data placement algorithms, which record location information explicitly, require frequent access to metadata servers, and this limits performance optimization. Decentralized data placement algorithms based on hash mapping are therefore increasingly used. However, they face problems during node changes and data recovery, such as difficulty in changing data locations, large volumes of migrated data, and low concurrency of data recovery and migration. This paper proposes a stripe-based consistent hash data placement algorithm (SCHash). SCHash places data in units of stripes. By transforming the mapping from data blocks to nodes into a mapping from stripes to node groups, it reduces the amount of data migrated when nodes change. As a result, the proportion of data migrated during recovery is reduced and recovery is accelerated. On the basis of SCHash, this paper designs and implements a stripe-based parallel I/O scheduling recovery strategy. The strategy avoids selecting data blocks on the same node within an I/O operation, which increases I/O parallelism. Compared with the APHash algorithm, SCHash reduces data transfer during data recovery by 46.71% to 85.28%. The recovery rate improves by 48.16% when nodes are rebuilt within the stripe and by 138.44% when nodes are rebuilt outside the stripe.
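
The idea of mapping stripes to node groups, rather than individual blocks to nodes, can be illustrated with a generic consistent-hash ring. The sketch below is not the SCHash implementation from the paper: the class `StripeRing`, the helper `block_location`, the MD5-based hash, the virtual-node count, and the clockwise walk that collects distinct nodes are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash of a string key (assumed choice: MD5 prefix)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class StripeRing:
    """Hypothetical consistent-hash ring mapping a stripe ID to a node group."""

    def __init__(self, nodes, vnodes=64):
        # Each node contributes `vnodes` points on the ring.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._points]
        self._node_count = len(set(nodes))

    def node_group(self, stripe_id: str, width: int):
        """Return `width` distinct nodes, walking clockwise from the stripe's hash."""
        if width > self._node_count:
            raise ValueError("stripe width exceeds number of nodes")
        start = bisect.bisect_right(self._keys, _hash(stripe_id))
        group = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in group:
                group.append(node)
                if len(group) == width:
                    break
        return group


def block_location(ring, stripe_id, block_index, width):
    """Assumed layout: block i of a stripe is stored on the i-th node of its group."""
    return ring.node_group(stripe_id, width)[block_index]


ring = StripeRing([f"node{i}" for i in range(10)])
group = ring.node_group("stripe-42", width=6)  # e.g. a 4+2 erasure-coded stripe
```

Because the location of every block follows from a single lookup of its stripe, a client needs only the stripe ID and the block index to find a block, without consulting a metadata server.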