Wei Zheng, Dou Yu, Gao Yanzhen, Ma Jie, Sun Ninghui, Xing Jing. A Consistent Hash Data Placement Algorithm Based on Stripe[J]. Journal of Computer Research and Development, 2021, 58(4): 888-903. DOI: 10.7544/issn1000-1239.2021.20190732

A Consistent Hash Data Placement Algorithm Based on Stripe

Funds: This work was supported by the National Key Research and Development Program of China (2018YFC0809300), the National Natural Science Foundation of China (61502454), and the Distributed Full Flash Project of ECR Team of Lenovo Research Institute.
  • Published Date: March 31, 2021
  • Distributed storage systems, as the carriers of data storage, are widely used in the big data field. Storage systems widely adopt erasure codes because of their high space efficiency and reliable data storage. In EB-scale erasure-coded distributed storage systems, metadata management is costly, and the efficiency of querying metadata such as location information affects I/O latency and throughput. Centralized data placement algorithms, which rely on recorded location information, require frequent access to metadata servers, and this limits performance optimization. Decentralized data placement algorithms based on hash mapping are therefore increasingly used. However, they run into problems during node changes and data recovery: locations are difficult to change, a large amount of data must be migrated, and the concurrency of data recovery and migration is low. This paper proposes a stripe-based consistent hash data placement algorithm (SCHash). SCHash places data in units of stripes. By transforming the mapping from data blocks to nodes into a mapping from stripes to node groups, it reduces the amount of data migrated when nodes change. As a result, a smaller proportion of data is migrated during recovery, and recovery is faster. On the basis of SCHash, this paper designs and implements a stripe-based recovery strategy with parallel I/O scheduling. The strategy avoids selecting data blocks on the same node within an I/O operation, which increases I/O parallelism. Compared with the APHash algorithm, SCHash reduces the data transferred during data recovery by 46.71% to 85.28%. The recovery rate improves by 48.16% when nodes are rebuilt within the stripe, and by 138.44% when nodes are rebuilt outside the stripe.
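The stripe-to-node-group mapping described in the abstract can be illustrated with a small Python sketch. This is not the paper's implementation: the class name `StripeRing`, the `vnodes` parameter, the SHA-256 based hash, and the fixed, pre-built node groups are all assumptions made for illustration, since the abstract does not say how SCHash constructs node groups or its ring. The sketch only shows the general idea of hashing a whole stripe to one group, so that every block of the stripe is located by a single lookup.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash of a string (assumption: any uniform hash works)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class StripeRing:
    """Consistent-hash ring whose points are node groups, not single nodes.

    Hypothetical structure for illustration only.
    """

    def __init__(self, groups, vnodes=64):
        # groups: group id -> list of node names, one node per block of a stripe.
        self.groups = dict(groups)
        # Each group contributes `vnodes` virtual points to the ring.
        self._points = sorted(
            (_hash(f"{gid}#{v}"), gid)
            for gid in self.groups
            for v in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def group_for(self, stripe_id: str) -> str:
        """Return the group owning the first ring point after the stripe's hash."""
        idx = bisect.bisect_right(self._keys, _hash(stripe_id)) % len(self._keys)
        return self._points[idx][1]

    def place(self, stripe_id: str) -> dict:
        """Map block index -> node for every block of the stripe."""
        nodes = self.groups[self.group_for(stripe_id)]
        return dict(enumerate(nodes))


if __name__ == "__main__":
    ring = StripeRing({
        "g0": ["n0", "n1", "n2"],
        "g1": ["n3", "n4", "n5"],
        "g2": ["n6", "n7", "n8"],
    })
    print(ring.place("stripe-42"))
```

Because the ring is consistent, adding a group only moves stripes onto the new group; stripes owned by the other groups keep their placement, and the blocks of one stripe always land on distinct nodes of its group.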