A Consistent Hash Data Placement Algorithm Based on Stripe

魏征; 窦禹; 高艳珍; 马捷; 孙凝晖; 邢晶

doi:10.7544/issn1000-1239.2021.20190732

Abstract: Distributed storage systems are widely used as the data storage layer in big data applications. Compared with replication, erasure coding offers higher space efficiency while still guaranteeing data reliability, so it is increasingly adopted in storage systems. In EB-scale erasure-coded distributed storage systems, metadata management is costly, and the efficiency of querying metadata such as location information affects I/O latency and throughput. Centralized data placement algorithms that record location information must access metadata servers frequently, which limits performance optimization, so decentralized placement algorithms based on Hash mapping are increasingly used. However, decentralized placement algorithms for erasure codes suffer from several problems during node changes and data recovery: location changes are difficult, large amounts of data are migrated, and data recovery and migration run with low concurrency. This paper proposes a consistent Hash data placement algorithm based on stripe (SCHash). SCHash places data in units of stripes. By converting the mapping from data blocks to nodes into a mapping from stripes to node groups, it reduces the amount of data migrated when nodes change, which lowers the proportion of data that must move during recovery and raises the recovery bandwidth. Based on SCHash, this paper also designs a stripe-based concurrent I/O scheduling recovery strategy. The strategy avoids selecting data blocks on the same node for I/O operations, which improves I/O parallelism, and it schedules the execution order of recovery I/O and migration I/O, which reduces the execution time of data recovery. Compared with the APHash data placement algorithm, SCHash reduces data migration during data recovery by 46.71% to 85.28%. Recovery bandwidth improves by 48.16% for reconstruction within the stripe and by 138.44% for reconstruction on nodes outside the stripe.
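
The abstract's central idea, mapping stripes to node groups instead of mapping individual blocks to nodes, can be illustrated with a minimal Python sketch. This is not the authors' SCHash implementation. The class `StripeRing`, the use of MD5, the virtual-node count, and the fixed disjoint node groups of size k+m are all assumptions made only for illustration.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a 64-bit position on the hash ring (MD5 is an assumption)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class StripeRing:
    """Hypothetical consistent-hash ring whose members are node groups."""

    def __init__(self, vnodes: int = 100):
        self.vnodes = vnodes   # virtual points per group (assumed value)
        self.groups = {}       # group id -> ordered list of node names
        self._points = []      # sorted list of (ring position, group id)

    def add_group(self, group_id: str, nodes: list) -> None:
        """Register a node group and place its virtual points on the ring."""
        self.groups[group_id] = list(nodes)
        for v in range(self.vnodes):
            bisect.insort(self._points, (ring_hash(f"{group_id}#{v}"), group_id))

    def locate_stripe(self, stripe_id: str) -> str:
        """Return the group that owns a stripe: first point clockwise from its hash."""
        if not self._points:
            raise ValueError("ring has no node groups")
        idx = bisect.bisect_right(self._points, (ring_hash(stripe_id), ""))
        return self._points[idx % len(self._points)][1]

    def locate_block(self, stripe_id: str, block_index: int) -> str:
        """Block i of a stripe goes to node i of the stripe's group."""
        nodes = self.groups[self.locate_stripe(stripe_id)]
        return nodes[block_index % len(nodes)]


def make_group(group_index: int, width: int) -> list:
    """Build node names for one group; the naming scheme is hypothetical."""
    return [f"node-{group_index}-{i}" for i in range(width)]


if __name__ == "__main__":
    K, M = 4, 2  # assumed erasure-code parameters: 4 data + 2 parity blocks
    ring = StripeRing()
    for g in range(3):
        ring.add_group(f"g{g}", make_group(g, K + M))

    stripes = [f"stripe-{s}" for s in range(1000)]
    before = {s: ring.locate_stripe(s) for s in stripes}

    ring.add_group("g3", make_group(3, K + M))  # scale out by one group
    after = {s: ring.locate_stripe(s) for s in stripes}

    moved = [s for s in stripes if before[s] != after[s]]
    print(f"{len(moved)} of {len(stripes)} stripes moved, all to the new group")
```

In this sketch a stripe is located with one hash lookup, so no per-block location metadata is stored, and every block of a stripe lands on a distinct node of the same group. When a group is added, only the stripes whose ring positions fall to the new group are relocated, and each moves as a whole stripe.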
