A Fast Construction Method of the Erasure Code with Small Cross-Cloud Data Center Repair Traffic

Bao Han; Wang Yijie

doi:10.7544/issn1000-1239.202220580

A Fast Construction Method of the Erasure Code with Small Cross-Cloud Data Center Repair Traffic

Bao Han,
Wang Yijie

Abstract

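Abstract: In recent years, cloud data centers have failed frequently, so major organizations have adopted cross-cloud data center replication for disaster-tolerant data storage. Compared with cross-cloud data center replication, cross-cloud data center erasure coding offers higher reliability and lower redundancy. However, existing cross-cloud data center erasure coding techniques cannot simultaneously deliver low cross-cloud data center repair traffic, high adaptability to encoding parameters, and high erasure code construction efficiency, and therefore have not yet been widely adopted in production systems. This paper proposes a fast construction method of the erasure code with small cross-cloud data center repair traffic (FMEL), which can quickly construct erasure codes with low cross-cloud data center repair traffic under different encoding parameters. Specifically, FMEL first converts erasure code repair group distribution schemes and the user-specified encoding parameters into fixed-length feature vectors, and uses a support vector machine to classify each feature vector quickly, thereby checking whether the corresponding repair group distribution scheme matches the encoding parameters; a feature vector in the positive class indicates that its repair group distribution scheme matches the encoding parameters. FMEL then uses a parallel search algorithm to select, from all repair group distribution schemes that pass the check, one with relatively small average cross-cloud data center repair traffic, and uses a trial-and-error algorithm to convert it into the generator matrix of an erasure code with low cross-cloud data center repair traffic. Experiments in a cross-cloud data center environment show that, compared with existing work that can construct, under different encoding parameters, optimal codes reaching the lower bound on average cross-cloud data center repair traffic, FMEL shortens erasure code construction time by 89%, and under most encoding parameters the codes constructed by the two approaches have the same cross-cloud data center repair traffic. In addition, compared with several other commonly used erasure codes, the erasure codes constructed by FMEL reduce cross-cloud data center repair traffic by 42.9% to 56.0%.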

Abstract: Compared with cross-cloud data center replication, cross-cloud data center erasure coding is more reliable and more space-efficient. However, existing cross-cloud data center erasure codes cannot simultaneously achieve low cross-cloud data center repair traffic, high adaptability to encoding parameters, and high erasure code construction efficiency, so they are rarely used in production. We propose a fast construction method of the erasure code with small cross-cloud data center repair traffic, called FMEL, which can quickly construct erasure codes with small cross-cloud data center repair traffic under different encoding parameters. Specifically, FMEL converts erasure code repair group distribution schemes and the corresponding encoding parameters into fixed-length feature vectors, and verifies whether each repair group distribution scheme matches the encoding parameters by classifying the corresponding feature vector with a support vector machine; a feature vector classified as positive indicates that the corresponding repair group distribution scheme passes the verification. Then, FMEL uses a parallel search algorithm to pick the repair group distribution scheme with the smallest cross-cloud data center repair traffic from all distribution schemes that pass the verification, and converts it into the generator matrix of an erasure code with small cross-cloud data center repair traffic. Experiments in a cross-cloud data center environment show that, under most encoding parameters, FMEL constructs the optimal code that achieves the lower bound of cross-cloud data center repair traffic. Meanwhile, FMEL's erasure code construction time is 89% less than the optimal code construction time of the existing work. Compared with several popular erasure codes, the erasure code constructed by FMEL reduces cross-cloud data center repair traffic by 42.9% to 56.0%.
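The verify-then-select step described in the abstract can be illustrated with a minimal Python sketch. It is not FMEL itself, and everything specific in it is an assumption made for illustration: the scheme representation (each repair group as a list of data center indices), the toy cost model (repairing a block reads every other block in its group, and each read from a different data center costs one unit), and the hypothetical predicate `spans_two_data_centers`, which merely stands in for the support vector machine check. A thread pool stands in for the parallel search; a CPU-bound search would more likely use processes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

# Assumed representation: a scheme is a list of repair groups, and each group
# lists the data center index that hosts each of its blocks.
Scheme = List[List[int]]


def average_cross_dc_traffic(scheme: Scheme) -> float:
    """Toy cost model (an assumption, not the paper's): repairing a block reads
    every other block in its repair group, and each block read from a different
    data center costs one unit of cross-data-center traffic."""
    total = 0
    blocks = 0
    for group in scheme:
        for dc in group:
            total += sum(1 for other in group if other != dc)
            blocks += 1
    return total / blocks


def spans_two_data_centers(scheme: Scheme) -> bool:
    """Hypothetical stand-in for the SVM check: accept a scheme only if every
    repair group is spread over at least two data centers."""
    return all(len(set(group)) >= 2 for group in scheme)


def pick_scheme(
    candidates: Sequence[Scheme],
    passes_check: Callable[[Scheme], bool],
    workers: int = 4,
) -> Tuple[Scheme, float]:
    """Filter candidates with the check, score the survivors in parallel,
    and return the cheapest scheme together with its average traffic."""
    verified = [s for s in candidates if passes_check(s)]
    if not verified:
        raise ValueError("no candidate scheme passed verification")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        costs = list(pool.map(average_cross_dc_traffic, verified))
    best = min(range(len(verified)), key=costs.__getitem__)
    return verified[best], costs[best]


if __name__ == "__main__":
    candidates = [
        [[0, 1, 2], [0, 1, 2]],  # every group spread over three data centers
        [[0, 0, 1], [1, 2, 2]],  # groups partly co-located
        [[0, 0, 0], [1, 1, 1]],  # fully co-located, rejected by the check
    ]
    scheme, cost = pick_scheme(candidates, spans_two_data_centers)
    print(scheme, round(cost, 3))
```

In this toy setting the fully co-located scheme has zero cross-data-center traffic but is filtered out before scoring, which mirrors the role the classification step plays in the abstract: only schemes that pass the check compete on repair traffic.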
