Multi-Replica Cloud Data Storage Based on Hierarchical Network Coding

Xu Guangwei; Shi Chunhong; Feng Xiangyang; Luo Xin; Shi Xiujin; Han Songhua; Li Wei

doi:10.7544/issn1000-1239.2021.20200340

Multi-Replica Cloud Data Storage Based on Hierarchical Network Coding

Abstract

The rapid development of cloud data storage places high demands on the availability of stored data. At present, availability is mainly ensured by using erasure codes to compute coded blocks of the data and storing these redundant blocks in a distributed manner across cloud storage. Although this coding technique ensures the security of the stored data and reduces the extra storage space required, it incurs large computation and communication overhead when corrupted data are recovered. This paper proposes a multi-replica generation and corrupted-data recovery algorithm based on hierarchical network coding. The algorithm uses hierarchical network coding to improve the coding matrix of the erasure code into a hierarchical coding matrix, and exploits the cascading property of this matrix to generate hierarchical coding (HC) codes that make up the replicas, so that a coding relationship exists among the replicas. When corrupted data are recovered, the data encoding information provided by the data owner and the intact data blocks held in cloud storage are used to compute the corrupted blocks directly, which avoids downloading data remotely from cloud storage. Theoretical analysis and simulation experiments indicate that, for the same storage space, the proposed algorithm significantly reduces the communication overhead of recovering corrupted data and improves the availability of stored data.
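The cascading idea described in the abstract can be illustrated with a small sketch. It is not the paper's HC construction: the prime field GF(257), the Vandermonde matrices, and the helper names `mat_vec`, `mat_inv`, `vandermonde` and `recover_block` are all assumptions chosen for illustration. The sketch only shows how replicas produced by chained coding matrices stay linearly related, so that a corrupted block can be recomputed from another replica plus the coding matrix instead of being downloaded again.

```python
# Illustrative only: a two-level cascade of linear codes over GF(257).
# The field, matrices and function names are assumptions, not the paper's.
P = 257  # small prime modulus so every nonzero element has an inverse


def mat_vec(m, v):
    """Multiply matrix m by vector v modulo P."""
    return [sum(a * b for a, b in zip(row, v)) % P for row in m]


def mat_inv(m):
    """Invert a square matrix modulo P by Gauss-Jordan elimination."""
    n = len(m)
    aug = [[x % P for x in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [x * inv % P for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vandermonde(points, k):
    """k-column Vandermonde matrix; square and invertible for distinct points."""
    return [[pow(x, j, P) for j in range(k)] for x in points]


def recover_block(level_matrix, lower_replica, i):
    """Recompute block i of a replica from the replica one level below it."""
    return sum(a * b for a, b in zip(level_matrix[i], lower_replica)) % P


data = [10, 20, 30]                 # original data blocks (hypothetical)
M1 = vandermonde([1, 2, 3], 3)      # level-1 coding matrix (assumed)
M2 = vandermonde([4, 5, 6], 3)      # level-2 coding matrix (assumed)

replica1 = mat_vec(M1, data)        # replica1 = M1 * data
replica2 = mat_vec(M2, replica1)    # replica2 = M2 * replica1 = (M2*M1) * data

# Case 1: one block of replica2 is corrupted; rebuild it from replica1.
damaged2 = list(replica2)
damaged2[1] = 0
damaged2[1] = recover_block(M2, replica1, 1)

# Case 2: replica1 is lost entirely; rebuild it from replica2.
rebuilt1 = mat_vec(mat_inv(M2), replica2)

# The owner can still decode the original data from the rebuilt replica.
decoded = mat_vec(mat_inv(M1), rebuilt1)
```

In this toy setting the only information needed for repair is a coding matrix and an intact replica, which mirrors the abstract's claim that recovery can be computed from the owner's encoding information and blocks already held in cloud storage. How the paper actually structures its hierarchical coding matrix is not specified here.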
