高级检索

    一种基于重复数据删除技术的云中云存储系统

    A Data Deduplication-Based Primary Storage System in Cloud-of-Clouds

    • 摘要: 随着云存储技术的快速发展和应用,越来越多的企业和用户都开始将数据从本地转移到云存储服务提供商进行存储.但是,在享受云存储高质量服务的同时,将数据仅仅存储于单个云存储服务商中会带来一定的风险,例如云存储服务提供商的垄断、数据可用性和安全性等问题.为了解决这个问题,提出了一种基于重复数据删除技术的云中云存储系统架构,首先消除云存储系统中的冗余数据量,然后基于重复数据删除集中的数据块引用率将数据块以复制和纠删码2种数据布局方式存储在多个云存储服务提供商中.基于复制的数据布局方式易于实现部署,但是存储开销大;基于纠删码的数据布局方式存储开销小,但是需要编码和解码,计算开销大.为了充分挖掘复制和纠删码数据布局的优点并结合重复数据删除技术中数据引用的特点,新方法用复制方式存储高引用数据块,用纠删码方式存储其他数据块,从而使系统整体性能和成本达到较优.通过原型系统的实现和测试验证了相比现有云中云存储策略,新方法在性能和成本上都有大幅度提高.

       

      Abstract: With the rapid development of cloud storage technology, more and more companies are beginning to upload data to the cloud storage platform. However, solely depending on the particular cloud storage provider has a number of potentially serious problems, such as vendor lock-in, availability, and security issues. To address the problems, we propose a deduplication-based primary storage system in cloud-of-clouds in this paper by eliminating the redundant data block in the cloud computing environment and distributing the data among multiple independent cloud storage providers. The data is stored in multiple cloud storage providers by combining the replication and erasure code schemes. The replication way is easy to implement and deploy but has high storage overhead. The storage overhead of erasure code is small, but it requires computational overhead for encode and decode operations. To better utilize the advantages of both replication and erasure code schemes and to exploit the reference characteristics in data deduplication, the high referenced data blocks are stored with replication scheme and the other data blocks are stored with erasure code scheme. The experiments conducted on our lightweight prototype implementation of new system show that the deduplication-based primary storage system in cloud-of-clouds improves the performance and cost efficiency significantly than the existing schemes.

       

    /

    返回文章
    返回