Abstract:
With the rapid growth of data volumes and the ever-increasing data-transfer-rate requirements of enterprises, the demand for massive storage capacity and high network bandwidth in data centers has become a grand challenge in the networked-storage field. By exploiting the high redundancy present in application-specific datasets, data deduplication can greatly reduce storage capacity requirements, improve the efficiency of network bandwidth utilization, and lower IT costs; it has therefore become a major research focus in recent years. This paper first introduces the concepts, categories, and applications of data deduplication, describes the architecture and basic principle of deduplication storage systems, and contrasts them with traditional storage systems. It then analyzes and summarizes the current state of research on key deduplication techniques, including data partitioning (chunking) methods, I/O optimization techniques, high-reliability data placement strategies, and system scalability. Finally, it summarizes current research on data deduplication and points out directions for future work.
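The basic principle mentioned above can be illustrated with a minimal sketch (not from the paper; function and variable names are illustrative): each data chunk is fingerprinted with a cryptographic hash, only chunks with previously unseen fingerprints are stored, and the original stream is reconstructed from a "recipe" of fingerprints.

```python
import hashlib

def dedup_store(chunks):
    """Store each unique chunk once, keyed by its SHA-256 fingerprint.

    Returns the chunk index (fingerprint -> data) and the recipe
    (ordered list of fingerprints) needed to rebuild the original stream.
    """
    index = {}
    recipe = []
    for chunk in chunks:
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in index:   # a duplicate chunk is stored only once
            index[fp] = chunk
        recipe.append(fp)
    return index, recipe

# Four incoming chunks, two of which are duplicates:
chunks = [b"aaaa", b"bbbb", b"aaaa", b"bbbb"]
index, recipe = dedup_store(chunks)
# Only 2 unique chunks are stored instead of 4.
restored = b"".join(index[fp] for fp in recipe)
```

The storage saving is the gap between the logical size (all chunks) and the physical size (unique chunks only); how chunk boundaries are chosen is exactly the "data partitioning" problem the paper surveys.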