Index of Meta-Data Set of the Similar Files for Inline De-Duplication in Distributed Storage Systems

Sun Jing, Yu Hongliang, and Zheng Weimin

Sun Jing, Yu Hongliang, and Zheng Weimin. Index of Meta-Data Set of the Similar Files for Inline De-Duplication in Distributed Storage Systems[J]. Journal of Computer Research and Development, 2013, 50(1): 197-205.

Abstract

Distributed storage systems are widely adopted in cloud storage and enterprise storage infrastructure because of their high scalability and cost-effectiveness. In such systems, data de-duplication can save much of the devices' storage space and improve the efficiency of data transmission. The key to de-duplication in distributed storage systems is a high-performance, scalable meta-data index that does not hurt write throughput. This paper proposes an index of the meta-data sets of similar files. The index uses a locality-sensitive hashing function to organize the meta-data sets, and it needs only one disk access to look up all the chunks of a file. Consequently, the index improves indexing performance while offering high scalability and a small memory footprint, which makes it suitable for cloud and enterprise storage.
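
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes fixed-size chunking, SHA-1 chunk fingerprints, and the minimum fingerprint of a file as the locality-sensitive key (a min-hash, chosen here only as one well-known locality-sensitive scheme). The names SimilarFileIndex, write_file, and CHUNK_SIZE are hypothetical, and a Python dict stands in for the on-disk meta-data sets.

```python
import hashlib

CHUNK_SIZE = 4096  # assumption: fixed-size chunks; the paper may chunk differently


def chunk_fingerprints(data: bytes) -> list[bytes]:
    """Split data into fixed-size chunks and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + CHUNK_SIZE]).digest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]


class SimilarFileIndex:
    """Toy index: one meta-data set (a set of chunk fingerprints) per key."""

    def __init__(self) -> None:
        # Stand-in for on-disk storage: key -> set of chunk fingerprints.
        self.disk: dict[bytes, set[bytes]] = {}
        self.disk_reads = 0  # counts simulated disk accesses

    def _load(self, key: bytes) -> set[bytes]:
        """Simulate fetching one meta-data set from disk."""
        self.disk_reads += 1
        return self.disk.get(key, set())

    def write_file(self, data: bytes) -> int:
        """De-duplicate one file; return the number of previously unseen chunks."""
        fingerprints = chunk_fingerprints(data)
        if not fingerprints:
            return 0
        # Min-hash key: files sharing many chunks are likely to share
        # the same minimum fingerprint, so they map to the same set.
        key = min(fingerprints)
        known = self._load(key)  # the only lookup made for this file
        new_chunks = set(fingerprints) - known
        self.disk[key] = known | new_chunks
        return len(new_chunks)
```

In this sketch every chunk of a file is checked against a single fetched meta-data set, so each call to write_file performs exactly one simulated disk read. Writing the same data twice reports zero new chunks on the second write. A similar file benefits only when its minimum fingerprint matches that of an earlier file, which is the probabilistic trade-off of this kind of key.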
