ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2015, Vol. 52 ›› Issue (4): 929-942. doi: 10.7544/issn1000-1239.2015.20131911

• Software Technology •

An Efficient Metadata Cluster Management Scheme for Mass Storage Systems

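Xiao Zhongzheng1, Chen Ningjiang1, Wei Jun2, Zhang Wenbo2

  1(School of Computer and Electronic Information, Guangxi University, Nanning 530004); 2(Research and Development Center for Software Engineering Technology, Institute of Software, Chinese Academy of Sciences, Beijing 100190) (zz.xiao.gx@gmail.com)
  • Published: 2015-04-01
  • Funding:
    National Natural Science Foundation of China (61063012, 61363003); Guangxi Natural Science Foundation (2012GXNSFAA053222); Guangxi Universities Excellent Talents Funding Program ([2011]40); Guangxi Scientific Research and Technology Development Program (桂科软13180015, 桂科攻1348020-7); Nanning Scientific Research and Technology Development Program (201109016A)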

A High Performance Management Schema of Metadata Clustering for Large-Scale Data Storage Systems

Xiao Zhongzheng1, Chen Ningjiang1, Wei Jun2, Zhang Wenbo2

  1(School of Computer and Electronic Information, Guangxi University, Nanning 530004); 2(Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190)
  • Online: 2015-04-01

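Abstract (Chinese version, translated): An efficient, decentralized metadata management scheme is critical to the reliability and scalability of large distributed storage systems. Metadata management schemes based on Hash partitioning or on subtree partitioning are very costly to expand and are sensitive to cluster changes. To address these problems, the paper proposes CH-MMS (consistent Hash based metadata management schema), a clustering scheme for metadata servers (MDS) built on a consistent Hash structure. CH-MMS introduces virtual MDS on top of the consistent-Hash MDS cluster to balance the cluster load effectively. It also combines a standby mechanism with a lazy-update policy and applies them to the MDS cluster, which gives fast MDS failure recovery and zero data migration when the cluster changes. The paper describes the architecture of CH-MMS, introduces the core data structure layout-table, the virtual MDS structure, the lazy-update mechanism, and the related algorithms, and analyzes the scalability and fault tolerance of CH-MMS qualitatively. Finally, a prototype system and simulation experiments show that CH-MMS offers balanced metadata distribution, fast failure recovery, flexible scalability, and zero data migration when nodes change, so it can meet the need for flexible, efficient metadata management in large-scale storage clusters whose data volume keeps growing.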

Key words (Chinese version, translated): metadata management, consistent Hash, big data storage, metadata server, distributed file system

Abstract: An efficient, decentralized metadata management schema is vital to the reliability and scalability of large-scale distributed storage systems. Schemas based on Hash partitioning or on subtree partitioning are costly to expand and sensitive to changes in the cluster. To address these problems, CH-MMS (consistent Hash based metadata management schema) is proposed. CH-MMS introduces virtual MDS (metadata servers), which are shown to balance the cluster's load effectively. By combining a standby mechanism with a lazy-update policy, CH-MMS achieves fast failover and zero migration when the cluster changes. Its distributed metadata structure also gives CH-MMS fast metadata lookup. Because Hash-based placement damages the hierarchical semantics of the file system, a simple and flexible mechanism based on regular expression matching is introduced to address this. The paper presents the following work: 1) it describes the architecture of CH-MMS; 2) it introduces the core data structure layout-table, the virtual MDS structure, the lazy-update policy, and their associated algorithms; 3) it analyzes scalability and fault tolerance qualitatively. A prototype system and simulation experiments show that CH-MMS distributes metadata evenly, fails over quickly, expands flexibly, and migrates no data when the cluster changes. CH-MMS can therefore meet the need for flexible, efficient metadata management in large-scale storage systems whose data volume keeps growing.
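
As a rough illustration of the consistent Hash placement with virtual MDS that the abstract describes, the following Python sketch maps file paths to metadata servers on a hash ring. It is not the CH-MMS implementation: the class `HashRing`, the `name#vmdsN` labelling of virtual MDS, the choice of MD5, and the default of 100 virtual MDS per server are all assumptions made for the example, and the sketch covers only plain consistent hashing. The layout-table, the standby mechanism, and the lazy-update policy, which the paper credits for zero migration, are not modelled here.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a ring position (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring; each physical MDS owns several virtual MDS."""

    def __init__(self, virtual_per_mds: int = 100):
        self.virtual_per_mds = virtual_per_mds
        self._points = []  # sorted ring positions of all virtual MDS
        self._owner = {}   # ring position -> physical MDS name

    def add_mds(self, name: str) -> None:
        """Place the virtual MDS of one physical server on the ring."""
        for i in range(self.virtual_per_mds):
            point = ring_hash(f"{name}#vmds{i}")
            if point in self._owner:
                continue  # position already taken; skip the rare collision
            bisect.insort(self._points, point)
            self._owner[point] = name

    def remove_mds(self, name: str) -> None:
        """Drop every virtual MDS that belongs to one physical server."""
        self._points = [p for p in self._points if self._owner[p] != name]
        self._owner = {p: n for p, n in self._owner.items() if n != name}

    def lookup(self, path: str) -> str:
        """Return the MDS owning the first virtual MDS clockwise from the path's hash."""
        if not self._points:
            raise LookupError("the ring has no metadata servers")
        idx = bisect.bisect_right(self._points, ring_hash(path))
        return self._owner[self._points[idx % len(self._points)]]


if __name__ == "__main__":
    ring = HashRing()
    for i in range(4):
        ring.add_mds(f"mds-{i}")
    paths = [f"/data/file{i}" for i in range(1000)]
    before = {p: ring.lookup(p) for p in paths}
    ring.add_mds("mds-4")
    moved = [p for p in paths if ring.lookup(p) != before[p]]
    # Every path that changed owner now belongs to the newly added server.
    print(len(moved), all(ring.lookup(p) == "mds-4" for p in moved))
```

In this plain form, adding a server still reassigns the paths that fall into the new server's ring segments, although no path moves between the existing servers. The zero-migration property claimed in the abstract depends on the additional mechanisms of CH-MMS rather than on the ring alone.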

Key words: metadata management, consistent Hash, large-scale data storage, metadata server (MDS), distributed file system
