ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2016, Vol. 53 ›› Issue (2): 374-389. doi: 10.7544/issn1000-1239.2016.20148277

• System Architecture •



  1. (State Key Laboratory of High-End Server & Storage Technology, Jinan 250101) (Inspur Electronic Information Industry Co., Ltd., Jinan 250101)
  • Published: 2016-02-01

Architecture and Key Technologies of In-Cloud Smart Data Appliance

Zhang Dong, Qi Kaiyuan, Wu Nan, Xin Guomao, Liu Zhengwei, Yan Bingheng, Guo Feng   

  1. (State Key Laboratory of High-End Server & Storage Technology, Jinan 250101) (Inspur Electronic Information Industry Co., Ltd., Jinan 250101)
  • Online: 2016-02-01

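Abstract (Chinese version): To bridge the gap between big data technologies and industry applications, and to meet industry users' current demands for continuous scaling, integration and diversity in big data processing platforms, this paper proposes scalability, customizability and multi-type processing models for big data appliances and, based on them, designs the in-Cloud (云海) big data appliance. The appliance adopts an architecture that accommodates both scale-out and scale-up, and supports multiple types of big data applications through customizable hardware design and a hybrid software architecture. On this basis, to address the HDFS metadata service bottleneck, MapReduce load skew and the HBase cross-domain problem, the paper introduces the multiple metadata service, load balancing and cross-datacenter big table technologies used in the appliance. Deployment and testing in real-world cases from the telecommunications, finance and environmental protection industries show that the architecture and key technologies are feasible and effective.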

Key words (Chinese version): big data appliance, scalability, customizability, hybrid software architecture, industry big data applications

Abstract: To bridge the gap between big data technologies and industry applications, this paper proposes models of scalability, customizability and multi-type processing for big data appliances, on the basis of which the in-cloud smart data appliance (iSDA) is designed and implemented. First, iSDA is built from cooperative computing units, heterogeneous storage and a high-speed switching network, optimized to take full advantage of both scale-out and scale-up architectures. Second, iSDA is designed to satisfy the diverse requirements of industry big data applications through hardware customization ranging from light-weight to heavy-load configurations, as well as a hybrid software stack covering real-time, interactive, streaming and batch processing, all accelerated by an in-memory computing engine. Furthermore, to address the HDFS metadata service bottleneck, MapReduce load skew and the HBase cross-domain issue, this paper also introduces the multiple metadata server, load balancing algorithm and cross-datacenter big table technologies used in iSDA. Practical use cases in the telecommunications, finance and environmental protection industries show that the proposed architecture and key technologies are feasible and effective, and comprehensive comparisons with traditional MPP databases and other mainstream Hadoop distributions detail the advantages of iSDA in both hardware and software.

Key words: big data appliance, scalability, customizability, hybrid software stack, big data industrial application