ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2016, Vol. 53, Issue (2): 374-389. doi: 10.7544/issn1000-1239.2016.20148277


Architecture and Key Technologies of In-Cloud Smart Data Appliance

Zhang Dong, Qi Kaiyuan, Wu Nan, Xin Guomao, Liu Zhengwei, Yan Bingheng, Guo Feng   

  1. (State Key Laboratory of High-End Server & Storage Technology, Jinan 250101) (Inspur Electronic Information Industry Co., Ltd., Jinan 250101)
  • Online: 2016-02-01

Abstract: To bridge the gap between big data technologies and industry applications, this paper proposes models of scalability, customizability, and multi-type processing for big data appliances, on the basis of which the in-cloud smart data appliance (iSDA) is designed and implemented. First, iSDA is assembled from optimized cooperative computing units, heterogeneous storage, and a high-speed switching network to take full advantage of both scale-out and scale-up architectures. Second, iSDA is designed to satisfy the diverse requirements of industry big data applications through hardware customization ranging from light-weight to heavy-load configurations, together with a hybrid software stack covering real-time, interactive, streaming, and batch processing, all accelerated by an in-memory computing engine. Furthermore, to address the HDFS metadata service bottleneck, MapReduce load skew, and the HBase cross-domain issue, this paper also introduces the multiple-metadata-server, load-balancing, and cross-datacenter big table technologies used in iSDA. Practical use cases in the telecommunication, finance, and environmental protection industries show that the proposed architecture and key technologies are feasible and effective. Comprehensive comparisons with traditional MPP databases and other mainstream Hadoop distributions are also given to detail the advantages of iSDA in both hardware and software.

Key words: big data appliance, scalability, customizability, hybrid software stack, big data industrial application
