    Zhang Dong, Qi Kaiyuan, Wu Nan, Xin Guomao, Liu Zhengwei, Yan Bingheng, Guo Feng. Architecture and Key Technologies of In-Cloud Smart Data Appliance[J]. Journal of Computer Research and Development, 2016, 53(2): 374-389. DOI: 10.7544/issn1000-1239.2016.20148277

    Architecture and Key Technologies of In-Cloud Smart Data Appliance

    • Abstract: To bridge the gap between big data technologies and industry applications, and to meet industry users' requirements for continuous scaling, integration, and diversity in big data processing platforms, this paper proposes scalability, customizability, and multi-type processing models for big data appliances, on which the in-cloud smart data appliance (iSDA) is designed and implemented. First, iSDA combines cooperative computing units, heterogeneous storage, and a high-speed switching network to take full advantage of both scale-out and scale-up architectures. Second, iSDA meets the diverse requirements of industry big data applications through hardware customization, ranging from lightweight to heavy-load configurations, and through a hybrid software stack that covers real-time, interactive, streaming, and batch processing, all accelerated by an in-memory computing engine. Furthermore, to address the HDFS metadata service bottleneck, MapReduce load skew, and the HBase cross-domain issue, the paper introduces the multiple metadata server, load balancing, and cross-datacenter big table technologies used in iSDA. Applications and tests in practical use cases from the telecommunications, finance, and environmental protection industries show that the proposed architecture and key technologies are feasible and effective. Comprehensive comparisons with traditional MPP databases and other mainstream Hadoop distributions are also given to detail the advantages of iSDA in both hardware and software.

       
