ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2018, Vol. 55, Issue (6): 1308-1319. doi: 10.7544/issn1000-1239.2018.20170024

• Artificial Intelligence •

MRBP Algorithm Based on Structure Parallelism

任刚1,2,3,邓攀2,杨超2,吴长茂2   

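  1(Department of Computer Science and Technology, Henan Institute of Technology, Xinxiang, Henan 453003); 2(Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190); 3(University of Chinese Academy of Sciences, Beijing 100049) (rengang2013@iscas.ac.cn)
  • Published: 2018-06-01
  • Funding:
    National Natural Science Foundation of China (61100066)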

MapReduce Back Propagation Algorithm Based on Structure Parallelism

Ren Gang1,2,3, Deng Pan2, Yang Chao2, Wu Changmao2   

  1(Department of Computer Science and Technology, Henan Institute of Technology, Xinxiang, Henan 453003); 2(Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190); 3(University of Chinese Academy of Sciences, Beijing 100049)
  • Online: 2018-06-01

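Abstract: Back propagation (BP) is a commonly used learning algorithm for neural networks, and the BP algorithm built on the MapReduce programming model of Hadoop clusters (MapReduce back propagation, MRBP) performs well on big data problems and has therefore been widely adopted. However, because the algorithm lacks fine-grained structure parallelism among neural nodes, its performance falls short when the data dimensionality is high and the network has many nodes. Moreover, users cannot directly control communication between the computing nodes of a Hadoop cluster, so existing structure-parallel strategies designed for cluster systems cannot be applied directly to MRBP. To address this, a structure-parallel MRBP algorithm suited to Hadoop clusters (structure parallelism based MapReduce back propagation, SP-MRBP) is proposed. It divides each layer of the neural network into multiple structures and achieves structure parallelism for MRBP through a layer-wise parallelism, layer-wise ensemble (LPLE) scheme. Analytical expressions for the computation time of SP-MRBP and MRBP are also derived and used to analyze the time difference between the two algorithms and the optimal parallel scale of SP-MRBP. To the authors' knowledge, this is the first time a structure-parallel strategy has been introduced into the MRBP algorithm. Experiments show that, when the neural network is large, SP-MRBP performs better than the original algorithm.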

Key words: MapReduce model, structure parallelism, BP algorithm, multilayer neural networks, MRBP algorithm

Abstract: The back propagation (BP) algorithm is widely used for training multilayer neural networks. The BP algorithm based on Hadoop clusters and the MapReduce parallel programming model (MRBP) performs well on big data problems. However, it lacks fine-grained structure parallelism among neural nodes, so its performance is relatively poor on high-dimensional data and on neural networks with many nodes. Moreover, since users cannot control communication between Hadoop computing nodes, existing cluster-based structure-parallel schemes cannot be applied directly to the MRBP algorithm. This paper proposes a structure parallelism based MRBP algorithm (SP-MRBP), which divides each layer of the neural network into multiple structures and adopts a layer-wise parallelism, layer-wise ensemble (LPLE) strategy to implement structure-parallel computing. We also derive analytical expressions for the computation time of the proposed SP-MRBP algorithm and the classic MRBP algorithm, and from them obtain the time difference between the two algorithms as well as the optimal number of parallel structures for SP-MRBP. To the best of the authors' knowledge, this is the first time a structure parallelism scheme has been introduced into the MRBP algorithm. The experimental results show that, compared with the classic MRBP algorithm, the proposed algorithm processes large neural networks more efficiently.
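The LPLE idea summarized above can be illustrated with a minimal single-machine sketch. It is not the authors' Hadoop implementation: it covers only the forward pass, uses Python threads as a stand-in for MapReduce tasks, and assumes that a "structure" is a contiguous block of a layer's output neurons. All function names, the toy 3-4-2 network, and its weights are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def split_rows(weights, biases, parts):
    """Partition a layer's output neurons into contiguous blocks (assumed 'structures')."""
    n = len(weights)
    size = math.ceil(n / parts)
    return [(weights[i:i + size], biases[i:i + size]) for i in range(0, n, size)]


def forward_block(block, inputs):
    """Compute the activations of one block of neurons (the layer-wise parallel step)."""
    w_rows, b_vals = block
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(w_rows, b_vals)]


def layer_forward(weights, biases, inputs, parts):
    """Evaluate all blocks of one layer concurrently, then concatenate (the ensemble step)."""
    blocks = split_rows(weights, biases, parts)
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        # pool.map returns results in block order, so concatenation is well defined.
        partial = list(pool.map(lambda blk: forward_block(blk, inputs), blocks))
    return [a for part in partial for a in part]


def network_forward(layers, inputs, parts):
    """Process layers one after another; each layer is split and merged independently."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(weights, biases, activations, parts)
    return activations


# Toy 3-4-2 network with hand-picked weights (illustrative values only).
LAYERS = [
    ([[0.1, -0.2, 0.3], [0.4, 0.5, -0.6], [-0.7, 0.8, 0.9], [0.2, 0.2, 0.2]],
     [0.0, 0.1, -0.1, 0.05]),
    ([[0.3, -0.1, 0.2, 0.5], [-0.4, 0.6, 0.1, -0.2]],
     [0.0, 0.2]),
]
X = [1.0, 0.5, -1.0]

serial = network_forward(LAYERS, X, parts=1)  # whole layer as one block
split = network_forward(LAYERS, X, parts=2)   # each layer split into two blocks
```

Because every neuron is evaluated by the same arithmetic regardless of which block it falls in, `serial` and `split` hold the same activations; splitting changes only how the work is distributed. The merge after every layer is the synchronization point, and its cost is what a choice of the number of parallel structures would have to trade off against the per-block speedup.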

Key words: MapReduce model, structure parallelism, back propagation (BP) algorithm, multilayer neural networks, MapReduce back propagation (MRBP) algorithm

CLC number: