Zhang Xiao, Zhi Tian. Machine Learning Inference Framework on Multi-Core Processor[J]. Journal of Computer Research and Development, 2019, 56(9): 1977-1987. DOI: 10.7544/issn1000-1239.2019.20180786

Machine Learning Inference Framework on Multi-Core Processor

Funds: This work was supported by the National Key Research and Development Program of China (2017YFA0700900, 2017YFA0700902, 2017YFA0700901, 2017YFB1003101), the National Natural Science Foundation of China (61472396, 61432016, 61473275, 61522211, 61532016, 61521092, 61502446, 61672491, 61602441, 61602446, 61732002, 61702478, 61732020), the Beijing Natural Science Foundation (JQ18013), the National Basic Research Program of China (973 Program) (2015CB358800), the National Science and Technology Major Projects of Hegaoji (2018ZX01031102), the Transformation and Transfer of Scientific and Technological Achievements of Chinese Academy of Sciences (KFJ-HGZX-013), and the Strategic Priority Research Program of Chinese Academy of Sciences (XDB32050200).
Published Date: August 31, 2019
Abstract: In recent years, deep neural networks have been widely applied in many domains and achieved great success. Because the size and computational workload of neural network models are growing rapidly, GPUs and many newly designed domain-specific accelerators have been adopted to compute neural networks as fast as possible. However, the traditional general-purpose processor should not be ignored: it is ubiquitous and easy to obtain, so exploring efficient ways to use it for deep learning is worthwhile. In the training phase, the multi-core architecture is well suited to data parallelism, which increases system throughput. In the inference phase, however, end-to-end latency matters far more than throughput, and traditional data parallelism cannot satisfy the requirements of small batch size and low latency. To utilize the hardware resources of a multi-core architecture, the computation task must be split into smaller parts that can execute on the cores in parallel, and a careful strategy is needed to ensure that the splitting plan does not degrade computing efficiency on each core. In this paper, we propose a parallel inference framework for multi-core general-purpose processors. It divides each operation in a neural network into smaller ones and executes them on multiple cores in parallel. By providing a few necessary assistant operations, the framework can be easily ported to future multi-core processors. The framework also automatically generates an effective splitting plan for a given neural network, taking both the network architecture and the underlying hardware into account. Experimental results show that the framework produces efficient splitting plans that substantially reduce the end-to-end latency of inference tasks on a multi-core processor.
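The splitting idea described in the abstract can be made concrete with a short sketch. The Python code below is a minimal illustration, not the paper's actual framework or API: it splits a fully connected layer's matrix multiplication along the output axis, runs the pieces on a thread pool (one piece per core; NumPy's BLAS-backed matmul releases the GIL, so the threads run in parallel), and chooses the split degree with a toy cost model. All function names and cost constants here are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of intra-operator splitting plus
# a toy splitting-plan chooser. Constants are made up and in arbitrary
# units; the paper's planner uses real network and hardware parameters.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

NUM_CORES = 4  # assumed core count; use os.cpu_count() on a real machine


def estimated_latency(work, parts, startup=1e-4, merge=5e-5):
    """Toy cost model: per-core work shrinks as the operation is split
    into more parts, but each part adds startup and merge overhead."""
    return work / parts + (startup + merge) * parts


def pick_split_degree(work, num_cores=NUM_CORES):
    """Pick the split degree in 1..num_cores minimizing the toy model."""
    return min(range(1, num_cores + 1),
               key=lambda p: estimated_latency(work, p))


def split_fc_forward(x, weight):
    """Compute x @ weight by splitting weight along the output axis,
    evaluating each chunk on its own core, and concatenating the
    partial outputs (the cross-core 'merge' assistant operation)."""
    parts = pick_split_degree(work=x.shape[0] * x.shape[1] * weight.shape[1])
    chunks = np.array_split(weight, parts, axis=1)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(lambda w: x @ w, chunks))
    return np.concatenate(partials, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((1, 1024))    # batch size 1: the latency-bound case
    w = rng.standard_normal((1024, 4096))
    out = split_fc_forward(x, w)
    assert np.allclose(out, x @ w)        # splitting preserves the result
```

Splitting along the output axis keeps each core's partial result independent, so the only cross-core cost is the final concatenation; weighing such merge overhead against per-core efficiency is exactly the trade-off the paper's automatic planner must resolve, using real hardware characteristics rather than this sketch's fixed constants.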