    Li Maowen, Qu Guoyuan, Wei Dazhou, Jia Haipeng. Performance Optimization of Neural Network Convolution Based on GPU Platform[J]. Journal of Computer Research and Development, 2022, 59(6): 1181-1191. DOI: 10.7544/issn1000-1239.20200985

    Performance Optimization of Neural Network Convolution Based on GPU Platform

    • Image detection and recognition are being applied in a growing number of production and everyday scenarios. Convolutional neural network (CNN) methods are widely used because of their high accuracy. However, CNNs have many weight parameters and high computational requirements, while edge computing devices are diverse and have limited computing power. Because high-performance code must run across these platforms, GPU-based optimization of CNNs is increasingly important. In view of the shortcomings of other GEMM methods at the convolution scales that occur in CNNs, we present a GEMM optimization method tailored to CNN sizes, based on block size, branch execution, memory access, and calculation scale. The method can also be applied to the Winograd algorithm and to operator combination to further optimize convolution. In addition, the best-performing convolution operator is selected by traversal-based self-tuning, and offline compilation, a memory pool, 16-bit quantization, network scale clipping, and related techniques are combined to improve CNN performance. Finally, experiments on the AMD V1605B platform verify the effectiveness of the algorithm. Comparisons with other GEMM algorithms and deep learning networks show that the method achieves better acceleration than GEMM and Winograd algorithms and effectively accelerates convolutional neural networks.
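
    As a rough illustration of what GEMM-based convolution with a tunable block size means, the sketch below lowers a single-channel 2-D convolution to a matrix product and computes that product tile by tile. It is not the authors' GPU implementation: the function names (`im2col`, `blocked_gemm`, `conv2d_gemm`, `conv2d_direct`) and the `block` parameter are hypothetical, the code runs on the CPU in plain Python, and "convolution" here means the unflipped cross-correlation commonly used in neural networks.

```python
def im2col(image, kh, kw):
    """Unfold a 2-D image (list of rows) into a matrix with one row per output pixel."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = []
    for i in range(out_h):
        for j in range(out_w):
            # Flatten the kh x kw window whose top-left corner is (i, j).
            cols.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return cols, out_h, out_w


def blocked_gemm(a, b, block=2):
    """Compute the matrix product a @ b one tile at a time.

    `block` is a hypothetical tile size; on a GPU the analogous choice
    would be tuned to the hardware rather than fixed.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, block):
        for j0 in range(0, n, block):
            for p0 in range(0, k, block):
                # Accumulate the contribution of one tile pair into c.
                for i in range(i0, min(i0 + block, m)):
                    for j in range(j0, min(j0 + block, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + block, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


def conv2d_gemm(image, kernel, block=2):
    """Valid-mode 2-D cross-correlation expressed as im2col followed by GEMM."""
    kh, kw = len(kernel), len(kernel[0])
    cols, out_h, out_w = im2col(image, kh, kw)
    weights = [[v] for row in kernel for v in row]  # (kh*kw) x 1 column
    flat = blocked_gemm(cols, weights, block)
    return [[flat[i * out_w + j][0] for j in range(out_w)]
            for i in range(out_h)]


def conv2d_direct(image, kernel):
    """Reference sliding-window implementation used to check the GEMM path."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]
```

    Changing `block` does not change the result, only the order in which partial sums are accumulated. That is why block size can be treated as a tuning parameter and chosen by trying candidates, in the spirit of the traversal-based self-tuning mentioned in the abstract.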