    Ye Guixin, Zhang Yuxiang, Zhang Cheng, Zhao Jiaqi, Wang Huanting. Automatic Optimization Heuristics Method for OpenCL Program Based on Graph Neural Network[J]. Journal of Computer Research and Development, 2023, 60(5): 1121-1135. DOI: 10.7544/issn1000-1239.202110943

    Automatic Optimization Heuristics Method for OpenCL Program Based on Graph Neural Network

    • The past decade has witnessed the rapid development of heterogeneous computer architectures, driven by the popularization of the Internet of Things. As the first cross-platform heterogeneous parallel computing framework, OpenCL (Open Computing Language) offers standardization and portability. However, OpenCL falls short on performance portability because of the complexity and diversity of software and hardware platforms. Prior methods address this problem by using deep learning to build optimization models, but their optimization gains are limited because they capture only the sequential dependencies of a program and ignore its syntactic and semantic information. To this end, we propose ACCEPT, an automatic optimization heuristics method for OpenCL programs built on a multi-relational graph neural network. Unlike existing methods, ACCEPT first extracts the deep structural and semantic features of an OpenCL program by constructing a multi-relational code graph, then applies an improved graph neural network to extract a high-dimensional feature representation of that graph. Finally, a decision neural network produces the optimization parameters. We evaluate ACCEPT on two tasks: heterogeneous device mapping and thread coarsening factor prediction. The experimental results show that ACCEPT outperforms state-of-the-art (SOTA) methods. On the heterogeneous device mapping task, accuracy reaches 88.7% and speedup improves by 7.6% over the SOTA methods. On the thread coarsening task, speedup is 5.2% higher than that of the SOTA methods.
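
    The three stages described above (multi-relational code graph, graph neural network, decision network) can be illustrated with a minimal sketch. This is not the ACCEPT implementation: the relation names, layer sizes, mean-pool readout, and random untrained weights are all assumptions made for illustration, and a generic relational message-passing step stands in for the paper's improved graph neural network.

```python
import math
import random

# Hypothetical relation types; the abstract does not list the paper's actual relations.
RELATIONS = ("ast", "control_flow", "data_flow")


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(dim_in, dim_hidden, n_classes, seed=0):
    """Random, untrained weights: one matrix per relation, a self-loop matrix, and a head."""
    rng = random.Random(seed)
    return {
        "rel": {r: random_matrix(dim_hidden, dim_in, rng) for r in RELATIONS},
        "self": random_matrix(dim_hidden, dim_in, rng),
        "head": random_matrix(n_classes, dim_hidden, rng),
    }


def relational_layer(node_feats, edges, rel_weights, self_weight):
    """One message-passing step over typed edges (src, dst, relation)."""
    out = [matvec(self_weight, h) for h in node_feats]
    for src, dst, rel in edges:
        msg = matvec(rel_weights[rel], node_feats[src])  # relation-specific transform
        out[dst] = [a + b for a, b in zip(out[dst], msg)]
    return [[max(0.0, v) for v in h] for h in out]  # ReLU


def readout(node_feats):
    """Mean-pool node vectors into a single graph-level vector."""
    n = len(node_feats)
    return [sum(col) / n for col in zip(*node_feats)]


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


def predict(node_feats, edges, params):
    """Graph -> node embeddings -> graph embedding -> class probabilities."""
    hidden = relational_layer(node_feats, edges, params["rel"], params["self"])
    return softmax(matvec(params["head"], readout(hidden)))


# Toy code graph: three nodes with 2-dimensional features and one edge per relation.
node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1, "ast"), (1, 2, "control_flow"), (0, 2, "data_flow")]
params = init_params(dim_in=2, dim_hidden=4, n_classes=2)
probs = predict(node_feats, edges, params)  # e.g. two classes for a CPU/GPU mapping decision
```

    Because the weights are untrained, the resulting probabilities carry no meaning. The sketch only shows how typed edges let each relation contribute its own transform before the decision layer maps the pooled graph vector to an optimization choice.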