Citation: Sun Qingxiao, Yang Hailong. Generalized Stencil Auto-Tuning Framework on GPU Platform[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440612
Stencil computations are widely used in scientific applications, and many HPC platforms exploit the high compute capability of GPUs to accelerate them. In recent years, stencils have grown more complex in stencil order, memory access, and computation pattern. To adapt stencil computations to GPU architectures, the research community has proposed a variety of optimization techniques based on streaming and tiling. Because stencil patterns and GPU architectures are both diverse, no single optimization technique fits every stencil instance, so researchers have developed stencil auto-tuning mechanisms that search the parameter space of a given combination of optimization techniques. However, existing mechanisms incur large offline profiling costs and online prediction overhead, and they cannot flexibly handle arbitrary stencil patterns. To address these problems, this paper proposes GeST, a generalized stencil auto-tuning framework that thoroughly optimizes the performance of stencil computations on GPU platforms. Specifically, GeST constructs a global search space in a zero-padding format and quantifies parameter correlations via the coefficient of variation to generate parameter groups. It then iteratively selects parameter values from these groups, adjusting the sampling ratio according to a reward policy and avoiding redundant executions through hash coding. Experimental results show that GeST identifies better-performing parameter settings in a short time compared with other state-of-the-art auto-tuning works.
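The tuning loop sketched in the abstract can be made concrete. The Python fragment below is a minimal illustration of the described mechanisms only, not GeST's actual implementation: the benchmark hook `measure_kernel`, the multiplicative reward update, the probe count, and the median-CV grouping threshold are all hypothetical placeholders.

```python
import hashlib
import random
import statistics

def coefficient_of_variation(samples):
    # CV = stddev / mean; a high CV indicates a parameter whose value
    # strongly affects the measured kernel runtime.
    if len(samples) < 2:
        return 0.0
    mean = statistics.mean(samples)
    return statistics.stdev(samples) / mean if mean else 0.0

def config_hash(config):
    # Hash-code a parameter setting so the same configuration is never
    # benchmarked twice (the "redundant execution" filter).
    key = ",".join(f"{k}={config[k]}" for k in sorted(config))
    return hashlib.sha256(key.encode()).hexdigest()

def group_parameters(param_space, measure_kernel, probes=5):
    # Probe each parameter in isolation and split the parameters into
    # high-impact and low-impact groups at the median CV (assumed heuristic).
    base = {p: vals[0] for p, vals in param_space.items()}
    cvs = {}
    for p, vals in param_space.items():
        times = [measure_kernel({**base, p: v})
                 for v in random.sample(vals, min(probes, len(vals)))]
        cvs[p] = coefficient_of_variation(times)
    median = statistics.median(cvs.values())
    return ([p for p, c in cvs.items() if c >= median],
            [p for p, c in cvs.items() if c < median])

def auto_tune(param_space, measure_kernel, budget=100):
    # param_space: parameter name -> candidate values, zero-padded so every
    # stencil pattern shares one global search space.
    # measure_kernel: callable(config) -> runtime in ms (hypothetical hook).
    weights = {p: [1.0] * len(vals) for p, vals in param_space.items()}
    seen, best_cfg, best_time = set(), None, float("inf")
    for _ in range(budget):
        config = {p: random.choices(vals, weights[p])[0]
                  for p, vals in param_space.items()}
        h = config_hash(config)
        if h in seen:
            continue  # skip configurations that were already executed
        seen.add(h)
        runtime = measure_kernel(config)
        if runtime < best_time:
            best_cfg, best_time = config, runtime
            # Reward policy (assumed multiplicative): raise the sampling
            # ratio of every value in the winning configuration.
            for p, vals in param_space.items():
                weights[p][vals.index(config[p])] *= 1.5
    return best_cfg, best_time
```

In GeST proper, the parameter groups would presumably steer which parameters the search refines first; the two routines are shown independently here for brevity.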