Locality Analysis and Optimization for Stream Programs Based on Iteration Sequence
Abstract
The stream programming model has been widely studied in recent years and has proved to be a good fit for stream architectures built on software-managed streaming memory, such as a stream register file. However, some studies point out that the stream programming model also benefits architectures based on hardware-managed coherent caches. In addition, a general-purpose data cache has recently been integrated into GPGPUs, the most important platform on which the stream programming model is used. Exploiting cache locality is therefore critical to improving the performance of stream programs on these architectures. Because of their particular execution model, the way the reuse carried by stream programs is transformed into locality differs from that of serial programs, so traditional locality analysis methods cannot be applied to stream programs directly. Based on a detailed analysis of the transformation from reuse to locality, we propose the concept of “iteration sequence” to capture the difference between the execution models of stream and serial programs. We then extend the traditional locality analysis method and propose a general locality analysis method based on the iteration sequence. Further, we propose two locality optimization techniques derived from the locality analysis model. Experimental results obtained on the GPGPUSim simulator show that our quantitative analysis of the locality of stream programs is effective and that the locality optimization techniques clearly improve both locality and performance.