    Citation: Ma Lin, Chen Li, Feng Xiaobing. Tuning Pipeline Granularity Based on Dynamic Profiling Framework[J]. Journal of Computer Research and Development, 2005, 42(6): 1065-1072.

    Tuning Pipeline Granularity Based on Dynamic Profiling Framework

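    • Abstract (translated from the Chinese original): Inter-node pipelining is an important technique for exploiting parallelism when data distribution and computation partitioning are inconsistent. It obtains parallelism by overlapping computation with communication. An accurate pipeline granularity is the key to good pipeline performance. The pipeline block size depends on many factors, such as program size, the program's access pattern, the number of nodes, each node's computing power and memory hierarchy, the performance of the communication system, and the overhead of the communication library. This paper proposes a dynamic profiling approach and applies it to deriving the pipeline granularity: runtime information is collected for a few representative block sizes and combined with a cost model, which takes locality optimization into account, to derive the pipeline granularity. The paper also explores how to reduce the overhead of instrumented execution while preserving the accuracy of the cost model. Experiments show that this approach adapts better and achieves good pipeline parallelism.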

       

      Abstract: Pipelining is a useful parallelization technique for loops with cross-processor data dependences. The pipeline granularity, or block size, is the amount of computation each node performs between communication operations; choosing it well balances computation time against communication time and is the key to good pipeline performance. Loop strip-mining and loop interchange help in finding the optimal pipeline granularity. Many factors determine the optimal granularity, such as the access pattern of the application, program size, the number of computing nodes, the computing power and memory architecture of each node, the performance of the communication network, the communication mode, and the overhead of the runtime library. It is hard to estimate the block computation time with a static scheme, while a purely run-time scheme incurs extra runtime overhead and may forfeit optimizations of the application. An approach is presented and implemented that computes the pipeline granularity from dynamic profiling and a cost model that accounts for the cache-locality effects of loop transformations. How to reduce the profiling run time while preserving the precision of the cost model is also considered. Experimental results show that the pipeline granularity obtained with the dynamic profiling framework adapts well and speeds up the execution of pipelined loops.
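
To make the trade-off behind pipeline granularity concrete, the following is a minimal sketch of a textbook-style pipeline cost model. It is not the cost model from the paper, which is not reproduced on this page and which additionally accounts for cache locality. All names and parameters here (`pipeline_time`, `best_block_size`, `t_iter`, `alpha`, `beta`) are hypothetical; in a profiling-based scheme, `t_iter` would presumably come from timing a few sample block sizes at run time.

```python
import math


def pipeline_time(b, n_iters, n_nodes, t_iter, alpha, beta):
    """Estimate the run time of a pipelined loop for block size b.

    Assumed (illustrative) model: each node computes a block of b
    iterations, then sends boundary data to the next node.
      t_iter : compute time per iteration (e.g. taken from profiling)
      alpha  : fixed start-up cost of one message
      beta   : transfer cost per iteration's worth of boundary data
    """
    n_blocks = math.ceil(n_iters / b)
    stage = b * t_iter + alpha + beta * b  # one compute-and-send step
    # The last node starts after (n_nodes - 1) stages (pipeline fill),
    # then processes n_blocks blocks of its own.
    return (n_nodes - 1 + n_blocks) * stage


def best_block_size(n_iters, n_nodes, t_iter, alpha, beta):
    """Exhaustively search for the block size with the lowest estimate."""
    return min(
        range(1, n_iters + 1),
        key=lambda b: pipeline_time(b, n_iters, n_nodes, t_iter, alpha, beta),
    )


if __name__ == "__main__":
    b = best_block_size(n_iters=1024, n_nodes=8,
                        t_iter=1e-6, alpha=5e-5, beta=1e-8)
    print("estimated best block size:", b)
```

Under this model, small blocks pay the message start-up cost many times, while large blocks lengthen the pipeline fill phase, so the lowest estimate typically falls between the two extremes.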

       
