
Loop Tiling for Optimization of Locality and Parallelism

Liu Song, Wu Weiguo, Zhao Bo, Jiang Qing

Citation: Liu Song, Wu Weiguo, Zhao Bo, Jiang Qing. Loop Tiling for Optimization of Locality and Parallelism[J]. Journal of Computer Research and Development, 2015, 52(5): 1160-1176. DOI: 10.7544/issn1000-1239.2015.20131387. CSTR: 32373.14.issn1000-1239.2015.20131387

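Funding: National Natural Science Foundation of China (91330117); National High Technology Research and Development Program of China (863 Program) (2012AA01A306, 2012AA010901)
  • Chinese Library Classification (CLC) number: TP314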

  • Abstract: Loop tiling is a widely used loop transformation for improving data locality and exposing parallelism on modern computer architectures. Tiling techniques fall into two categories: fixed tiling and parameterized tiling. This paper systematically summarizes both categories and analyzes their advantages and disadvantages. Because the choice of tile size strongly affects the performance of tiled code, the paper also describes and analyzes the various methods for selecting an optimal tile size. In addition, it surveys the techniques that apply loop tiling to multi-level tiling, parallelism exploitation, and imperfectly nested loops. From this analysis of the current state of research, the paper draws the following conclusions: 1) The trade-off between the computational complexity of tiling and the efficiency of the generated code has not been fully resolved, and how to use loop bounds to constrain the iteration space efficiently while improving data locality needs further study. 2) Optimal tile size selection remains a difficult open problem, and it is important to understand how the tile size at each level of a hierarchical memory system affects performance. 3) From the application perspective, how to build a system that automatically generates efficient tiled code for arbitrary sets of nested loops, while taking full advantage of deep shared memory hierarchies and multi-core architectures to achieve a high degree of parallelism, also needs further research.
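As an illustration of the transformation the abstract describes, the following is a minimal sketch of loop tiling applied to a square matrix multiplication. It is not taken from the paper: the function names `matmul_naive` and `matmul_tiled`, the helper `min_sz`, and the single `tile` parameter are assumptions chosen for this example. Because `tile` is passed at run time, the sketch corresponds loosely to the parameterized style of tiling; replacing it with a compile-time constant would give a fixed-size variant.

```c
#include <assert.h>
#include <stddef.h>

/* Untiled reference: C += A * B for n x n row-major matrices. */
void matmul_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

/* Hypothetical helper: clamps a tile's upper bound to the matrix edge. */
static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/*
 * Tiled version (illustrative sketch). The three outer loops walk over
 * tiles; the three inner loops visit the iterations inside one tile, so
 * the same small blocks of A, B and C are reused while they are still
 * likely to be in cache. `tile` is an assumed run-time parameter.
 */
void matmul_tiled(size_t n, size_t tile,
                  const double *A, const double *B, double *C)
{
    assert(tile > 0);
    for (size_t ii = 0; ii < n; ii += tile)
        for (size_t jj = 0; jj < n; jj += tile)
            for (size_t kk = 0; kk < n; kk += tile)
                for (size_t i = ii; i < min_sz(ii + tile, n); i++)
                    for (size_t j = jj; j < min_sz(jj + tile, n); j++) {
                        double sum = C[i * n + j];
                        for (size_t k = kk; k < min_sz(kk + tile, n); k++)
                            sum += A[i * n + k] * B[k * n + j];
                        C[i * n + j] = sum;
                    }
}
```

In this sketch, each (`ii`, `jj`) pair writes a disjoint block of `C`, so those two tile loops are natural candidates for parallel execution, whereas the `kk` loop accumulates into the same block and carries a dependence. Which value of `tile` performs best depends on the cache hierarchy of the target machine, which is the tile size selection problem the abstract refers to.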
Publication history
  • Publication date: 2015-04-30