Zhang Jun, Xie Jingcheng, Shen Fanfan, Tan Hai, Wang Lümeng, He Yanxiang. Performance Optimization of Cache Subsystem in General Purpose Graphics Processing Units: A Survey[J]. Journal of Computer Research and Development, 2020, 57(6): 1191-1207. DOI: 10.7544/issn1000-1239.2020.20200113

Performance Optimization of Cache Subsystem in General Purpose Graphics Processing Units: A Survey

Funds: This work was supported by the National Natural Science Foundation of China (61662002, 61972293, 61902189), the Project of Jiangxi Engineering Laboratory on Radioactive Geoscience and Big Data Technology (JELRGBDT201905), and the Natural Science Foundation of Jiangsu Province (BK20180821).
  • Published Date: May 31, 2020
  • With advances in process technology and architecture, the parallel computing performance of GPGPUs (general purpose graphics processing units) has improved substantially, and GPGPUs are increasingly widely used in high-performance and high-throughput computing. A GPGPU achieves high parallel performance because it hides long memory access latency by supporting thousands of concurrent threads. However, irregular computation and memory access patterns in some applications severely degrade the performance of the memory subsystem; in particular, contention for the on-chip cache can become serious, preventing the GPGPU from reaching its peak performance. Alleviating this contention and optimizing on-chip cache performance has become one of the main approaches to GPGPU optimization. Current studies on on-chip cache performance optimization focus on five aspects: TLP (thread level parallelism) throttling, memory access reordering, data flux enhancement, LLC (last level cache) optimization, and new architecture designs based on NVM (non-volatile memory). This paper surveys on-chip cache optimization methods from these five aspects and concludes by discussing promising directions for future research on on-chip cache optimization. It is intended as a reference for research on the GPGPU cache subsystem.
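The cache contention and TLP throttling described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not a method taken from the survey: the cache is modeled as fully associative with LRU replacement, each warp repeatedly sweeps a small private working set, and warps are scheduled round-robin. The function names and all parameter values (`working_set`, `passes`, `capacity`) are hypothetical and chosen only to make the effect visible.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Hit rate of a fully associative LRU cache holding `capacity` lines."""
    cache = OrderedDict()
    hits = 0
    for line in accesses:
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used line
            cache[line] = True
    return hits / len(accesses)


def interleave(warp_ids, working_set, passes):
    """Round-robin schedule: each active warp issues one access per turn
    and sweeps its private working set `passes` times."""
    trace = []
    for step in range(working_set * passes):
        for w in warp_ids:
            trace.append((w, step % working_set))  # (warp, line) = private line
    return trace


def run(num_warps, active_limit, working_set=8, passes=4, capacity=32):
    """Execute all warps in batches of at most `active_limit` concurrent warps
    (a crude stand-in for TLP throttling) and return the cache hit rate."""
    trace = []
    for start in range(0, num_warps, active_limit):
        batch = range(start, min(start + active_limit, num_warps))
        trace.extend(interleave(batch, working_set, passes))
    return lru_hit_rate(trace, capacity)


if __name__ == "__main__":
    print("unthrottled:", run(num_warps=16, active_limit=16))
    print("throttled:  ", run(num_warps=16, active_limit=4))
```

With these assumed parameters, 16 concurrent warps touch 128 distinct lines in a cycle, which thrashes the 32-line cache, whereas limiting execution to 4 concurrent warps makes the combined working set fit. The model ignores memory latency, so it shows only the cache side of the tradeoff: a real throttling scheme must still keep enough warps active to hide latency.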