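• China Top-Quality Science and Technology Journal
  • CCF-Recommended Class A Chinese Journal
  • T1-Class High-Quality Science and Technology Journal in the Computing Field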
Citation: Xie Zhen, Tan Guangming, Sun Ninghui. Research on Optimal Performance of Sparse Matrix-Vector Multiplication and Convolution Using the Probability-Process-Ram Model[J]. Journal of Computer Research and Development, 2021, 58(3): 445-457. DOI: 10.7544/issn1000-1239.2021.20180601

Research on Optimal Performance of Sparse Matrix-Vector Multiplication and Convolution Using the Probability-Process-Ram Model

Funds: This work was supported by the National Key Research and Development Program of China (2018YFB0204400), the Strategic Priority Research Program of Chinese Academy of Sciences (C)(XDC05010100), and the National Natural Science Foundation of China (62032023, 61972377, 61702483).
  • Published Date: February 28, 2021
  • Performance models offer a way to predict performance and to guide optimization. Despite extensive prior research, it remains difficult to pinpoint the bottlenecks of different memory access patterns and to reach high performance for both regular and irregular programs across hardware configurations. In this work, we propose a novel model, probability-process-ram (PPR), that quantifies compute time and data transfer time on general-purpose multicore processors. The PPR model predicts the number of instructions executed on a single core and, through a newly designed cache simulator, the probability of memory accesses between levels of the memory hierarchy. Using the best optimization method and the expectation that the model extracts automatically, we apply the PPR model to analyze and optimize sparse matrix-vector multiplication (SpMV) and 1D convolution as case studies of typical irregular and regular computational kernels, respectively. We obtain the best block sizes for sparse matrices with various sparsity structures, as well as optimization guidance for 1D convolution under different instruction set support and data sizes. Compared with the Roofline model and the ECM model, the proposed PPR model greatly improves prediction accuracy thanks to the newly designed cache simulator, and it provides comprehensive feedback.
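
The idea of weighting data transfer cost by the probability that an access is served at each memory level can be illustrated with a small sketch. This is not the PPR model's actual formulation, which the abstract does not give. The function names, the three-level hierarchy, the latency values, and the assumption that compute and data transfer time simply add (no overlap) are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def expected_access_cycles(serve_prob: Sequence[float],
                           latency: Sequence[float]) -> float:
    """Expected cycles per memory access.

    serve_prob[i] is the (assumed) probability that an access is served by
    memory level i; latency[i] is the (assumed) cost of that level in cycles.
    """
    if len(serve_prob) != len(latency):
        raise ValueError("serve_prob and latency must have the same length")
    if abs(sum(serve_prob) - 1.0) > 1e-9:
        raise ValueError("serve_prob must sum to 1")
    return sum(p * c for p, c in zip(serve_prob, latency))


def estimated_cycles(n_instr: int, cycles_per_instr: float, n_access: int,
                     serve_prob: Sequence[float],
                     latency: Sequence[float]) -> float:
    """Toy estimate: compute cycles plus probability-weighted data cycles.

    Assumption: compute and data transfer do not overlap, so the two terms
    are added. A real model may treat overlap differently.
    """
    compute = n_instr * cycles_per_instr
    data = n_access * expected_access_cycles(serve_prob, latency)
    return compute + data


# Hypothetical hierarchy: L1, L2, DRAM with made-up latencies in cycles.
probs = [0.5, 0.25, 0.25]
lats = [4.0, 12.0, 200.0]
per_access = expected_access_cycles(probs, lats)
total = estimated_cycles(1000, 1.0, 100, probs, lats)
```

In a sketch like this, the per-level probabilities would come from a cache simulation of the kernel's access pattern, so an irregular kernel such as SpMV and a regular one such as 1D convolution would differ mainly in `serve_prob`.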