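• China Excellent Science and Technology Journal
  • CCF-recommended Class A Chinese-language journal
  • T1-class high-quality science and technology journal in the computing field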
Gu Beibei, Qiu Jiyan, Wang Ning, Chen Jian, Chi Xuebin. A Performance Data Collection Method for Computing Software in Heterogeneous Systems[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440512

A Performance Data Collection Method for Computing Software in Heterogeneous Systems

Funds: This work was supported by the National Natural Science Foundation of China (62372428).
  • Author Bio:

    Gu Beibei: born in 1982. PhD candidate, associate professor. Member of CCF. Her main research interests include performance evaluation and analysis of high performance computing software applications.

    Qiu Jiyan: born in 1998. PhD candidate. His main research interests include AI for science and high performance computing.

    Wang Ning: born in 1988. Bachelor's degree. His main research interests include performance analysis and optimization of general-purpose processors.

    Chen Jian: born in 1977. PhD. Vice chairman of CCF, standing committee member of CCF TCHPC, executive committee member of CCF TCAIPR. His main research interests include AI and high performance computing.

    Chi Xuebin: born in 1963. PhD, professor, PhD supervisor. His main research interest is parallel computing.

  • Received Date: June 16, 2023
  • Revised Date: January 14, 2025
  • Accepted Date: January 25, 2025
  • Available Online: January 25, 2025
  • Supercomputing has developed rapidly from traditional CPU clusters to heterogeneous platforms. This shift in hardware platforms poses significant challenges for optimizing computing software and evaluating its performance. Mainstream international performance analysis tools for parallel programs generally have poor compatibility with the processors of domestic heterogeneous supercomputing systems, often require code instrumentation and recompilation, and collect single-node performance data with low accuracy. To address these shortcomings, this article proposes a floating-point performance data collection method for computing software in heterogeneous systems. A floating-point performance collection prototype was developed and verified on a domestic supercomputing system verification platform. The prototype currently collects performance indicator data effectively on both single and multiple nodes, and it is non-invasive to the original program: monitoring works in a plug-in manner, with no need to modify the code of the monitored program, which makes the method highly versatile. Finally, we conducted comparative experiments with three programs, rocHPL, Cannon, and mixbench, and studied performance data collection and monitoring for a ResNet (residual network) program for AI computing. The results show that the proposed collection method has high accuracy, achieves the expected collection effect in the experiments, and offers good reference value for program optimization, which verifies its effectiveness.
