A Performance Data Collection Method for Computing Software in Heterogeneous Systems

Gu Beibei; Qiu Jiyan; Wang Ning; Chen Jian; Chi Xuebin

doi:10.7544/issn1000-1239.202440512

Gu Beibei, Qiu Jiyan, Wang Ning, Chen Jian, Chi Xuebin. A Performance Data Collection Method for Computing Software in Heterogeneous Systems[J]. Journal of Computer Research and Development, 2025, 62(9): 2382-2395. DOI: 10.7544/issn1000-1239.202440512

Abstract

Supercomputing has developed rapidly from traditional CPU clusters to heterogeneous platforms. This shift in hardware platforms poses significant challenges for optimizing computing software and evaluating its performance. Current mainstream international performance analysis tools for parallel programs generally have low compatibility with the processors used in domestic heterogeneous supercomputing systems, often require code instrumentation and recompilation, and collect single-node performance data with low accuracy. To address these shortcomings, we propose a floating-point performance data collection method for computing software in heterogeneous systems. We develop and verify a prototype floating-point performance collector on a domestic supercomputing system verification platform. The prototype currently collects performance indicator data effectively on both single and multiple nodes. It is non-invasive: monitoring is attached in a plug-in manner and requires no modification of the monitored program's code, which makes the method highly versatile. Finally, we conduct comparative experiments with three programs, rocHPL, Cannon, and mixbench, and study performance data collection and monitoring for a ResNet (residual network) program used in AI computing. The experiments show that the proposed method has high accuracy, achieves the expected collection effect, and provides a useful reference for program optimization, which verifies its effectiveness.
