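• China's Outstanding Sci-Tech Journal
  • CCF-recommended Class A Chinese journal
  • T1-class high-quality sci-tech journal in the computing field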
Heterogeneous Programming and Optimization of Gyrokinetic Simulation Code on Arithmetic Intensity System[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202330872

Heterogeneous Programming and Optimization of Gyrokinetic Simulation Code on Arithmetic Intensity System

More Information
  • Received Date: October 30, 2023
  • Available Online: April 03, 2025
  • The magnetic confinement fusion particle-in-cell (PIC) gyrokinetic simulation code VirtEx is capable of studying the confinement and transport of alpha particles, the fusion product that is key to realizing fusion energy. Alpha particle simulation relies heavily on the kinetic-ion portion of the code, which has more complex memory access patterns than the electron portion: it contains both irregular accesses and atomic write-back operations, making it a memory-intensive application. The MT-3000, a new heterogeneous accelerator provided by Tianhe's new-generation supercomputing platform, offers powerful computational performance owing to its extremely high computational density. Porting a memory-intensive alpha particle simulation to this device is therefore a great challenge. To fully exploit the computational power of the MT-3000's acceleration array, we draw on the application's characteristics to design and implement several optimization methods, including recalculation of intermediate variables, a customized software cache, memory locality optimization, and hotspot function merging, all aimed at reducing the total amount of memory access in the program. A medium-scale benchmark with gyrokinetic ions shows an overall speedup of 4.2 times, with speedups of 10.9, 13.3, and 16.2 times on the hotspot functions "push", "locate", and "charge", respectively. It also shows good scalability, with 88.4% efficiency on 5,898,240 accelerator cores across 3,840 nodes.
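
The abstract identifies irregular accesses and atomic write-back as the reason the kinetic-ion code is memory-bound, and lists memory locality optimization among the remedies. The following is a minimal NumPy sketch of that general pattern in a 1D charge deposition with linear weighting. It is not taken from VirtEx or the MT-3000 toolchain; the function names `deposit_scatter` and `deposit_sorted`, the 1D grid, and the weighting scheme are all assumptions chosen for illustration.

```python
import numpy as np

def deposit_scatter(x, q, n_cells, dx):
    """Linear-weighting charge deposition with unordered scatter-adds."""
    cell = np.floor(x / dx).astype(np.int64)   # "locate": left grid point of each particle
    w_right = x / dx - cell                    # fractional distance from the left grid point
    rho = np.zeros(n_cells + 1)
    # np.add.at accumulates correctly when several particles hit the same index,
    # which is the role atomic write-back plays on a parallel accelerator.
    np.add.at(rho, cell, q * (1.0 - w_right))
    np.add.at(rho, cell + 1, q * w_right)
    return rho

def deposit_sorted(x, q, n_cells, dx):
    """Same deposition after sorting particles by cell (illustrative locality fix)."""
    cell = np.floor(x / dx).astype(np.int64)
    order = np.argsort(cell, kind="stable")    # neighbours in memory now share grid points
    x, q, cell = x[order], q[order], cell[order]
    w_right = x / dx - cell
    # With particles grouped by cell, the accumulation becomes a segmented reduction.
    rho = np.bincount(cell, weights=q * (1.0 - w_right), minlength=n_cells + 1)
    rho += np.bincount(cell + 1, weights=q * w_right, minlength=n_cells + 1)
    return rho

# Hypothetical test problem: 1000 unit charges on an 8-cell grid.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 8.0, 1000)
q = np.ones(1000)
rho_scatter = deposit_scatter(x, q, n_cells=8, dx=1.0)
rho_sorted = deposit_sorted(x, q, n_cells=8, dx=1.0)
```

Both functions produce the same grid charge up to floating-point rounding. The sorted variant only illustrates why grouping particles by cell turns scattered updates into contiguous ones; the paper's actual optimizations (software cache, recalculation, function merging) are not reproduced here.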