    Citation: Zhang Zhiyuan, Zhou Yufeng, Liu Li, Yang Guangwen. Performance Characterization and Efficient Parallelization of MASNUM Wave Model[J]. Journal of Computer Research and Development, 2015, 52(4): 851-860. DOI: 10.7544/issn1000-1239.2015.20131415

    Performance Characterization and Efficient Parallelization of MASNUM Wave Model

    • Abstract: MASNUM (marine science and numerical modeling) is a numerical wave model developed independently in China. It is widely used in wave forecasting for marine disaster prevention and mitigation, maritime transportation, and support of military activities. As demands for more accurate operational forecasts and for climate research grow, high resolution has become the inevitable direction of wave model development. Although the rapid development of high-performance computers provides substantial computing power for high-resolution numerical models, many parallel numerical models remain inefficient: they cannot reach higher parallel speedup, and so cannot improve parallel efficiency or shorten wall-clock run time. In this paper, we first characterize the performance of the MASNUM model in light of the architecture of modern high-performance computers and identify its performance bottlenecks. We then apply targeted parallel optimizations that markedly improve communication performance, I/O performance, and the load balance of the two-dimensional decomposition, which in turn improve the overall parallel efficiency and scalability of the MASNUM model. With serial performance as the baseline, the improved version achieves a speedup of 431.5 on 960 CPU cores. This work also offers parallel optimization strategies that other numerical models can draw on.
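    To put the reported scaling figure in context, the short sketch below converts a speedup and a core count into parallel efficiency, using the standard definition of efficiency as speedup divided by the number of cores. The function name `parallel_efficiency` is a hypothetical helper introduced here for illustration and does not come from the paper. The two input numbers are the ones quoted in the abstract.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return speedup divided by core count (1.0 would be ideal linear scaling).

    Hypothetical helper for illustration; not part of the MASNUM code base.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures quoted in the abstract: speedup of 431.5 on 960 CPU cores,
# measured against serial performance.
efficiency = parallel_efficiency(431.5, 960)
print(f"parallel efficiency: {efficiency:.1%}")
```

    Under this definition, the improved version runs at roughly 45% parallel efficiency on 960 cores relative to the serial baseline.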

       
