ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2015, Vol. 52 ›› Issue (4): 851-860. doi: 10.7544/issn1000-1239.2015.20131415

• System Architecture •

Performance Characterization and Parallel Optimization of the MASNUM Wave Model

张志远1,2,3, 周宇峰1,2, 刘利2, 杨广文1,2   

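  1. 1(Department of Computer Science and Technology, Tsinghua University, Beijing 100084); 2(Key Laboratory for Earth System Modeling (Tsinghua University), Ministry of Education, Beijing 100084); 3(Hydro-Meteorological Center of Navy, Beijing 100161) (generalzzy@139.com)
  • Published: 2015-04-01
  • Funding: 
    Supported by the National Natural Science Foundation of China (41275098) and the National High Technology Research and Development Program of China (863 Program) (2013AA01A208)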

Performance Characterization and Efficient Parallelization of MASNUM Wave Model

Zhang Zhiyuan1,2,3, Zhou Yufeng1,2, Liu Li2, Yang Guangwen1,2

  1. 1(Department of Computer Science and Technology, Tsinghua University, Beijing 100084); 2(Key Laboratory for Earth System Modeling, Center for Earth System Science (Tsinghua University), Ministry of Education, Beijing 100084); 3(Hydro-Meteorological Center of Navy, Beijing 100161)
  • Online: 2015-04-01

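Abstract: MASNUM (marine science and numerical modeling) is a numerical wave model developed independently in China. It has been widely used for wave forecasting in marine disaster prevention and mitigation, maritime transportation, and support of military activities. As the demands of operational forecast accuracy and climate research keep growing, high resolution has become the necessary path for wave model development. Although the rapid development of high-performance computers provides strong computing power for high-resolution numerical models, many current parallel numerical models are still inefficient: they cannot reach higher parallel speedup, and so cannot improve parallel efficiency or shorten wall-clock run time. Taking the architectural characteristics of modern high-performance computers into account, this work analyzes the performance bottlenecks of the MASNUM model in depth and then applies targeted parallel optimizations. These markedly improve communication performance, I/O performance, and the load balance of the two-dimensional decomposition, and in turn the overall parallel efficiency and scalability of the model. With serial performance as the baseline, the improved version reaches a speedup of 431.5 on 960 CPU cores. The study also offers parallel optimization strategies that other numerical models can draw on.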

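Key words: wave model, large-scale parallel numerical computing, performance analysis, parallel optimization, two-dimensional decomposition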

Abstract: MASNUM (marine science and numerical modeling) is a numerical wave model developed in China that has been widely used in wave forecasting for ocean disaster prevention and reduction, ocean transportation, and military activities. With increasing demands for forecasting precision and for climate research, higher resolution has become the mainstream direction of wave model development. Although the rapid development of high-performance computers provides growing computing power for high-resolution models, many parallel models remain too inefficient to achieve the speedup needed to raise parallel efficiency and shorten wall-clock time. In this paper, we first characterize the performance of the MASNUM model on a modern high-performance computer to reveal several performance bottlenecks. We then propose several parallel optimizations that substantially improve communication performance, I/O performance, and the load balance of the two-dimensional parallel decomposition, and consequently improve the overall parallel efficiency and scalability of the model. On 960 CPU cores, the improved parallel version achieves a 431.5-fold speedup relative to the sequential baseline. Based on our experiments, we suggest parallel optimization strategies that may help other numerical models achieve high parallel efficiency.
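
As a rough reading of the reported result (a speedup of 431.5 on 960 CPU cores, relative to the serial run), the sketch below works out the implied parallel efficiency. It assumes the standard definition, efficiency = speedup / number of cores; the abstract itself does not state an efficiency value, and the function name `parallel_efficiency` is illustrative only.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return parallel efficiency, assumed here to be speedup / core count."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures taken from the abstract: 431.5-fold speedup on 960 CPU cores.
efficiency = parallel_efficiency(431.5, 960)
print(f"{efficiency:.1%}")  # prints 44.9%
```

Under this assumption, each of the 960 cores contributes a little under half of what it would under ideal linear scaling.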

Key words: wave model, large-scale numerical parallel computing, performance characterization, efficient parallelization, two-dimensional parallel decomposition

CLC Number: