ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2015, Vol. 52 ›› Issue (4): 833-842. doi: 10.7544/issn1000-1239.2015.20131922

• System Architecture •

Heterogeneous Computing and Optimization on the Tianhe-2 Supercomputer System for High-Order Accurate CFD Applications

Wang Yongxian1,2, Zhang Lilun1,2, Che Yonggang1,2, Xu Chuanfu1, Liu Wei1, Cheng Xinghua1

  1(College of Computer, National University of Defense Technology, Changsha 410073); 2(Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073) (yxwang@nudt.edu.cn)
  • Published: 2015-04-01
  • Online: 2015-04-01
  • Supported by: 
    the National Natural Science Foundation of China (61379056, 11272352); the National Basic Research Program of China (973 Program) (2009CB723803)

Abstract: Efficient parallel simulation of very-large-scale computational fluid dynamics (CFD) applications on today's mainstream many-core heterogeneous supercomputers, such as Tianhe-2, still faces a series of challenging technical problems and remains an active research topic in the field. This paper explores efficient parallelization of a high-order accurate CFD flow-field simulation code on the Tianhe-2 heterogeneous high-performance computing (HPC) platform. We focus on performance optimization strategies that match the characteristics of CFD applications to the architecture of many-core heterogeneous HPC platforms, and propose a series of techniques covering task decomposition, exploitation of parallelism, multi-threading optimization, single-instruction multiple-data (SIMD) vectorization, and cooperative optimization between CPUs and accelerators. To evaluate these techniques, several test cases were simulated on Tianhe-2; the largest reached 1.228×10^11 (122.8 billion) grid points and used about 590,000 CPU and MIC processor cores in total. To the best of our knowledge, a CFD simulation with a high-order accurate scheme had not been attempted at this scale before. The results show that the ported and optimized code runs about 2.6 times faster on the CPU+MIC hybrid platform than on the CPUs alone, and that it scales well. The present work redefines the frontier of high-performance computing for fluid dynamics simulations on heterogeneous platforms.
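As a rough illustration of the scale quoted in the abstract, the sketch below computes the average number of grid points per core from the two reported figures. It also shows one simple way to divide work between a CPU and an accelerator in proportion to their relative throughput. The proportional split and the weights passed to it are assumptions made for illustration only; the abstract does not describe how the authors balance load between CPUs and MIC co-processors, and the function names are hypothetical.

```python
def points_per_core(total_points: int, total_cores: int) -> float:
    """Average number of grid points handled by each core."""
    return total_points / total_cores


def split_proportionally(n_items: int, weights: list[float]) -> list[int]:
    """Split n_items among devices in proportion to weights.

    Uses the largest-remainder method so the shares always sum to n_items.
    The weights are hypothetical relative throughputs, not measured values.
    """
    total = sum(weights)
    exact = [n_items * w / total for w in weights]
    shares = [int(x) for x in exact]  # round each share down first
    leftover = n_items - sum(shares)
    # Give the remaining items to the devices with the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda i: exact[i] - shares[i],
                   reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # Figures reported in the abstract: 1.228e11 grid points, about 590,000 cores.
    avg = points_per_core(122_800_000_000, 590_000)
    print(f"average grid points per core: {avg:,.0f}")

    # Hypothetical example: 100 grid blocks, accelerator assumed 3x faster than CPU.
    cpu_blocks, mic_blocks = split_proportionally(100, [1.0, 3.0])
    print(f"CPU blocks: {cpu_blocks}, accelerator blocks: {mic_blocks}")
```

With the reported figures, the average works out to roughly 208,000 grid points per core. A static proportional split like this is only a starting point; in practice the ratio would need to be tuned against measured kernel times on each device.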

Key words: computational fluid dynamics (CFD), high-order accurate scheme, parallel computing, CPU+MIC heterogeneous computing, performance optimization, Tianhe-2 supercomputer

CLC number: