    Citation: Wang Yongxian, Zhang Lilun, Che Yonggang, Xu Chuanfu, Liu Wei, Cheng Xinghua. Heterogeneous Computing and Optimization on Tianhe-2 Supercomputer System for High-Order Accurate CFD Applications[J]. Journal of Computer Research and Development, 2015, 52(4): 833-842. DOI: 10.7544/issn1000-1239.2015.20131922

    Heterogeneous Computing and Optimization on Tianhe-2 Supercomputer System for High-Order Accurate CFD Applications

    • Abstract: Efficient parallel simulation of very large-scale computational fluid dynamics (CFD) applications on today's mainstream many-core heterogeneous supercomputers, such as Tianhe-2, still poses a series of technical challenges and remains an active research topic in the field. This paper explores efficient parallelization of a high-order accurate CFD flow-field simulation code on the Tianhe-2 heterogeneous high-performance computing (HPC) platform. We focus on performance optimization strategies that match the characteristics of CFD applications to the architecture of many-core heterogeneous platforms, and propose a series of techniques covering task decomposition, exploitation of parallelism, multi-threading optimization, single-instruction multiple-data (SIMD) vectorization, and cooperative optimization between CPUs and coprocessors. To evaluate these techniques, several test cases were simulated on Tianhe-2; the largest has 1.228×10^11 (122.8 billion) grid points and uses about 590,000 CPU and MIC processor cores in total. To the best of our knowledge, a high-order accurate CFD simulation of this scale had not been attempted before. The results show that the ported and optimized code achieves a speedup of about 2.6 on the hybrid CPU and coprocessor platform over the CPU-only platform, and that it scales well. The present work redefines the frontier of high-performance computing for fluid dynamics simulations on heterogeneous platforms.

       
