ISSN 1000-1239 CN 11-1777/TP

• System Architecture •

### Research on Large-Scale Scalable Parallel Computing Based on the MRT-LBM Method

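1. 1(School of Communication and Information Engineering, Shanghai University, Shanghai 200444); 2(High Performance Computing Center, Shanghai University, Shanghai 200444); 3(School of Computer Engineering and Science, Shanghai University, Shanghai 200444) (zxliu@shu.edu.cn)
• Published: 2016-05-01
• Funding:
Major Research Plan Cultivation Project of the National Natural Science Foundation of China (91330116)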

### Large-Scale Scalable Parallel Computing Based on LBM with Multiple-Relaxation-Time Model

Liu Zhixiang1,2, Fang Yong1, Song Anping3, Xu Lei3, Wang Xiaowei2,3, Zhou Liping2,3, Zhang Wu2,3

1. 1(School of Communication and Information Engineering, Shanghai University, Shanghai 200444); 2(High Performance Computing Center, Shanghai University, Shanghai 200444); 3(School of Computer Engineering and Science, Shanghai University, Shanghai 200444)
• Online: 2016-05-01

Abstract: In large-scale numerical simulation of three-dimensional complex flows, the multiple-relaxation-time (MRT) model of the lattice Boltzmann method (LBM) is numerically more stable than the single-relaxation-time model. Based on the large eddy simulation (LES) turbulence model and an interpolation scheme for surface boundaries, the parallelism of three computational stages (grid generation, flow-field initialization, and iterative calculation) is analyzed for the D3Q19 discrete velocity model. Current high-performance computing clusters typically use a distributed architecture in which compute nodes communicate through the message passing interface (MPI). Taking into account both the characteristics of distributed clusters and the computational load balance, and using the MPI programming model, parallel algorithms for grid generation, flow-field initialization, and iterative calculation suited to large-scale distributed clusters are studied. The proposed parallel algorithm is also applicable to the D3Q15 and D3Q27 discrete velocity models. Two cases are considered in the numerical simulations: a fixed total problem size (strong scaling) and a fixed amount of computation per computing core (weak scaling). Parallel performance is analyzed for each case. Experimental results on the Sunway Blue Light supercomputer show that the proposed parallel algorithm still achieves good speedup and scalability on the order of hundreds of thousands of computing cores.
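
To make the D3Q19 model and the load-balance requirement concrete, the following is a minimal Python sketch. It builds the standard D3Q19 velocity set with its usual lattice weights and splits a stack of grid planes into contiguous slabs whose sizes differ by at most one. The function names `d3q19` and `split_planes` are hypothetical, and the one-dimensional slab decomposition is an assumption chosen for illustration; the abstract does not state which decomposition the authors use.

```python
from fractions import Fraction
from itertools import product


def d3q19():
    """Return (velocities, weights) for the standard D3Q19 lattice."""
    velocities, weights = [], []
    for c in product((-1, 0, 1), repeat=3):
        speed_sq = sum(x * x for x in c)
        if speed_sq == 0:
            w = Fraction(1, 3)    # rest particle
        elif speed_sq == 1:
            w = Fraction(1, 18)   # 6 axis-aligned neighbours
        elif speed_sq == 2:
            w = Fraction(1, 36)   # 12 face-diagonal neighbours
        else:
            continue              # the 8 cube corners appear only in D3Q27
        velocities.append(c)
        weights.append(w)
    return velocities, weights


def split_planes(n_planes, n_ranks):
    """Split n_planes grid planes into n_ranks contiguous [start, stop) slabs.

    Assumed 1D slab decomposition (illustrative only): slab sizes differ
    by at most one plane, so per-rank work is as even as possible.
    """
    base, extra = divmod(n_planes, n_ranks)
    bounds, start = [], 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


velocities, weights = d3q19()
slabs = split_planes(10, 3)
```

Dropping the `continue` branch and assigning the corner directions their own weight would extend the same loop to D3Q27, which is consistent with the abstract's remark that the parallel algorithm carries over to other velocity models.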