    Citation: Cai Yong (蔡勇), Li Guangyao (李光耀), and Wang Hu (王琥). Parallel Computing of Central Difference Explicit Finite Element Based on GPU General Computing Platform[J]. Journal of Computer Research and Development (计算机研究与发展), 2013, 50(2): 412-419.

    Parallel Computing of Central Difference Explicit Finite Element Based on GPU General Computing Platform

    Abstract: The explicit finite element method is widely used for plane nonlinear dynamic problems. Because the explicit algorithm is only conditionally stable, the time step is limited, and large-scale problems require long computing times. The graphics processing unit (GPU) is a highly parallel device of the single instruction, multiple data (SIMD) class. It offers high computing power and high memory bandwidth at relatively low cost, and it is well suited to problems that can be expressed as data-parallel computations with high arithmetic intensity. GPUs have therefore attracted growing attention as general-purpose parallel processors, supported by general-purpose GPU computing technologies such as NVIDIA's Compute Unified Device Architecture (CUDA), which provides an efficient and convenient way to implement such computations. This paper develops a parallel explicit finite element method for plane nonlinear dynamic problems, based on the central difference scheme and a general-purpose GPU computing platform. The original serial algorithm is adjusted and optimized to suit the characteristics of GPU computing. The entire explicit iterative process runs on the GPU by mapping each element or node to one CUDA thread, so that the iteration is fully parallel and the elements or nodes can be processed in any order. Numerical examples show that, at the same computational accuracy, the method greatly improves computational efficiency on an NVIDIA GTX 460 graphics card, and that it offers an efficient and simple numerical approach to plane nonlinear dynamic problems.
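    The element-to-thread and node-to-thread mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1D chain of linear springs with equal nodal masses instead of a plane nonlinear problem, and plain Python loops stand in for CUDA threads. The names `element_forces`, `node_update`, and `simulate` are hypothetical. Each loop iteration touches only its own element or node, which is the property that would allow one thread per element or node, and the node pass gathers forces from neighbouring elements so that no two iterations write to the same location.

```python
def element_forces(u, k):
    """Element pass: tension in each spring; element e joins nodes e and e + 1."""
    # Every entry depends on one element only, so a CUDA port (assumed, not
    # taken from the paper) could give each element its own thread.
    return [k * (u[e + 1] - u[e]) for e in range(len(u) - 1)]


def node_update(u, v_half, tension, m, dt, fixed):
    """Node pass: gather adjacent element forces, then central difference update."""
    n = len(u)
    u_new, v_new = list(u), [0.0] * n
    for i in range(n):  # iterations are independent: one thread per node
        if i in fixed:
            continue  # constrained node keeps its displacement
        force = 0.0
        if i < n - 1:
            force += tension[i]      # spring on the right pulls node i forward
        if i > 0:
            force -= tension[i - 1]  # spring on the left pulls node i back
        v_new[i] = v_half[i] + dt * force / m  # v(n+1/2) = v(n-1/2) + dt * a(n)
        u_new[i] = u[i] + dt * v_new[i]        # u(n+1)   = u(n) + dt * v(n+1/2)
    return u_new, v_new


def simulate(u0, k, m, dt, steps, fixed=(0,)):
    """Run the explicit loop; return final displacements and peak |u| seen."""
    u = list(u0)
    v_half = [0.0] * len(u0)  # half-step velocity set to zero for simplicity
    peak = max(abs(x) for x in u)
    for _ in range(steps):
        tension = element_forces(u, k)
        u, v_half = node_update(u, v_half, tension, m, dt, fixed)
        peak = max(peak, max(abs(x) for x in u))
    return u, peak
```

    For a single mass on a single spring with k = m = 1, the natural frequency is 1 and the central difference scheme is stable only for dt at or below 2. Calling `simulate([0.0, 1.0], 1.0, 1.0, 0.1, 500)` keeps the response bounded, while a step of 2.1 makes it grow without limit, which is the conditional stability that makes large explicit analyses expensive.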

       
