    Citation: Fang Wei, Sun Guangzhong, Wu Chao, and Chen Guoliang. A Parallel Algorithm of Three-Dimensional Fast Fourier Transform[J]. Journal of Computer Research and Development, 2011, 48(3): 440-446.

    A Parallel Algorithm of Three-Dimensional Fast Fourier Transform

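    • Abstract (translated from Chinese): The three-dimensional fast Fourier transform is widely used in computational physics. The plane partitioning and block partitioning used by traditional parallel algorithms are not well suited to Fourier transforms of sparse three-dimensional vectors. This paper proposes a new parallel algorithm for the three-dimensional fast Fourier transform. For sparse three-dimensional vectors, the new algorithm reorders the computation along the x, y, and z directions so as to minimize both the amount of computation and the inter-process communication, thereby reducing computation time and improving parallel speedup. Detailed theoretical analysis and experimental results on several high-performance computing platforms show that the new algorithm outperforms traditional algorithms when transforming sparse three-dimensional vectors.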

       

      Abstract: The three-dimensional fast Fourier transform (3D-FFT) is widely used in physics. It is computation and communication intensive, so in most cases the 3D-FFT dominates an application's running time. Traditional parallel 3D-FFT algorithms are not suitable for the sparse lattices often encountered in the field of quantum computing, because the block partitioning they use may involve much redundant computation and communication, owing to the sparsity of non-zero elements in the FFT grid. In this paper we propose a novel parallel algorithm for the 3D-FFT. Unlike previous methods, the new algorithm uses slice partitioning and reorders the computation to minimize calculation time and communication cost. Thanks to slice partitioning, the new method is highly scalable and automatically satisfies the demands of load balancing. We compare it with traditional algorithms in theory and in practice. Theoretical performance analysis shows that the new method can greatly reduce computational time and increase parallel speedup. Experiments were carried out on several high-performance machines, such as KD-50, IBM JS22 and DAWNING. The results show that the new algorithm performs much better than traditional algorithms when computing the 3D-FFT of sparse lattices.
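
The abstract does not spell out the algorithm itself, so the following is only a minimal serial sketch of the general idea it relies on, not the authors' parallel method: a 3D DFT can be computed one axis at a time, and any 1D line that is entirely zero can be skipped because its transform is also zero. The function names (`dft1d`, `sparse_aware_dft3d`), the `order` parameter, and the 4x4x4 test grid are all assumptions made for illustration, and a naive O(n^2) DFT stands in for a real FFT routine.

```python
import cmath
from itertools import product


def dft1d(line):
    """Naive O(n^2) 1D DFT; a stand-in for a real FFT routine."""
    n = len(line)
    return [
        sum(line[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def sparse_aware_dft3d(grid, order=(2, 1, 0)):
    """3D DFT of grid[x][y][z], one axis at a time, skipping all-zero lines.

    `order` lists the axes (0 = x, 1 = y, 2 = z) in the order they are
    transformed. Returns (transformed grid, number of 1D transforms done).
    """
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    out = [[[complex(v) for v in row] for row in plane] for plane in grid]
    performed = 0
    for axis in order:
        a, b = [d for d in range(3) if d != axis]
        for i, j in product(range(dims[a]), range(dims[b])):
            # Index triples of the 1D line along `axis` at position (i, j).
            cells = []
            for k in range(dims[axis]):
                idx = [0, 0, 0]
                idx[a], idx[b], idx[axis] = i, j, k
                cells.append(tuple(idx))
            line = [out[x][y][z] for x, y, z in cells]
            if not any(line):
                continue  # the DFT of an all-zero line is all zero
            performed += 1
            for (x, y, z), value in zip(cells, dft1d(line)):
                out[x][y][z] = value
    return out, performed


# Hypothetical input: a 4x4x4 grid whose nonzeros sit in one 2x2x2 corner.
N = 4
grid = [[[0.0] * N for _ in range(N)] for _ in range(N)]
for x, y, z in product(range(2), repeat=3):
    grid[x][y][z] = 1.0 + x + 2 * y + 4 * z

result, performed = sparse_aware_dft3d(grid)
dense = 3 * N * N  # 1D transforms a dense axis-by-axis 3D DFT would perform
print(f"1D transforms performed: {performed} (dense would need {dense})")
```

For this input only 4 of the 16 lines are nonzero in the first pass and at most 8 in the second, so the sketch performs at most 4 + 8 + 16 = 28 one-dimensional transforms instead of 48. In a distributed setting the same sparsity would also reduce the data that has to be exchanged between passes, which is the saving the abstract attributes to reordering the x, y, and z computations.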

       
