Optimizing MPI Alltoall Communications in Multicore Clusters

Li Qiang; Sun Ninghui; Huo Zhigang; Ma Jie

Li Qiang, Sun Ninghui, Huo Zhigang, Ma Jie. Optimizing MPI Alltoall Communications in Multicore Clusters[J]. Journal of Computer Research and Development, 2013, 50(8): 1744-1754.

Abstract

MPI Alltoall is an important collective operation. In multicore clusters, many processes run on each node. On the one hand, shared memory can be used to optimize Alltoall communication of small messages through leader-based schemes. However, because these schemes use a fixed number of leader processes, they cannot achieve optimal performance for all small message sizes. On the other hand, processes within a node contend for the same network resources. Alltoall communication of large messages uses many synchronization messages, and this contention increases their latency many times over, so the synchronization overhead cannot be ignored. To solve these problems, two optimizations are presented. For small messages, the PLP method uses a variable number of leader processes. For large messages, the LSS method reduces the number of synchronization messages from 3N to 2N. Evaluations confirm the effectiveness of both methods. For small messages, the PLP method always obtains optimal performance. For large messages, the LSS method yields an almost constant percentage improvement. Performance improves by 25% for 32 KB and 64 KB messages.
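To make the idea of a variable number of leader processes concrete, the following is a minimal sketch of how the processes on one node might be partitioned into leader groups. It is not the authors' PLP implementation: the function names `assign_leaders` and `sync_message_reduction`, the contiguous grouping, and the choice of the first rank in each group as leader are all assumptions made for illustration. The second helper only restates the 3N to 2N arithmetic from the abstract as a fraction.

```python
from fractions import Fraction
from typing import Dict, List


def assign_leaders(procs_per_node: int, num_leaders: int) -> Dict[int, List[int]]:
    """Split local ranks 0..procs_per_node-1 into num_leaders contiguous groups.

    Hypothetical scheme: the first rank of each group acts as its leader.
    Returns a mapping from leader rank to the ranks in its group.
    """
    if not 1 <= num_leaders <= procs_per_node:
        raise ValueError("num_leaders must be between 1 and procs_per_node")
    base, extra = divmod(procs_per_node, num_leaders)
    groups: Dict[int, List[int]] = {}
    start = 0
    for g in range(num_leaders):
        # The first `extra` groups take one additional rank each.
        size = base + (1 if g < extra else 0)
        members = list(range(start, start + size))
        groups[members[0]] = members
        start += size
    return groups


def sync_message_reduction(before_per_n: int = 3, after_per_n: int = 2) -> Fraction:
    """Fraction of synchronization messages removed when going from 3N to 2N."""
    return Fraction(before_per_n - after_per_n, before_per_n)
```

With 8 processes per node, `assign_leaders(8, 1)` gives a single leader for the whole node, while `assign_leaders(8, 4)` gives four leaders with two ranks each. A tuning layer could then pick the leader count per message size, which is the kind of flexibility the abstract attributes to PLP.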
