ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2014, Vol. 51 ›› Issue (8): 1863-1870. doi: 10.7544/issn1000-1239.2014.20121117


Optimizing All_to_All Communication in Infiniband

Chen Shuping, Lu Deping, Chen Zhongping   

  1. (Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214000)
  • Online: 2014-08-15

Abstract: The All_to_All operation is an important collective communication function, yet current commercial Infiniband networks have no effective congestion control mechanism. This paper studies two typical All_to_All algorithms, the make-pair algorithm and the post-all algorithm. We found that their network bandwidth utilization was only between 30% and 70% when sending messages larger than 32KB. Further analysis and experiments show that this is the result of heavy congestion in the network. To alleviate the congestion, we introduce a new algorithm that splits each large message into many small messages and sends them independently under an efficient scheduling scheme. The algorithm creates one reliable connection for each process pair and maintains a counter for each connection that tracks the number of outstanding send requests. When the counter exceeds a predefined threshold, the connection between the two processes is considered congested, and the algorithm pauses posting send requests to the send queue of that connection so that the congestion does not spread through the entire network. Experimental results show that the new algorithm alleviates congestion effectively and improves All_to_All performance. Compared with the existing algorithms, it improves network bandwidth utilization by at least 10% and at most 20%.
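The mechanism described in the abstract (message splitting, a per-connection counter of outstanding sends, and pausing a connection at a threshold) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `split_message`, `Connection`, `schedule`, and the `complete` callback are hypothetical, no Infiniband verbs are called, and the round-robin order and the choice to pause as soon as the counter reaches the threshold are illustrative choices that the abstract does not specify.

```python
from collections import deque


def split_message(size, chunk_size):
    """Return the chunk lengths that together cover `size` bytes."""
    full, rest = divmod(size, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


class Connection:
    """Hypothetical stand-in for one reliable connection to a peer."""

    def __init__(self, peer, chunks):
        self.peer = peer
        self.pending = deque(chunks)  # chunks not yet posted
        self.outstanding = 0          # posted but not yet completed


def schedule(connections, threshold, complete):
    """Post chunks round-robin, skipping connections at the threshold.

    `complete(conn)` is a hypothetical callback standing in for a
    completion-queue poll: it returns how many posted sends on `conn`
    have finished since the last call. It must eventually report
    completions, otherwise this loop does not terminate.
    Returns the (peer, chunk_length) pairs in posting order.
    """
    posted = []
    while any(c.pending or c.outstanding for c in connections):
        for c in connections:
            # Retire completed sends before deciding whether to post.
            c.outstanding -= min(complete(c), c.outstanding)
            # A connection at the threshold is treated as congested and
            # is skipped; the other connections keep making progress.
            if c.pending and c.outstanding < threshold:
                posted.append((c.peer, c.pending.popleft()))
                c.outstanding += 1
    return posted


if __name__ == "__main__":
    # Three peers, one 96-byte message each, split into 32-byte chunks.
    conns = [Connection(p, split_message(96, 32)) for p in range(3)]
    # Assumed completion model: one send finishes per poll.
    order = schedule(conns, threshold=2,
                     complete=lambda c: 1 if c.outstanding else 0)
    print(order)
```

In a real implementation the counter would be incremented when a send request is posted to the queue pair and decremented when its completion is polled; the simulation only models that bookkeeping.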

Key words: All_to_All algorithm, congestion control, message split, message schedule, Infiniband

CLC Number: