ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2014, Vol. 51 ›› Issue (8): 1863-1870. doi: 10.7544/issn1000-1239.2014.20121117

• Network Technology •



  1. (Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214000)
  • Published: 2014-08-15

Optimizing All_to_All Communication in Infiniband

Chen Shuping, Lu Deping, Chen Zhongping   

  1. (Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214000)
  • Online: 2014-08-15

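Abstract: The All_to_All operation is an important collective operation. Current commercial Infiniband networks have no effective congestion control mechanism. We experimentally study the performance of two typical All_to_All algorithms on an Infiniband network and find that, when transmitting large messages of more than 32KB, these algorithms cause severe congestion in the network, leaving network bandwidth utilization at only 30%~70%. We attempt to reduce congestion by splitting large messages into small messages and scheduling the sending of those small messages. A reliable connection is established between every pair of processes, and a counter of in-flight send requests is maintained for each connection. When the counter exceeds a threshold, the communication link between the two processes is considered congested, and no new send requests are posted to that connection's send queue, so that the congestion does not spread to the whole network. Experimental results show that the optimized algorithm reduces network congestion; compared with existing algorithms, bandwidth utilization improves by more than 10%, and by up to 20%.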

Key words: All_to_All algorithm, congestion control, message splitting, message scheduling, Infiniband

Abstract: The All_to_All operation is an important collective operation. Current commercial Infiniband networks have no effective congestion control mechanism. This paper studies two typical All_to_All algorithms, the make-pair algorithm and the post-all algorithm. We found that their network bandwidth utilization was only between 30% and 70% when sending messages larger than 32KB. Further analysis and experiments show that this is the result of heavy congestion in the network. To alleviate the congestion, we adopt a new algorithm that splits each large message into many small messages and schedules their sending independently. It creates one reliable connection for each process pair and maintains a counter for each connection that counts the outstanding send requests. When the counter exceeds a predefined threshold, the link between the two processes is considered congested, and the algorithm stops posting send requests to the send queue of the congested connection so that the congestion does not spread through the entire network. Experimental results show that the new algorithm alleviates the congestion effectively and improves All_to_All performance. Compared with the existing algorithms, it improves network bandwidth utilization by at least 10% and by up to 20%.

Key words: All_to_All algorithm, congestion control, message split, message schedule, Infiniband
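
The scheme described in the abstract (split large messages, keep a per-connection counter of outstanding send requests, and stop posting to a connection whose counter exceeds a threshold) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the names `split_message`, `Connection`, `schedule_round`, and `complete` are hypothetical, the `post` callback stands in for posting a send request to a real send queue, and the chunk size and threshold values are arbitrary placeholders rather than values taken from the paper.

```python
from collections import deque


def split_message(size, chunk_size):
    """Return the sizes (in bytes) of the chunks a message is split into."""
    if size < 0 or chunk_size <= 0:
        raise ValueError("size must be >= 0 and chunk_size must be > 0")
    full, rest = divmod(size, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


class Connection:
    """Hypothetical model of one reliable connection to a peer process."""

    def __init__(self, peer, chunks):
        self.peer = peer
        self.pending = deque(chunks)  # chunks not yet posted
        self.outstanding = 0          # posted but not yet completed

    def is_congested(self, threshold):
        # The abstract treats a link as congested once the counter
        # exceeds the threshold.
        return self.outstanding > threshold


def schedule_round(connections, threshold, post):
    """Post at most one chunk per connection, skipping congested ones.

    `post(peer, nbytes)` is an assumed callback that would hand the chunk
    to the network. Returns the number of chunks posted in this round.
    """
    posted = 0
    for conn in connections:
        if conn.pending and not conn.is_congested(threshold):
            post(conn.peer, conn.pending.popleft())
            conn.outstanding += 1
            posted += 1
    return posted


def complete(conn):
    """Record that one outstanding send on `conn` has completed."""
    if conn.outstanding <= 0:
        raise RuntimeError("no outstanding send request to complete")
    conn.outstanding -= 1


if __name__ == "__main__":
    # Placeholder values: 100 KB messages to two peers, 32 KB chunks.
    conns = [Connection(p, split_message(100 * 1024, 32 * 1024)) for p in (1, 2)]
    log = []
    while any(c.pending for c in conns):
        if schedule_round(conns, threshold=1, post=lambda p, n: log.append((p, n))) == 0:
            # Every connection with pending data is paused: simulate completions.
            for c in conns:
                if c.outstanding:
                    complete(c)
    print(len(log), "chunks posted")
```

Pausing only the congested connection lets sends to other peers continue, which matches the stated goal of keeping congestion from spreading; in a real implementation the `complete` step would be driven by send completions reported by the network interface.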