Abstract:
The All_to_All operation is an important collective function, yet current commercial InfiniBand networks provide no effective congestion control mechanism. This paper studies two typical All_to_All algorithms, the make-pair algorithm and the post-all algorithm. We found that their network bandwidth utilization was only between 30% and 70% when sending messages larger than 32 KB. Further analysis and experiments demonstrate that this is the result of heavy congestion in the network. In this paper, we adopt a novel algorithm to alleviate the congestion. It splits each large message into many small messages and sends them independently under an efficient scheduling scheme. It creates one reliable connection for each process pair and maintains a counter for each connection that tracks the number of outstanding send requests. When the counter exceeds a predefined threshold, the connection between the two processes is considered congested. The algorithm then pauses the posting of send requests to the send queue of the congested connection, so that the congestion does not spread through the entire network. The experimental results demonstrate that the new algorithm alleviates the congestion effectively and improves All_to_All performance. Compared with the existing algorithms, it improves network bandwidth utilization by at least 10% and at most 20%.
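
The counter-based throttling described above can be illustrated with a small single-sender simulation. This is a minimal sketch, not the implementation evaluated in the paper. The names `split_message`, `Connection`, and `simulate_sender`, the 8 KB chunk size, the threshold of 2, and the `period` model of completions are all assumptions made for the example. A real implementation would post work requests to InfiniBand reliable connections and poll a completion queue instead.

```python
from collections import deque
from dataclasses import dataclass

CHUNK_SIZE = 8 * 1024  # assumed chunk size; the abstract does not give one
THRESHOLD = 2          # assumed limit on outstanding sends per connection


def split_message(size, chunk_size=CHUNK_SIZE):
    """Return the chunk lengths that cover a message of `size` bytes."""
    return [min(chunk_size, size - off) for off in range(0, size, chunk_size)]


@dataclass
class Connection:
    peer: int
    pending: deque        # chunk lengths not yet posted
    outstanding: int = 0  # posted but not yet completed

    def congested(self, threshold):
        # Congested once the counter exceeds the threshold.
        return self.outstanding > threshold


def simulate_sender(msg_size, period, threshold=THRESHOLD):
    """Simulate one sender; period[peer] = rounds between completions (>= 1)."""
    if any(p < 1 for p in period.values()):
        raise ValueError("every period must be at least 1")
    conns = [Connection(peer, deque(split_message(msg_size))) for peer in period]
    sent = {peer: 0 for peer in period}
    peak = {peer: 0 for peer in period}
    rounds = 0
    while any(c.pending or c.outstanding for c in conns):
        rounds += 1
        for c in conns:
            # Post at most one chunk per round, skipping congested connections.
            if c.pending and not c.congested(threshold):
                sent[c.peer] += c.pending.popleft()
                c.outstanding += 1
                peak[c.peer] = max(peak[c.peer], c.outstanding)
        for c in conns:
            # Hypothetical stand-in for polling a completion queue.
            if c.outstanding and rounds % period[c.peer] == 0:
                c.outstanding -= 1
    return sent, peak, rounds


if __name__ == "__main__":
    # Peers 1 and 2 complete a send every round; peer 3 only every third round.
    sent, peak, rounds = simulate_sender(64 * 1024, {1: 1, 2: 1, 3: 3})
    print(sent, peak, rounds)
```

In this toy model the slow peer's counter never rises above the threshold plus one, because posting to that connection pauses as soon as the counter exceeds the threshold. The fast peers keep receiving a chunk every round throughout.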