
    Method to Create Aggregate Tree for Hardware Supported Collectives

    Abstract: Traditional MPI (Message Passing Interface) collectives are implemented on top of point-to-point messages and perform poorly. Hardware-supported collectives offer high performance and low CPU utilization, and are attracting growing attention. In hardware-supported collectives, the aggregate tree has a crucial impact on performance. We study the factors that affect the performance of hardware-supported collectives, propose a cost model for them, and on that basis propose a method for constructing aggregate trees. The method has three parts. First, we choose the aggregate tree type and breadth according to factors such as the operation type and the aggregate packet size, balancing network transmission cost against data computation cost. Second, we propose a method for constructing a minimum-height hierarchical k-ary aggregate tree of type Ⅰ, which reduces the number of inter-group aggregate packets. Third, we propose a method for constructing a minimum-cost aggregate tree of type Ⅱ, which reduces the number of switches used. We evaluated the method on the Sunway interconnection network. In the presence of network noise, message latency with the hierarchical k-ary type Ⅰ aggregate tree is 24% to 89% lower than with the traditional construction method. For typical communication patterns, the minimum-cost type Ⅱ aggregate tree uses about 90% fewer switch aggregation entries than before optimization.
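
    The abstract does not give the construction algorithm, so the following is only a toy sketch of the general idea behind a hierarchical k-ary tree, not the authors' method. It assumes a hypothetical mapping `group_of` from node id to network group, builds a k-ary tree inside each group, and then joins the group leaders with another k-ary tree. All function names are illustrative. Compared with a flat k-ary tree built in rank order, the hierarchical variant crosses group boundaries only once per non-root group.

```python
from collections import defaultdict

def kary_parents(nodes, k):
    """Map each node to its parent in a complete k-ary tree laid out in list order."""
    return {nodes[i]: nodes[(i - 1) // k] for i in range(1, len(nodes))}

def hierarchical_kary_tree(group_of, k):
    """Build a k-ary tree per group, then a k-ary tree over the group leaders.

    Simplification: a leader takes children from both levels, so its
    fan-out can exceed k. A real scheme would have to respect the limit.
    """
    members = defaultdict(list)
    for node in sorted(group_of):
        members[group_of[node]].append(node)
    parents, leaders = {}, []
    for gid in sorted(members):
        parents.update(kary_parents(members[gid], k))  # intra-group edges
        leaders.append(members[gid][0])                # first member leads the group
    parents.update(kary_parents(leaders, k))           # inter-group edges
    return parents, leaders[0]

def inter_group_edges(parents, group_of):
    """Count tree edges whose endpoints lie in different groups."""
    return sum(1 for c, p in parents.items() if group_of[c] != group_of[p])

def tree_height(parents, root):
    """Longest child-to-root path, in edges."""
    height = 0
    for node in parents:
        depth = 0
        while node != root:
            node = parents[node]
            depth += 1
        height = max(height, depth)
    return height

# Hypothetical placement: 16 nodes spread round-robin over 4 groups, k = 4.
group_of = {n: n % 4 for n in range(16)}
flat = kary_parents(list(range(16)), 4)
hier, root = hierarchical_kary_tree(group_of, 4)

print("flat:", inter_group_edges(flat, group_of), "inter-group edges,",
      "height", tree_height(flat, 0))
print("hier:", inter_group_edges(hier, group_of), "inter-group edges,",
      "height", tree_height(hier, root))
```

    In this toy placement both trees have the same height, but the hierarchical one uses one inter-group edge per non-root group, which is the minimum needed to connect the groups.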

       
