
    Optimizing Graph Computing with Adaptive Cache-Line Granularity

    Optimizing Graph Computing with an Adaptive Cache Structure

    • Abstract: With the continuous progress of computer hardware and software, graph computing is being applied in an ever-wider range of fields. Its memory access behavior is highly variable: when a graph algorithm runs on a GPU, different kernels accessing the property arrays exhibit different spatial locality, and the same kernel accessing different property arrays also exhibits different spatial locality. The cache optimization strategies in current GPU architectures fail to effectively exploit the performance potential of graph computing, and even state-of-the-art strategies cannot apply different management policies to data with different reusability, which is a major cause of the poor performance of graph computing. This paper designs the AB-cache architecture specifically for the GPU platform. AB-cache takes an adaptive approach, using two cache structures to separately optimize memory access requests with different spatial-locality characteristics. We also propose an online automatic classification mechanism matched to the AB-cache architecture, which quickly classifies memory access requests with different spatial locality at very low overhead. A comprehensive evaluation over multiple graph algorithms and a wide range of graph datasets shows that AB-cache achieves a 1.1377x speedup over the baseline, demonstrating the effectiveness and practicality of the scheme for graph computing performance optimization.

       

      Abstract: With the continuous progress of computer hardware and software, graph computing has been widely used in many fields. The memory access behavior of graph computing is highly variable. When graph algorithms are executed on a GPU, different kernels accessing the property arrays exhibit different spatial locality, and the same kernel accessing different property arrays also exhibits different spatial locality. Current cache optimization strategies in GPU architectures fail to effectively exploit the performance potential of graph computing. State-of-the-art cache optimization strategies cannot apply different management policies to data with different reusability, which is a major reason for graph computing's poor performance. In this paper, the AB-cache architecture is designed specifically for the GPU platform. AB-cache adopts an adaptive approach, using two cache structures to separately optimize memory access requests with different spatial-locality characteristics. An online automatic classification mechanism that fits the AB-cache architecture is also proposed; it quickly classifies memory access requests with different spatial locality at very low overhead. Through comprehensive evaluation on multiple graph algorithms and a wide range of graph datasets, the AB-cache scheme achieves a 1.1377x speedup over the baseline, demonstrating its effectiveness and practicality for graph computing performance optimization.
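      The two ideas outlined above, routing requests with different spatial locality to two different cache structures and classifying those requests online at low cost, can be illustrated with a small software sketch. The following C++ program is not the paper's AB-cache design; it is a minimal sketch under assumed parameters, and every name in it (SimpleCache, LocalityClassifier, the 128 B and 32 B granularities, the 2.5-sector threshold) is a hypothetical placeholder. It tracks how many distinct sectors of each cache line an access stream actually touches and uses a running average to steer that stream toward a coarse-grained (full-line) or fine-grained (sector) structure.

        // Illustrative sketch only, NOT the paper's AB-cache implementation:
        // all class names, sizes, and thresholds below are assumptions.
        // Requests are routed to a coarse-grained (full 128 B line) or a
        // fine-grained (32 B sector) structure based on how many sectors of
        // each line the running access stream actually touches.
        #include <cstddef>
        #include <cstdint>
        #include <cstdio>
        #include <list>
        #include <unordered_map>

        constexpr uint64_t LINE_BYTES    = 128;  // coarse fetch granularity
        constexpr uint64_t SECTOR_BYTES  = 32;   // fine fetch granularity
        constexpr double   HOT_THRESHOLD = 2.5;  // avg sectors/line above which coarse wins

        // Tiny fully associative LRU cache with a configurable fetch granularity.
        struct SimpleCache {
            uint64_t gran;
            size_t   blocks;
            std::list<uint64_t> lru;   // block tags, most recently used at front
            std::unordered_map<uint64_t, std::list<uint64_t>::iterator> pos;
            uint64_t hits = 0, accesses = 0;

            SimpleCache(uint64_t g, size_t b) : gran(g), blocks(b) {}

            void access(uint64_t addr) {
                ++accesses;
                uint64_t tag = addr / gran;
                auto it = pos.find(tag);
                if (it != pos.end()) {                       // hit: promote to MRU
                    lru.splice(lru.begin(), lru, it->second);
                    ++hits;
                    return;
                }
                if (lru.size() == blocks) {                  // miss: evict LRU block
                    pos.erase(lru.back());
                    lru.pop_back();
                }
                lru.push_front(tag);
                pos[tag] = lru.begin();
            }
        };

        // Online classifier: tracks the distinct sectors touched within each line
        // and keeps an exponential moving average as a spatial-locality estimate.
        struct LocalityClassifier {
            std::unordered_map<uint64_t, uint32_t> touched;          // line tag -> sector bitmask
            double avg_sectors = LINE_BYTES / (double)SECTOR_BYTES;  // start optimistic

            bool prefers_coarse(uint64_t addr) {
                uint64_t line   = addr / LINE_BYTES;
                uint32_t sector = (uint32_t)((addr % LINE_BYTES) / SECTOR_BYTES);
                uint32_t &mask  = touched[line];
                mask |= 1u << sector;
                // __builtin_popcount is a GCC/Clang builtin (counts bits set in mask).
                avg_sectors = 0.95 * avg_sectors + 0.05 * __builtin_popcount(mask);
                return avg_sectors >= HOT_THRESHOLD;
            }
        };

        int main() {
            // Equal byte budget (8 KiB each) split between the two structures.
            SimpleCache coarse(LINE_BYTES, 64), fine(SECTOR_BYTES, 256);
            LocalityClassifier offsets, props;  // one classifier per access stream

            // Stream 1: sequential sweep over an offset array (high spatial locality).
            for (uint64_t a = 0; a < (1 << 15); a += 4)
                (offsets.prefers_coarse(a) ? coarse : fine).access(a);

            // Stream 2: pseudo-random property-array reads (low spatial locality).
            uint64_t x = 1;
            for (int i = 0; i < 8192; ++i) {
                x = x * 6364136223846793005ULL + 1442695040888963407ULL;  // LCG step
                uint64_t a = ((x >> 16) % (1 << 22)) * 4;
                (props.prefers_coarse(a) ? coarse : fine).access(a);
            }

            printf("coarse: %llu hits / %llu accesses, fine: %llu hits / %llu accesses\n",
                   (unsigned long long)coarse.hits, (unsigned long long)coarse.accesses,
                   (unsigned long long)fine.hits,   (unsigned long long)fine.accesses);
            return 0;
        }

      The exponential moving average here is just one cheap way of keeping the classifier's per-stream state small; the abstract only states that the paper's online classification mechanism separates requests at very low overhead, and its actual hardware mechanism may differ entirely from this sketch.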

       
