高级检索

    基于穿行次数的大规模图数据路径查询

    Pass-Count-Based Path Query on Big Graph Datasets

    • 摘要: 在涉及复杂图(graph)数据的场景中,图的距离查询和路径查询有着重要的应用.有些应用涉及到规模巨大的图,并且要求快速的查询响应.为此需要高效的查询策略.通过研究可以发现,图内部节点的重要程度往往是不同的,并且可以利用节点的“穿行次数”度量节点的重要性.根据穿行次数为节点构建标签,并保证仅根据节点标签就能处理图的距离查询和路径查询,从而避免对图的遍历,这是一个基本的查询策略.这些标签的规模要尽量小,以降低空间开销、提高查询速度;而其构建过程却要足够快,以保证构建效率.将这个基于穿行次数的查询处理策略称为“穿行次数算法”,最终的实验结果验证了该算法的有效性.

       

      Abstract: Distance and path queries in graphs are fundamental to numerous applications, ranging from geographic navigation systems to Internet routing. Some of these applications involve huge graphs and yet require fast query answering. A new data structure is created for representing all distances in a graph. The data structure is distributed in the sense that it may be viewed as assigning labels to the vertices, such that a query involving vertices u and v may be answered using only the labels of u and v. In this paper, a new concept “pass count” is proposed to measure the importance of a vertex in the graph. Based on the pass-count, a special heuristic rule is used to build distance labels for each vertex of the graph, so distance queries and path queries can be processed on those labels exclusively. This method is efficient by avoiding graph traversal, and hence reduces time complexity. A “pass count algorithm” is also presented based on the pass-count of each vertex in the graph, which aims to minimize labels’ size, improve the query processing, and accelerate the construction of the labels. Extensive experiments are conducted to prove the effectiveness and efficiency of the algorithm.

       

    /

    返回文章
    返回