ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 ›› 2020, Vol. 57 ›› Issue (9): 1949-1960.doi: 10.7544/issn1000-1239.2020.20190569

• 软件技术 • 上一篇    下一篇

标签约束可达查询的高效处理方法

杜明1,杨云1,周军锋1,陈子阳2,杨安平1   

  1. 1(东华大学计算机科学与技术学院 上海 201620);2(上海立信会计金融学院信息管理学院 上海 201620) (duming@dhu.edu.cn)
  • 出版日期: 2020-09-01
  • 基金资助: 
    国家重点研发计划项目(2017YFB0309800);国家自然科学基金项目(61472339,61572421,61873337)

Efficient Methods for Label-Constraint Reachability Query

Du Ming1, Yang Yun1, Zhou Junfeng1, Chen Ziyang2, Yang Anping1   

  1. 1(School of Computer Science and Technology, Donghua University, Shanghai 201620);2(School of Information Management, Shanghai Lixin Institute of Accounting and Finance, Shanghai 201620)
  • Online: 2020-09-01
  • Supported by: 
    This work was supported by the National Key Research and Development Program of China (2017YFB0309800) and the National Natural Science Foundation of China (61472339, 61572421, 61873337).

摘要: 基于标签约束的可达性查询s→\-Lt用于回答给定图中顶点s到顶点t是否存在路径标签属于L的有向路径.针对现有方法索引构建时间长、索引规模大、查询效率低的问题,首先基于k个点构建双向路径标签索引,并提出相应的优化措施减小索引规模,以此来加速可达查询的处理速度.由于其索引没有完全覆盖可达查询,虽然索引规模小,但仍然无法避免查询过程中的图遍历操作.为此,进一步提出覆盖所有可达信息的双向路径标签索引,基于该索引,查询处理时可以完全避免图上的遍历操作.最后,基于多个真实数据集进行测试,实验结果从索引大小、索引构建时间和查询响应时间方面验证了所提方法相对现有方法具有索引规模小、索引时间短且查询响应快的优势.

关键词: 图数据管理, 有向图, 可达性查询处理, 标签约束可达性, 双向路径标签索引

Abstract: The label-constraint reachability query s→\-Lt is used to answer whether there is a directed path from s to t in the given graph, such that every label on the path belongs to L. Considering that existing approaches suffer from long index construct time, large index size and long query time, we first propose to construct a bidirectional path label index based on k nodes with large degrees, such that we can efficiently construct a smaller index, while at the same time covering a large portion of reachable information. For this large nodes based index, we propose several optimization techniques to reduce the index size, which can in turn speed up the processing of reachable queries. Further, even though the index size is small for large nodes based bidirectional path label, it does not cover all the reachable queries, and cannot avoid the graph traversal operation during query processing. To this end, we further propose a bidirectional path label index which is constructed based on all the nodes and therefore covers all the reachable information. Based on the bidirectional path label index, a label-constraint reachability query s→\-Lt can be answered by comparing the labels of the two query nodes s and t, and the traversal operation on the graph can be completely avoided during query processing. Finally, we conduct rich experiment study. In terms of index size, index construction time and query response time, the experimental results on multiple real datasets show that our approaches achieve smaller index size, and can construct the index and answer label-constraint reachability queries more efficiently compared with the state-of-the-art approaches.

Key words: graph data management, directed graph, reachability query processing, label-constraint reachability, bidirectional path label index

中图分类号: