ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 ›› 2015, Vol. 52 ›› Issue (1): 116-129.doi: 10.7544/issn1000-1239.2015.20131567

• 软件技术 • 上一篇    下一篇

大规模图数据可达性索引技术:现状与展望

富丽贞1,2,孟小峰2   

  1. 1(中北大学软件学院 太原 030051); 2(中国人民大学信息学院 北京 100872) (fulizhen303@163.com)
  • 出版日期: 2015-01-01
  • 基金资助: 
    基金项目:国家自然科学基金项目(61379050, 91224008)|国家“八六三”高技术研究发展计划基金项目(2013AA013204)|高等学校博士学科点专项科研基金项目(20130004130001)

Reachability Indexing for Large-Scale Graphs: Studies and Forecasts

Fu Lizhen1,2,Meng Xiaofeng2   

  1. 1(School of Software, North University of China, Taiyuan 030051); 2(School of Information, Renmin University of China, Beijing 100872)
  • Online: 2015-01-01

摘要: 随着社交网络、生物信息网、本体等新兴领域的飞速发展,在现实应用中涌现出大量的图数据.可达性查询是有向图上一类最基本的查询.当图的规模非常小时,利用深度优先遍历(depth-first search, DFS)或可达性传递闭包可以很容易处理可达性查询.但是,随着图的规模越变越大,由于DFS方法的查询效率太低而可达性传递闭包方法占用的存储空间太大,这2种方法不再适用.因此,许多可达性索引方法相继被提出.这些方法已经被广泛应用于多个计算机科学领域,如软件工程、 编程语言、分布式计算、社交网络分析、 生物网络分析、XML和RDF数据库、路由规划等领域.此外,可达性索引还可用于加速其他图算法,如最短路径查询和子图模式匹配.首先介绍了可达性索引的应用背景.接着,依据支持的数据规模、数据类型以及查询类别,将现有可达性索引工作进行了分类,并对代表性工作进行分类比较;最后,讨论了现有的大规模图数据可达性索引方法存在的问题,并指出了未来的研究方向.

关键词: 可达性, 索引, 查询处理, 编码, 图数据

Abstract: With the rapid development of social networks, biology, ontology and other emerging areas, there are a lot of graph data in real applications. Reachability query is one of the fundamental queries in graphs. For small graphs, depth-first search (DFS) or reachability transitive closure can answer reachability query easily. However, as graphs become larger and larger, above two approaches are no longer feasible, because the query time of DFS is very long and the storage cost of reachability transitive closure is very high. Therefore, lots of reachability indices have been proposed. These approaches have been widely applied in many areas of computer science, such as software engineering, programming languages, distributed computing, social network analysis, biological network analysis, XML database, RDF databases, routing planning and other fields. In addition, reachability indices can also speed up other graph algorithms, such as the shortest path queries and sub-graph matching. In this survey, applications of reachability indexing schemes are introduced firstly. Secondly, according to supported graph size, supported graph type and supported query type, we provide taxonomy of existing works. Then, we provide a comparison of representative works. Finally, we discuss the disadvantages of existing scalable reachability indexing schemes and put forward some suggestions for future research works.

Key words: reachability, indexing, query processing, labeling, graph data

中图分类号: