ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 ›› 2019, Vol. 56 ›› Issue (6): 1338-1355.doi: 10.7544/issn1000-1239.2019.20180371

• 综述 • 上一篇    



  1. 1(宁波大学信息科学与工程学院 浙江宁波 315211);2(百度在线网络技术有限公司 北京 100085) (
  • 出版日期: 2019-06-01
  • 基金资助: 

Progress and Challenges of Graph Summarization Techniques

Wang Xiong1, Dong Yihong1, Shi Weijie1, Pan Jianfei1,2   

  1. 1(Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, Zhejiang 315211);2(Baidu Online Network Technology Company, Beijing 100085)
  • Online: 2019-06-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61572266), the Natural Science Foundation of Zhejiang Province of China (LY16F020003), and the Natural Science Foundation of Ningbo City of China (2017A610114).

摘要: 图的概要化,简称图概要,旨在寻找一组简洁的超图或稀疏图,阐明原始图的主要结构信息或变化趋势.当前图概要的研究大多结合原始图的应用领域和背景,使用不同的概要技术构建一个特定的概要图,解决目前大图面临的信息过载、查询优化、空间压缩、影响分析、社交网络可视化等问题.对现有的图概要技术进行了汇总,以概要主要目的作为分类标准划分为基于空间压缩的图概要、基于查询优化的图概要、基于模式可视化的图概要和基于影响分析的图概要四大类,针对部分属性图和无属性图概要算法在真实数据集上进行了相关实验,并从压缩率、信息保持率、信息熵和时间进行对比分析.点明图概要的发展趋势,并指出图概要面临的挑战和可深入探索的研究方向,结合热门的深度学习技术提出了部分有价值的的宏观想法用以解决当前挑战.

关键词: 综述, 图概要, 图聚集, 图概化, 图压缩, 可视化

Abstract: Graph summarization aims to search a group of simple hypergraphs or sparse graphs, which illustrate the main structural information or change trend of the original graph. Based on the application field and background of original graph, different graph summarization techniques are used to construct a specific summary graph, which can solve the problems of information overload, query optimization, spatial compression, impact analysis, social network visualization and so on. According to the classification criteria of the main purpose of the summary, the existing graph summarization techniques are divided into four categories: the graph summarization based on spatial compression, the graph summarization based on query optimization, the graph summarization based on pattern visualization and the graph summarization based on impact analysis. The partial graph summarization algorithms of non-attribute graphs and attribute graphs are tested on real data sets to analyze the indexes of information retention rate, compression rate, information entropy and running time experimentally. At last, not only the development trends of the graph summarization are highlighted, but also the challenges and the future research directions that can be explored in depth are pointed out. Combining with the popular deep learning technology, some valuable and potential Macro coutermeasures are put forward to solve these challenges.

Key words: survey, graph summarization, graph aggregation, graph synopsis, graph compression, visualization