Abstract:
Many real world datasets can be modeled as graphs, where nodes represent objects and edges indicate relationships between nodes. Today, large graphs are common in many domains, such as social networks and road networks. Graph aggregation is a new technique for representing a large-scale graph by a concise graph that can capture the structural and attributive information of the original large graph. Graph aggregation plays an important role in the management, analysis and visualization of graph data. However, there are very few research results in graph aggregation. Moreover, the existing results are far from systematic. The main problems are: 1)depending on specific applications; 2)only considering partial information of graph, such as structures or attributes of original graphs; 3)having strict constraints on users’ interactions and feedbacks. To this end, this paper proposes a new graph aggregation algorithm on directed graphs, which adopts a new quality measuring function for aggregation graphs, characterizing the diversity, coverage, conciseness and utility of aggregated graphs. The algorithm guarantees the quality of aggregation graphs by means of Locality Sensitive Hashing (LSH) according to the Jaccard similarity of node attributes and an entropy-based partitioning method. Experiments on real datasets demonstrate the effectiveness and efficiency of the algorithm.