Abstract:
Recently, a large amount of naturally graph-structured data are generated in various fields, such as social network, bioinformatics, software engineering, knowledge engineering, and etc. Consequently, querying, searching and mining such “graph data” have become the popular research topics. However, due to the extremely high computational complexity, existing keyword search approaches for graph data do not scale well on large graphs. This paper is motivated by a novel study of the user information needs of graph search. We discuss some possible types of graph search and their potentials of being optimized, and propose an idea that we should adopt customized optimization strategies according to the features of different types of graph search. Based on that, we propose an effective heuristic optimization approach for an important and usual particular type of search called “known-item search”. We construct an index to capture the local topology information in the graph. For large-scale graph data, we can use the MapReduce technique to handle the indexing task. By using the index, we can prune matched vertices before the search, and thereby significantly reduce the search space with a little possible loss of the top-k answers as trade-offs. Our experiments testify that the approach can decrease the response time of the known-item search dramatically.