Abstract:
Graph data is ubiquitous in various data applications, such as chemical compounds, proteins, and social network. Effective subgraph containment query processing on large graph databases is one of the most challenging issues. The “filtering-verification” mechanism is widely used for processing subgraph containment queries. Firstly, it constructs feature-based index structures; then filters out a small set of candidates from the database with the help of indices; finally, a verification procedure is conducted on each candidate to obtain final results. The “filtering” phase is critical to getting as few candidates as possible which leads to better performance; while the “verification” phase is quite simple, there is no room to improve the overall performance by optimization in this phase. “Selection-verification-filtering”, a novel iterative three-phase subgraph containment query processing strategy is proposed, which processes the queries iteratively by utilizing the information in the “verification” phase and the graph similarity mapping relationships. Firstly, it selects one graph from the database for subgraph isomorphism verification with the query graph. If the verification fulfills, the final results are obtained directly. Otherwise, the search space of next selection is narrowed. Then, a graph selection method based on search space prediction is introduced to improve the probability of successful verification. Extensive experimental results show that the time complexity of the proposed algorithm outperforms the “filtering-verification” mechanism significantly.