    Selection-Verification-Filtering: An Iterative Subgraph Containment Query Processing Strategy

    • Abstract: Graph models are widely used in applications such as bioinformatics, computational chemistry, the Semantic Web, and social networks, and efficient subgraph containment query processing on large graph databases is one of the most challenging problems in this area. The “filtering-verification” mechanism is the prevailing approach: it first builds an index over features of the data graphs, then uses the index to produce a candidate set, and finally runs a subgraph isomorphism test on every graph in the candidate set. Such algorithms concentrate on the “filtering” phase, trying to prune as many graphs as possible, while the “verification” phase merely tests each candidate and leaves no room for further optimization of query performance. This paper proposes “selection-verification-filtering”, a new iterative strategy for subgraph containment queries that exploits the information obtained during subgraph isomorphism verification together with the subgraph mapping relationships among the graphs in the database. In each iteration, a graph is selected from the database and verified against the query graph. If verification succeeds, query results are obtained directly; if it fails, the search space for the next iteration is narrowed. To raise the probability of successful verification, a graph selection method based on search space prediction is introduced. Extensive experiments show that the proposed algorithm achieves significantly better query processing performance than the “filtering-verification” mechanism.
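The iterative loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact pruning rules or the search space prediction model, so the sketch assumes the simplest rules that follow from the transitivity of subgraph containment. If the query q is contained in a selected graph g, every known supergraph of g also contains q; if q is not contained in g, no known subgraph of g can contain q. The names `make_graph`, `is_subgraph`, `build_relations`, and `svf_query` are hypothetical, the isomorphism test is a brute-force stand-in suitable only for tiny graphs without self-loops, and the selection heuristic (pick the graph with the most related graphs still unresolved) is a placeholder for the paper's prediction-based strategy.

```python
from itertools import permutations

def make_graph(labels, edges):
    """Vertex-labeled undirected simple graph (no self-loops assumed)."""
    return {"labels": list(labels), "edges": {frozenset(e) for e in edges}}

def is_subgraph(q, g):
    """Brute-force, non-induced subgraph isomorphism test: is q contained in g?"""
    nq, ng = len(q["labels"]), len(g["labels"])
    if nq > ng:
        return False
    for image in permutations(range(ng), nq):
        # Vertex labels must match under the candidate mapping.
        if any(q["labels"][i] != g["labels"][image[i]] for i in range(nq)):
            continue
        # Every query edge must map onto an edge of g.
        if all(frozenset((image[u], image[v])) in g["edges"]
               for u, v in (tuple(e) for e in q["edges"])):
            return True
    return False

def build_relations(db):
    """Offline step (assumed): containment relations among database graphs."""
    supers = {i: set() for i in db}  # supers[i]: graphs that contain db[i]
    subs = {i: set() for i in db}    # subs[i]: graphs contained in db[i]
    for i in db:
        for j in db:
            if i != j and is_subgraph(db[i], db[j]):
                supers[i].add(j)
                subs[j].add(i)
    return supers, subs

def svf_query(q, db, supers, subs):
    """Selection-verification-filtering loop under the assumptions above."""
    remaining, answers, verifications = set(db), set(), 0
    while remaining:
        # Selection: placeholder heuristic, not the paper's prediction model.
        g = max(sorted(remaining),
                key=lambda i: len((supers[i] | subs[i]) & remaining))
        remaining.discard(g)
        # Verification: one subgraph isomorphism test per iteration.
        verifications += 1
        if is_subgraph(q, db[g]):
            # Success: g and all its known supergraphs are answers.
            resolved = supers[g] & remaining
            answers |= {g} | resolved
        else:
            # Failure: known subgraphs of g cannot contain q either.
            resolved = subs[g] & remaining
        # Filtering: shrink the search space for the next iteration.
        remaining -= resolved
    return answers, verifications

db = {
    1: make_graph("AA", [(0, 1)]),
    2: make_graph("AAA", [(0, 1), (1, 2)]),
    3: make_graph("AAA", [(0, 1), (1, 2), (0, 2)]),
    4: make_graph("AB", [(0, 1)]),
}
q = make_graph("AAA", [(0, 1), (1, 2)])
supers, subs = build_relations(db)
answers, verifications = svf_query(q, db, supers, subs)
print(sorted(answers), verifications)
```

In this toy database the three-vertex path is contained in the triangle, so once the path is verified against the query the triangle is accepted without its own isomorphism test. The relation table is computed once, offline; how the paper builds and stores those relationships is not stated in the abstract.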

       
