高级检索

    RB树:一种支持空间近似关键字查询的外存索引

    An Index Supporting Spatial Approximate Keyword Search on Disks

    • 摘要: 空间近似关键字查询包含一个空间条件和一组关键字相似性条件,这种查询在空间数据库中返回同时满足以下条件的对象:1)对象的位置信息满足查询中的空间条件;2)对于查询中的任何一个关键字,对象中至少包含一个关键字与其相似度大于给定阈值.随着当前数据的爆炸性增长,空间数据库无法完整地存放在内存中,因此空间数据库需要支持空间近似关键字查询的外存索引.目前,还没有在外存中支持精确的空间近似关键字查询的索引结构.设计了一种新型的外存索引RB树,在外存中支持精确的空间近似关键字查询.RB树支持的空间近似关键字查询包括多种空间条件,如范围查询、NN查询,同时支持多种关键字相似性度量,包括编辑距离、规范化编辑距离等.通过真实数据中的性能测试验证了RB树的效率.

       

      Abstract: Spatial approximate keyword queries consist of a spatial condition and a set of keywords as the textual condition. The spatial condition requires that the returned objects are inside a spatial region or nearby a location, and the textual condition requires that the returned objects are labeled with a set of keywords similar to the queried keywords. Such queries enable users to find objects they are interested in within a spatial database, and make mismatches between users’ query keywords and spatial object keywords tolerant. With the rapid growth of data, spatial databases storing objects from diverse geographical regions can be no longer held in memories. Thus, it is essential to answer spatial approximate keyword queries over disk resident datasets. Existing works present methods either that return incomplete answers or index in memory, and effective solutions in disks are in demand. This paper presents a novel disk resident index RB-tree to support spatial approximate keyword queries. We study the principle of augmenting R-tree with the capacity of approximate keyword searching based on existing solutions, and store two kinds of bitmaps in R-tree nodes to build an RB-tree. RB-tree supports a wide range of spatial conditions such as range and nearest neighbor, combined with keyword similarity metrics such as edit distance, dice etc. Experimental results against R-tree on two real world datasets demonstrate the efficiency of our solution.

       

    /

    返回文章
    返回