高级检索

    空间数据库中全局最近邻查询处理方法

    All-Nearest-Neighbor Queries Processing in Spatial Databases

    • 摘要: 空间数据库中基于层次化索引结构的全局最近邻(all-nearest-neighbor, All-NN)计算采用单节点展开策略的嵌套循环技术来降低计算开销.在同一数据集合的全局最近邻计算中,基于索引结构带来的对象空间位置临近性特点,抛弃传统理论距离裁剪规则和嵌套循环技术来减少计算和索引节点访问开销.提出了采用局部计算和完备计算两阶段的计算模型来获得全局最近邻结果.首先以叶节点为单位,采用扫描线算法获得节点内部所有对象的局部最近邻结果,然后根据计算结果得到启发式裁剪距离.在第2阶段采用层次化过滤的范围查询算法来获取外部的(可能的)最近邻对象.实验与分析表明该方法可以很好地支持不同种类、大小、分布的数据集合All-NN查询处理,具有良好的实用价值.

       

      Abstract: All-nearest-neighbor(All-NN) query processing for R-tree-like hierarchical access methods in spatial database system follows a batch paradigm, based on indexed nested-loop for NN querying of all objects and the single-directional expansion strategy in tree traversal. The authors propose a method for All-NN queries by exploiting location proximity of objects, when the primary data set P and the reference data set R are referred to the same one (P=R). A heuristic distance metric is used to reduce the volume of search space, instead of conventional MINDIST and MINMAXDIST metrics. And hence, the number of retrieved index nodes is reduced considerably, as well as the CPU time. The All-NN processing consists of two steps. NN query is performed within each leaf node by using plane-sweeping algorithm, in which all objects, as well as their nearest neighbor candidates, reside in current leaf node and, then an enhanced range query is performed to identify nearest neighbor and discard false alarms by retrieving objects that locate in other leaf nodes, based on the distance between objects and its might-be nearest neighbor of the first step. Experimental study shows the superb of this method over other approaches(e.g., PaS, BNN), in terms of IO cost and CPU time, by avoiding indexed nested-loop traversal, MINDIST and MINMAXDIST computation, and the maintenance of priority heap overhead.

       

    /

    返回文章
    返回