高级检索

    VA-Trie: 一种用于近似k近邻查询的高维索引结构

    VA-Trie: A New and Efficient High Dimensional Index Structure for Approximate k Nearest Neighbor Query

    • 摘要: 近年来,随着多媒体信息检索技术的不断发展,如何实现高维特征矢量的快速相似性查询成为一个重要的研究课题.为此,人们提出了许多索引结构,包括:R-Tree及其变种、对矢量进行量化近似的VA-File、引入量化思想的A-Tree等等.从公开发表的成果看,这些索引结构在较低维数时,都能够表现出较好的查询性能;而当维数增加时,性能则急剧恶化.为了在更高维数下实现快速相似查询,可采用VA-File和A-Tree中的近似思想,并借助Trie结构来组织和管理压缩后的近似矢量,即所谓的VA-Trie.实验结果表明,在高达128维时VA-Trie仍有查询加速,其性能远好于A-Tree.

       

      Abstract: Since 1990's, great progress has been made in the area of content-based multimedia information retrieval. A very challenging problem emerged at the same time: how to organize high dimensional vectors so that efficient similarity query could be realized. Many index structures have been proposed to solve this problem, such as R-Tree and its variants, VA-File, A-Tree etc. From the published results, it can be concluded that most of these methods could achieve good query performance when the dimensionality is less than 20. However, the performance suffers greatly as the dimensionality increases. To obtain efficient similarity query in higher dimensional spaces, a new index structure called VA-Trie is introduced. The key idea behind VA-Trie is adopting the idea of quantization to compress the vectors and then employing the Trie structure to organize and manage the approximations. The experimental results show that VA-Trie outperforms A-Tree and sequential scan in high dimensional spaces.

       

    /

    返回文章
    返回