
    M²⁺-tree: An Efficient Index Supporting Multi-Metric-Space Retrieval of Medical Cases

    M²⁺-Tree: Processing Multiple Metric Space Queries of Medical Cases Efficiently with Just One Index

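    • Abstract: Similarity retrieval from a case base determines whether doctors receive sufficient and correct candidate cases, so retrieving similar medical imaging cases efficiently and accurately is an active research topic in both academia and medicine. To date, many papers have proposed retrieval strategies that improve query precision, but few address retrieval efficiency. To fill this gap, this paper proposes the M²⁺-tree, a high-dimensional index that integrates similarity computations over multiple metric spaces. The index combines the text and images of a case into one high-dimensional multi-feature vector, partitions the data space into several subspaces in metric space, and then uses a key vector to partition each resulting subspace a second time in vector space. The non-overlapping partition by key vector and the triangle-inequality filtering principle speed up case retrieval. In short, the two partitions, one in metric space and one in vector space, let the M²⁺-tree greatly reduce the number of unnecessary similarity computations between the query case and the database cases, which speeds up the retrieval of similar cases. Experimental results show that the M²⁺-tree outperforms the M²-tree, a representative multi-feature index for metric spaces.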

       

      Abstract: Similarity retrieval of medical cases determines whether doctors are given accurate and sufficient candidate cases, so processing it efficiently and effectively has become an active research topic in both the academic community and medical science. Although many retrieval strategies for improving query precision have been proposed, few of them address retrieval efficiency. Motivated by this, an index called the M²⁺-tree is proposed in this paper. The M²⁺-tree combines, within a single index structure, information from multiple metric spaces, such as text features from diagnostic reports and physical features from medical images. The M²⁺-tree divides the data space of medical cases into multiple subspaces based on metric space, and each subspace is further divided into left and right twin parts based on a key vector. It then exploits the non-overlapping division over the key vector and the filtering principle of the triangle inequality to speed up similarity search of medical cases. In short, by using two kinds of division, one over metric space and one over vector space, many unnecessary similarity calculations are avoided, which improves the retrieval efficiency of medical cases dramatically. Experimental results show that the search performance of the M²⁺-tree is better than that of the M²-tree, a typical multi-feature index.
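      The triangle-inequality filtering mentioned in the abstract can be illustrated with a minimal sketch. This is not the M²⁺-tree itself: it assumes a single reference object (here called `pivot`), Euclidean distance, and a flat list of points instead of a tree, and the names `euclidean`, `range_query`, and `pivot_dists` are hypothetical. It only shows how precomputed distances to a reference object let a range query skip distance computations that cannot produce a match.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(query, radius, pivot, points, pivot_dists, dist=euclidean):
    """Return (matches, number of distance computations performed).

    pivot_dists[i] is the precomputed distance from points[i] to the pivot.
    By the triangle inequality, |d(q, pivot) - d(p, pivot)| <= d(q, p),
    so if that lower bound already exceeds the radius, p cannot match
    and d(q, p) never needs to be computed.
    """
    d_query_pivot = dist(query, pivot)
    computed = 1  # the query-to-pivot distance
    matches = []
    for point, d_point_pivot in zip(points, pivot_dists):
        if abs(d_query_pivot - d_point_pivot) > radius:
            continue  # pruned without computing d(query, point)
        computed += 1
        if dist(query, point) <= radius:
            matches.append(point)
    return matches, computed


# Toy data (hypothetical): four 2-D feature vectors and one pivot.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 9.0)]
pivot = (0.0, 0.0)
pivot_dists = [euclidean(p, pivot) for p in points]

matches, computed = range_query((1.0, 0.0), 1.5, pivot, points, pivot_dists)
print(matches, computed)
```

      In this toy run the two distant points are pruned by the lower bound alone, so only three distances are computed instead of five. A tree-structured index applies the same bound to whole subtrees rather than to individual points.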

       
