Abstract:
Efficient and effective similarity retrieval of medical cases determines whether doctors can be given accurate and sufficient candidate cases, and it has become an active research topic in both academia and medical science. Although many retrieval strategies have been proposed to improve query precision, few of them address retrieval efficiency. Motivated by this, an index, the M²⁺-tree, is proposed in this paper. The M²⁺-tree combines, within a single index structure, information from multiple metric spaces, such as text features from diagnostic reports and physical features from medical images. It divides the data space of medical cases into multiple sub-spaces based on the metric space, and each sub-space is further divided into left and right twin parts based on a key vector. It then exploits both the non-overlapping divisions over the key vector and the filtering principle of the triangle inequality to speed up similarity search of medical cases. By using these two kinds of division, one over the metric space and one over the vector space, many unnecessary similarity calculations are avoided, which considerably improves the retrieval efficiency of medical cases. Experimental results show that the search performance of the M²⁺-tree is better than that of the M²-tree, a typical multi-feature index.
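The filtering principle of the triangle inequality mentioned above can be illustrated with a minimal sketch. It is not the M²⁺-tree itself: it assumes a single metric space, one hypothetical pivot object, and Euclidean distance, and it does not model the key-vector division. The names `range_search`, `pivot`, and `pivot_dists` are introduced here for illustration only. For a query q, a pivot p, and a stored object o, the bound d(q, o) >= |d(q, p) - d(p, o)| allows o to be discarded without computing d(q, o) whenever the right-hand side exceeds the search radius.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_search(query, radius, pivot, objects, pivot_dists, dist=euclidean):
    """Range query with pivot-based filtering (illustrative assumption).

    pivot_dists[i] holds the precomputed distance d(pivot, objects[i]).
    Returns the matching objects and the number of query-to-object
    distance computations that were actually performed.
    """
    d_qp = dist(query, pivot)
    matches = []
    computed = 0
    for obj, d_po in zip(objects, pivot_dists):
        # Triangle inequality: d(q, o) >= |d(q, p) - d(p, o)|.
        # If this lower bound exceeds the radius, o cannot qualify.
        if abs(d_qp - d_po) > radius:
            continue
        computed += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, computed


# Toy data standing in for feature vectors of medical cases.
objects = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 9.0)]
pivot = (0.0, 0.0)
pivot_dists = [euclidean(pivot, o) for o in objects]

matches, computed = range_search((1.0, 1.0), 1.5, pivot, objects, pivot_dists)
print(matches, computed)
```

In this toy run, the two distant objects are rejected using only the precomputed pivot distances, so only two of the four query-to-object distances are evaluated. The filter never discards a true match, because it relies only on a lower bound of the real distance.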