Exploration of Weighted Proximity Measure in Information Retrieval

Xue Yuanhai; Yu Xiaoming; Liu Yue; Guan Feng; Cheng Xueqi

doi:10.7544/issn1000-1239.2014.20130339

Xue Yuanhai, Yu Xiaoming, Liu Yue, Guan Feng, Cheng Xueqi. Exploration of Weighted Proximity Measure in Information Retrieval[J]. Journal of Computer Research and Development, 2014, 51(10): 2216-2224. DOI: 10.7544/issn1000-1239.2014.20130339

Abstract

A key problem in information retrieval is to provide information seekers with relevant, accurate, and even complete information. Many traditional information retrieval models rest on the bag-of-words assumption and ignore the implied associations among query terms. Although term proximity has been widely used to boost the performance of classical information retrieval models, most of those efforts do not fully account for the differing importance of the query terms. In modern information retrieval, query terms are not only dependent on each other but also differ in importance. Computing term proximity in a way that takes these differences in importance into account should therefore help improve retrieval performance. To this end, a weighted term proximity measure is introduced that distinguishes the significance of the query terms based on the collections to be searched. A weighted proximity BM25 model (WP-BM25), which integrates this measure into the Okapi BM25 model, is proposed to rank the retrieved documents. Extensive experiments are conducted on three standard TREC collections: FR88-89, WT2G, and WT10G. The results show that WP-BM25 significantly improves retrieval performance and is robust.
