Abstract:
The effectiveness of information retrieval (IR) systems depends on the degree of term overlap between user queries and relevant documents. Query-document term mismatch, whether partial or total, is a problem that every IR system must handle, and query expansion (QE) is one method for doing so. Classical query expansion techniques such as local context analysis use term co-occurrence statistics to add contextual terms that improve passage retrieval. However, relevant contextual terms do not always co-occur frequently with the query terms, and terms that co-occur frequently are not always relevant. Such methods therefore often introduce noise, which reduces precision. Based on an analysis of how queries are produced, the authors propose a new query expansion method that uses both context and global information. Expansion terms are selected according to their relation to the whole query rather than to individual query terms, and the positional information between terms is also taken into account. Experimental results on TREC data show that the proposed method outperforms the language model without expansion by 5% to 19%. Compared with pseudo-relevance feedback, a popular query expansion approach, the method achieves competitive average precision.
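To make the idea of selecting expansion terms by their relation to the whole query concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a distance-weighted co-occurrence score (weight 1/distance within a fixed window) and combines the per-query-term scores with a smoothed sum of logarithms, so a candidate that is close to only one query term is penalized. The function name `expand_query` and the parameters `window`, `top_k`, and `delta` are hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def expand_query(query_terms, documents, window=5, top_k=3, delta=0.1):
    """Rank candidate expansion terms by proximity to ALL query terms.

    Hypothetical sketch: the scoring formula is an assumption, not the
    formula used in the paper.
    """
    query = set(query_terms)
    # proximity[c][q]: distance-weighted co-occurrence of candidate c
    # with query term q, accumulated over all documents.
    proximity = defaultdict(lambda: defaultdict(float))

    for tokens in documents:
        query_positions = [(i, t) for i, t in enumerate(tokens) if t in query]
        for j, cand in enumerate(tokens):
            if cand in query:
                continue
            for i, q in query_positions:
                dist = abs(i - j)
                if dist <= window:
                    # Closer terms contribute more (position information).
                    proximity[cand][q] += 1.0 / dist

    # Relation to the whole query: sum of smoothed logs over every query
    # term, so missing one query term costs log(delta).
    scores = {
        cand: sum(math.log(delta + per_q.get(q, 0.0)) for q in query)
        for cand, per_q in proximity.items()
    }
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_k]


docs = [
    "car engine repair manual".split(),
    "car repair shop prices".split(),
    "engine oil change".split(),
]
print(expand_query(["car", "repair"], docs, window=2, top_k=2))
# → ['engine', 'shop']
```

In this toy collection, "manual" sits right next to "repair" but is never within the window of "car", so it ranks below "shop", which is near both query terms. This is the behavior the whole-query criterion is meant to capture.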