Abstract:
Parameter settings in information retrieval (IR) systems greatly affect retrieval performance. These parameters are data-dependent and sensitive, which makes fixed experiential values unreliable. Moreover, supervised parameter learning approaches are not applicable, because relevance information is lacking at retrieval time. An automatic, unsupervised parameter learning mechanism is therefore necessary. In this paper, the effectiveness of traditional manual parameter setting with fixed experiential values is studied first; the results indicate that this approach is neither feasible nor reliable for wide practical use. Then, a dynamic parameter learning approach based on a genetic algorithm (GA) is proposed. Experiments were conducted on the Okapi system using the large-scale TREC11, TREC10, and TREC9 web track collections, each of which is larger than 10 GB. The results show that, with dynamic parameter learning, the system always attains or approaches the best retrieval performance.
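
To make the idea of GA-based parameter learning concrete, the following is a minimal sketch of a generic genetic search over real-valued parameters. It is not the method of this paper. The function `genetic_search`, its defaults, and the operators (tournament selection, uniform crossover, Gaussian mutation, elitism) are illustrative assumptions. The abstract does not say which Okapi parameters are tuned or which unsupervised fitness measure is used, so the example assumes two parameters named `k1` and `b` and substitutes a hypothetical toy fitness function.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=40,
                   mutation_rate=0.2, seed=0):
    """Maximize fitness(params) over box-bounded real parameters with a simple GA."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def mutate(ind):
        # Perturb each gene with probability mutation_rate, then clip to bounds.
        out = []
        for value, (lo, hi) in zip(ind, bounds):
            if rng.random() < mutation_rate:
                value += rng.gauss(0.0, 0.1 * (hi - lo))
            out.append(min(hi, max(lo, value)))
        return out

    def crossover(a, b):
        # Uniform crossover: each gene is taken from one of the two parents.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    def tournament(scored):
        # Return the fitter of two randomly chosen individuals.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    population = [random_individual() for _ in range(pop_size)]
    best_score, best = float("-inf"), None
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        top_score, top = max(scored, key=lambda s: s[0])
        if top_score > best_score:
            best_score, best = top_score, top
        # Elitism: the best individual seen so far survives unchanged.
        population = [best] + [
            mutate(crossover(tournament(scored), tournament(scored)))
            for _ in range(pop_size - 1)
        ]
    return best, best_score


def toy_fitness(params):
    # Hypothetical stand-in for an unsupervised quality measure; it peaks at
    # k1 = 1.2, b = 0.75 purely for illustration.
    k1, b = params
    return -((k1 - 1.2) ** 2 + (b - 0.75) ** 2)


# Assumed search ranges for the two illustrative parameters.
best, score = genetic_search(toy_fitness, bounds=[(0.0, 3.0), (0.0, 1.0)])
print(best, score)
```

In a real setting, `toy_fitness` would be replaced by a measure that can be computed without relevance judgments, since the approach is unsupervised; the choice of that measure is left open here.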