Wang Yiting, Lan Yanyan, Pang Liang, Guo Jiafeng, Cheng Xueqi. Unbiased Learning to Rank Based on Relevance Correction[J]. Journal of Computer Research and Development, 2022, 59(12): 2867-2877. DOI: 10.7544/issn1000-1239.20210865
1(CAS Key Laboratory of Network Data Science and Technology (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190)
2(University of Chinese Academy of Sciences, Beijing 100049)
3(Institute for AI Industry Research, Tsinghua University, Beijing 100084)
4(Data Intelligence System Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
Funds: This work was supported by the National Key Research and Development Program of China (2020AAA0105200) and the National Natural Science Foundation of China (61773362, 61906180).
Compared with human-annotated relevance labels, user click data are easy to obtain and better reflect user preferences. Using clicks as training labels reduces annotation cost and allows ranking models to be updated in real time. However, raw clicks are biased and noisy, so an effective method of unbiased learning to rank is needed. To address the problem that the dual learning algorithm reaches sub-optimal solutions and thus cannot eliminate the bias completely, we propose a new method of unbiased learning to rank based on relevance correction. First, we train the ranking model on an existing small set of query-document pairs with relevance labels, and then use it to obtain more accurate predictions of the relevance scores. Second, the click data and the predicted relevance scores are used to train the propensity model. Finally, we take the parameter values of the obtained model as the initial values of the dual learning process, and then jointly train the models with user clicks. The proposed method does not affect online computation speed and can be used in online learning scenarios. Experiments under different degrees of click bias and in real click scenarios show that the proposed method improves the performance of the existing method.
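The second step, fitting propensities from clicks and predicted relevance scores, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a position-based click model, in which the click probability at a position is the product of an examination propensity and the document's relevance, and it replaces the trained propensity model with a simple closed-form ratio estimate. The function names `estimate_propensity` and `simulate_clicks`, the synthetic data, and the 1/k propensity curve are all hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_propensity(clicks, rel_pred, eps=1e-8):
    """Per-position propensity estimate (illustrative assumption).

    Under the assumed model P(click at k) = propensity[k] * relevance,
    so total clicks at position k divided by total predicted relevance
    at position k approximates propensity[k].
    Both inputs have shape (n_sessions, n_positions).
    """
    clicks = np.asarray(clicks, dtype=float)
    rel_pred = np.asarray(rel_pred, dtype=float)
    return clicks.sum(axis=0) / (rel_pred.sum(axis=0) + eps)


def simulate_clicks(rel, propensity, rng):
    """Draw synthetic clicks from the assumed position-based model."""
    p_click = rel * propensity  # propensity broadcasts over sessions
    return (rng.random(rel.shape) < p_click).astype(float)


rng = np.random.default_rng(0)
n_sessions, n_positions = 20000, 5

# Hypothetical ground truth: examination probability decays as 1/k.
true_propensity = 1.0 / np.arange(1, n_positions + 1)

# Stand-in for the relevance scores predicted by the ranker of step one.
rel_pred = rng.uniform(0.2, 0.9, size=(n_sessions, n_positions))

clicks = simulate_clicks(rel_pred, true_propensity, rng)
est_propensity = estimate_propensity(clicks, rel_pred)

# Inverse propensity weights that could seed the joint training step.
ipw = 1.0 / np.clip(est_propensity, 1e-3, None)

print(np.round(est_propensity, 3))
```

In this sketch the predicted relevance is exact by construction, so the estimate tracks the true propensities closely. With a ranker trained on only a small labeled set, the estimate would be approximate, which is consistent with the abstract using it to initialize the dual learning process before joint training on clicks rather than as a final answer.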