Abstract:
This paper presents a novel topic-based language model for sentence retrieval in Chinese question answering. The main idea is to exploit a characteristic specific to the question answering scenario, namely the semantic category of the expected answer, to perform topic segmentation, and then to incorporate the topic information of each sentence into the standard language model. Two approaches to topic segmentation are presented: one-sentence-one-topic and one-sentence-multi-topics. Experimental results show that the proposed topic-based language model significantly improves sentence retrieval performance.