ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2018, Vol. 55, Issue (5): 958-967. doi: 10.7544/issn1000-1239.2018.20170232

Rough Set Knowledge Discovery Based Open Domain Chinese Question Answering Retrieval

Han Zhao1,2,3, Miao Duoqian1,2, Ren Fuji3, Zhang Hongyun1,2

  1(College of Electronic and Information Engineering, Tongji University, Shanghai 201804); 2(Key Laboratory of Embedded System and Service Computing (Tongji University), Ministry of Education, Shanghai 201804); 3(The Faculty of Engineering, Tokushima University, Tokushima, Japan 7708506)
  • Online: 2018-05-01

Abstract: Information retrieval (IR) based open domain question answering (QA) systems typically first use semantic tools and a knowledge base to obtain semantic and knowledge information, and then calculate a matching value from both. However, in some practical applications of Chinese question answering, the uncertainty of both Chinese language representation and Chinese knowledge representation makes current methods not very effective. To solve this problem, this paper proposes a Chinese question answering method based on rough set knowledge discovery. The method uses rough set equivalence partitioning to represent the rough set knowledge of the QA pairs, and then uses the idea of attribute reduction to mine the upper approximation representations of all the knowledge items. Based on the resulting rough set QA knowledge base, the knowledge matching value of a QA pair is calculated as a knowledge item similarity. Once the knowledge similarities between a question and all of its answer candidates have been computed, a final matching value that combines rough set knowledge similarity with traditional sentence similarity is used to rank the answer candidates. Experiments show that the proposed method improves mean average precision (MAP) and mean reciprocal rank (MRR) over the baseline information retrieval methods.
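
The rough set notions named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the attribute names (q_type, has_entity), the toy QA pairs, and the linear weighting in combined_score are all assumptions made for illustration, since the abstract does not specify the features, the reduction procedure, or the combination formula. The sketch shows equivalence partitioning, lower and upper approximations, a hypothetical score combination, and the per-question quantities that MRR and MAP average over.

```python
from collections import defaultdict


def equivalence_classes(objects, attributes):
    """Partition objects into classes that are indiscernible on `attributes`."""
    classes = defaultdict(set)
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(name)
    return list(classes.values())


def lower_approximation(classes, target):
    """Union of the classes that lie entirely inside the target set."""
    result = set()
    for cls in classes:
        if cls <= target:
            result |= cls
    return result


def upper_approximation(classes, target):
    """Union of the classes that overlap the target set at all."""
    result = set()
    for cls in classes:
        if cls & target:
            result |= cls
    return result


def combined_score(knowledge_sim, sentence_sim, weight=0.5):
    """Hypothetical linear mix; the paper's actual formula is not given here."""
    return weight * knowledge_sim + (1.0 - weight) * sentence_sim


def reciprocal_rank(ranked_relevance):
    """1 / rank of the first relevant candidate; MRR is the mean over questions."""
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_relevance):
    """Mean precision at each relevant rank; MAP is the mean over questions."""
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Toy QA pairs described by two assumed attributes (illustrative only).
qa_pairs = {
    "qa1": {"q_type": "who", "has_entity": 1},
    "qa2": {"q_type": "who", "has_entity": 1},
    "qa3": {"q_type": "where", "has_entity": 1},
    "qa4": {"q_type": "where", "has_entity": 0},
}
correct_pairs = {"qa1", "qa3"}  # assumed target concept

classes = equivalence_classes(qa_pairs, ["q_type", "has_entity"])
lower = lower_approximation(classes, correct_pairs)
upper = upper_approximation(classes, correct_pairs)
```

In this toy data, qa1 and qa2 are indiscernible on the chosen attributes, so qa1 cannot be placed in the lower approximation with certainty, while both fall in the upper approximation. That gap between the two approximations is how rough sets represent the kind of uncertainty the abstract refers to.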

Key words: question answering (QA) system, information retrieval (IR), rough set, knowledge discovery, text mining
