ISSN 1000-1239 CN 11-1777/TP


Using Pattern and Linguistic Features to Improve Reading Comprehension Performance

Du Yongping1, Huang Xuanjing2, and Wu Lide2   

  1. Institute of Computer Science, Beijing University of Technology, Beijing 100022
  2. Department of Computer Science and Engineering, Fudan University, Shanghai 200433
  • Online: 2008-02-15

Abstract: A reading comprehension (RC) system aims to understand a single document (i.e., a story or passage) so that it can automatically answer questions about it. RC resembles the ad hoc question answering (QA) task, which aims to extract an answer from a collection of documents in response to a question. However, since RC focuses on only a single document, the system needs to draw upon external knowledge sources to achieve deep analysis of passage sentences for answer sentence extraction. This paper proposes an approach to RC that uses external knowledge to improve performance, including (i) automatic acquisition of Web-based answer patterns and their application in answer sentence matching; (ii) linguistic feature matching; (iii) lexical semantic relation inference; and (iv) context assistance. The approach improves RC performance on both the Remedia and ChungHwa corpora, attaining HumSent accuracies of 45% and 69%, respectively. In particular, performance analysis on Remedia shows that a relative performance improvement of 24.1% is due to the application of Web-derived answer patterns and a further 11.1% is due to linguistic feature matching. Pairwise t-tests show that the performance improvements due to Web-derived answer patterns, linguistic feature matching, and lexical semantic relation inference are statistically significant.
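
As a rough illustration of answer sentence matching, the sketch below scores each passage sentence by its word overlap with the question plus a bonus for matching an answer pattern, then picks the top-scoring sentence. It is a minimal sketch under stated assumptions, not the authors' system: the regular expressions in ANSWER_PATTERNS, the stopword list, the pattern_weight value, and all function names are hypothetical, since the abstract does not give the Web-derived patterns or the scoring scheme. The humsent_accuracy helper assumes that HumSent accuracy is the fraction of questions for which the selected sentence is the human-marked answer sentence.

```python
import re

# Hypothetical answer patterns keyed by question word. The paper's actual
# Web-derived patterns are not listed in the abstract; these are stand-ins.
ANSWER_PATTERNS = {
    "when": [r"\bin \d{4}\b"],
    "who": [r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"],
}

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "in", "on", "of", "by", "it", "was", "is",
             "did", "do", "does", "who", "when", "what", "where", "why"}


def tokenize(text):
    """Lowercase the text and return its non-stopword tokens."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS]


def score_sentence(question, sentence, pattern_weight=2.0):
    """Word overlap with the question plus a weighted pattern-match bonus."""
    q_type = question.split()[0].lower()
    overlap = len(set(tokenize(question)) & set(tokenize(sentence)))
    pattern_hits = sum(1 for p in ANSWER_PATTERNS.get(q_type, [])
                       if re.search(p, sentence))
    return overlap + pattern_weight * pattern_hits


def select_answer_sentence(question, sentences):
    """Return the highest-scoring sentence (first one wins on ties)."""
    return max(sentences, key=lambda s: score_sentence(question, s))


def humsent_accuracy(predicted, gold):
    """Fraction of questions whose predicted sentence index equals the gold one."""
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


passage = [
    "The library opened in 1895.",
    "Many children visit the library every week.",
    "It was built by Andrew Carnegie.",
]
print(select_answer_sentence("When did the library open?", passage))
print(select_answer_sentence("Who built the library?", passage))
```

In this toy passage, plain word overlap cannot separate the first two sentences for the "when" question, because both mention the library; the pattern bonus is what breaks the tie. That is the kind of contribution the abstract attributes to answer patterns, although the real system combines them with linguistic features, lexical semantic inference, and context.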

Key words: pattern, reading comprehension, question answering, natural language processing