Abstract:
A reading comprehension (RC) system aims to understand a single document (i.e., a story or passage) so that it can automatically answer questions about it. RC resembles the ad hoc question answering (QA) task, which aims to extract an answer from a collection of documents in response to a question. However, since RC focuses on only a single document, the system needs to draw upon external knowledge sources to analyze passage sentences deeply enough for answer sentence extraction. This paper proposes an approach to RC that utilizes external knowledge to improve performance, comprising (i) automatic acquisition of Web-based answer patterns and their application in answer sentence matching; (ii) linguistic feature matching; (iii) lexical semantic relation inference; and (iv) context assistance. This approach improves RC performance on both the Remedia and ChungHwa corpora, attaining HumSent accuracies of 45% and 69%, respectively. In particular, performance analysis based on Remedia shows that a relative performance improvement of 24.1% is due to the application of Web-derived answer patterns and a further 11.1% is due to linguistic feature matching. Pairwise t-tests are also conducted, and the results show that the performance improvements due to Web-derived answer patterns, linguistic feature matching, and lexical semantic relation inference are statistically significant.
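
To make the idea of answer sentence matching concrete, the following is a minimal sketch, not the system described in this paper. It assumes a simple scoring scheme in which each passage sentence receives one point per content word shared with the question, plus a fixed bonus when it matches an answer pattern. The function names, the stopword list, the bonus weight, the regular-expression pattern, and the toy passage are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "is", "was",
             "did", "does", "do", "who", "what", "when", "where", "why"}


def tokens(text):
    """Return the set of lowercase word tokens in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}


def score_sentence(question, sentence, patterns, pattern_bonus=2):
    """Word-overlap score plus a fixed bonus if any answer pattern matches."""
    overlap = len(tokens(question) & tokens(sentence))
    matched = any(re.search(p, sentence, re.IGNORECASE) for p in patterns)
    return overlap + (pattern_bonus if matched else 0)


def best_sentence(question, passage_sentences, patterns):
    """Pick the highest-scoring passage sentence as the answer sentence."""
    return max(passage_sentences,
               key=lambda s: score_sentence(question, s, patterns))


# Toy passage and question (illustrative only, not taken from either corpus).
passage = [
    "Christopher Robin lives in England.",
    "He was born in 1920.",
    "His father wrote many books.",
]
question = "When was Christopher Robin born?"

# A hand-written stand-in for an automatically acquired "when" answer pattern.
when_patterns = [r"\bborn in \d{4}\b"]

answer = best_sentence(question, passage, when_patterns)
print(answer)
```

In this toy case, word overlap alone favors the first sentence because it repeats the name from the question, while the pattern bonus shifts the choice to the sentence that actually contains the date. This is the kind of effect that answer patterns are intended to provide.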