Abstract:
Current question-answering (Q&A) systems such as Baidu Zhidao and SoSo WenWen can find questions that are semantically relevant to most queries. For questions with a time constraint, however, the quality of the search results is much worse than for queries without such a constraint. To solve this problem, an automatic recognition and retrieval method for time-sensitive questions is proposed. First, time-sensitive questions are recognized using classification algorithms; next, the time range of each time-sensitive question is resolved; finally, the question search results are filtered by the resolved time range. To recognize time-sensitive questions, lexical, syntactic, and semantic features are extracted; machine learning methods including decision trees, naive Bayes, and SVM are employed; and the AdaBoost algorithm is adopted to address the corpus imbalance issue. A resolution method is proposed to calculate the time range of a question. On this basis, a prototype question retrieval system, built from question-answer pairs in the financial domain collected from the Web, is used for validation. Experimental results show that, with the C5.0 decision tree algorithm, the precision of time-sensitive question recognition reaches 0.901; the mean average precision (MAP) of the retrieval results for time-sensitive questions improves by 0.0392 over SoSo WenWen and by 0.1956 over Baidu Zhidao, increases of 74.24% and 197.58% respectively. The average response time of the prototype system is 0.6287 s.
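
The three-stage flow described above (recognize, resolve, filter) can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the cue-word regular expression stands in for the trained classifier, the time-range rules are simplified assumptions, and all names (`QAPair`, `is_time_sensitive`, `resolve_time_range`, `filter_by_time`, `retrieve`) are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class QAPair:
    question: str
    answer: str
    posted: date  # date the Q&A pair was published


# Hypothetical cue list; the paper instead trains classifiers on
# lexical, syntactic, and semantic features.
TIME_CUES = re.compile(
    r"\b(today|now|currently|latest|this (week|month|year))\b", re.I
)


def is_time_sensitive(query: str) -> bool:
    """Stage 1 (stand-in): flag queries that contain a temporal cue."""
    return bool(TIME_CUES.search(query))


def resolve_time_range(query: str, asked_on: date) -> tuple[date, date]:
    """Stage 2 (assumed rules): map a cue to a (start, end) interval."""
    text = query.lower()
    if "this year" in text:
        return date(asked_on.year, 1, 1), asked_on
    if "this month" in text:
        return date(asked_on.year, asked_on.month, 1), asked_on
    # Assumed default window for cues such as "latest" or "now".
    return asked_on - timedelta(days=7), asked_on


def filter_by_time(results: list[QAPair], time_range: tuple[date, date]) -> list[QAPair]:
    """Stage 3: keep only results posted inside the resolved range."""
    start, end = time_range
    return [r for r in results if start <= r.posted <= end]


def retrieve(query: str, candidates: list[QAPair], asked_on: date) -> list[QAPair]:
    """Apply time filtering only when the query is time-sensitive."""
    if not is_time_sensitive(query):
        return candidates
    return filter_by_time(candidates, resolve_time_range(query, asked_on))
```

In this sketch, `candidates` is assumed to be the output of an ordinary semantic question search; only the time-based filtering step is shown.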