ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2016, Vol. 53, Issue (2): 284-302. doi: 10.7544/issn1000-1239.2016.20150842

Special Issue: 2016 Special Topic on Data Fusion and Knowledge Fusion


Chinese Named Entity Relation Extraction Based on Syntactic and Semantic Features

Gan Lixin, Wan Changxuan, Liu Dexi, Zhong Qing, Jiang Tengjiao   

  1. (School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013) (Jiangxi Key Laboratory of Data and Knowledge Engineering (Jiangxi University of Finance and Economics), Nanchang 330013)
  • Online: 2016-02-01

Abstract: Named entity relations are a foundation of semantic networks and ontologies, and are widely used in information retrieval, machine translation, and automatic question answering systems. In named entity relationship extraction, feature selection and feature extraction are two key issues. Chinese text poses particular challenges for entity relationship detection and extraction: long sentences with complicated sentence patterns and many entities, together with the data sparsity problem. To address these problems, a novel method based on syntactic and semantic features is proposed. The dependency relation composition feature is obtained by combining the respective dependency relations of the two entities. The nearest-syntactic-dependency verb feature is derived from dependency relations and part-of-speech (POS) tags. Both features are incorporated into feature-based relationship detection and extraction using a support vector machine (SVM). Evaluation on a real text corpus from the tourism domain shows that these two features, drawn from syntactic and semantic aspects, effectively improve the performance of entity relationship detection and extraction, and that the method outperforms the previously best-reported systems in precision, recall, and F1 value. In addition, the nearest-syntactic-dependency verb feature is highly effective for relationship detection and extraction: it makes the most prominent contribution to the performance improvement on data-sparse entity relationships, and it significantly outperforms state-of-the-art methods based on verb features.
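
The two features named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define them precisely, so everything below is an assumption made for illustration: the toy parse is hand-written rather than parser output, the labels (SBV, VOB, ATT, HED) follow a common Chinese dependency label style, the composition feature is taken to be the concatenation of the two entities' dependency labels, and "nearest" is taken to mean the smallest summed tree distance to both entities. The function names are hypothetical, and the SVM step is not shown.

```python
from collections import deque

# Hand-written toy parse (an assumption, not real parser output).
# Each token: (id, form, pos, head_id, dependency_label); head_id 0 is the root.
TOKENS = [
    (1, "庐山", "ns", 2, "SBV"),   # Lushan (subject)
    (2, "位于", "v", 0, "HED"),    # is located in (root verb)
    (3, "江西省", "ns", 4, "ATT"),  # Jiangxi Province (modifier)
    (4, "九江市", "ns", 2, "VOB"),  # Jiujiang City (object)
]


def composition_feature(tokens, e1, e2):
    """Concatenate the dependency labels of the two entity tokens."""
    deprel = {tid: label for tid, _, _, _, label in tokens}
    return f"{deprel[e1]}+{deprel[e2]}"


def tree_distances(tokens, start):
    """Breadth-first distances from `start` over the undirected dependency tree."""
    adjacent = {tid: set() for tid, _, _, _, _ in tokens}
    for tid, _, _, head, _ in tokens:
        if head != 0:
            adjacent[tid].add(head)
            adjacent[head].add(tid)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacent[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def nearest_verb(tokens, e1, e2):
    """Return the verb (POS tag starting with 'v') closest to both entities."""
    d1 = tree_distances(tokens, e1)
    d2 = tree_distances(tokens, e2)
    verbs = [t for t in tokens if t[2].startswith("v")]
    if not verbs:
        return None
    inf = float("inf")
    # Ties are broken by token position (earlier token wins).
    best = min(verbs, key=lambda t: (d1.get(t[0], inf) + d2.get(t[0], inf), t[0]))
    return best[1]


if __name__ == "__main__":
    print(composition_feature(TOKENS, 1, 4))
    print(nearest_verb(TOKENS, 1, 4))
```

In a feature-based pipeline of the kind the abstract describes, strings like these would be encoded (for example, one-hot) alongside other features and passed to an SVM classifier; that encoding and the classifier settings are not specified in the abstract.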

Key words: relationship extraction, relationship detection, syntactic feature, semantic feature, support vector machine (SVM)
