Abstract:
Extracting relational triples from unstructured natural language text is the most critical step in building a large-scale knowledge graph, but existing research still has the following problems: 1) existing models ignore the relation-overlapping problem caused by multiple triples sharing the same entity in a text; 2) current encoder-decoder-based joint extraction models do not fully consider the dependency relationships among the words in a text; 3) an excessively long triple sequence leads to the accumulation and propagation of errors, which reduces the precision and efficiency of joint entity and relation extraction. To address these issues, a graph convolution-enhanced multi-channel decoding joint entity and relation extraction model (GMCD-JERE) is proposed. First, BiLSTM is introduced as the model encoder to strengthen the bidirectional feature fusion of words in the text; second, the dependency relationships between the words in a sentence are merged through a multi-hop graph convolution mechanism to improve the accuracy of relation classification; third, through a multi-channel decoding mechanism, the model solves the relation-overlapping problem and simultaneously alleviates the effect of error accumulation and propagation; fourth, three current mainstream models are selected for performance comparison, and the results on the NYT (New York Times) dataset show that precision, recall, and F1 are improved by 4.3%, 5.1%, and 4.8%, respectively. In addition, the relation-first extraction order is verified on the WebNLG (Web Natural Language Generation) dataset.
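The multi-hop graph convolution step described above can be sketched in a few lines. This is a minimal NumPy illustration of propagating word features over a dependency-parse adjacency matrix, not the paper's implementation; the function names, the two-hop setup, and the toy sentence graph are all assumptions for demonstration.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize an adjacency matrix with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def multi_hop_gcn(h, adj, weights):
    """Apply one graph-convolution layer per hop: H <- ReLU(A_hat @ H @ W_k).
    Each hop mixes in features of words one dependency arc further away."""
    a_hat = normalize_adjacency(adj)
    for w in weights:
        h = np.maximum(a_hat @ h @ w, 0.0)  # ReLU activation
    return h

# Toy 4-word sentence; h stands in for BiLSTM encoder output.
rng = np.random.default_rng(0)
n_words, dim = 4, 8
h = rng.standard_normal((n_words, dim))
adj = np.zeros((n_words, n_words))
for i, j in [(0, 1), (1, 2), (2, 3)]:  # hypothetical dependency arcs, undirected
    adj[i, j] = adj[j, i] = 1.0
weights = [rng.standard_normal((dim, dim)) * 0.1 for _ in range(2)]  # 2 hops
out = multi_hop_gcn(h, adj, weights)
print(out.shape)  # (4, 8): one enriched vector per word
```

With two hops, each word's representation incorporates syntactic context up to two dependency arcs away, which is the kind of structural signal the model uses to improve relation classification.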