Graph Convolution-Enhanced Multi-Channel Decoding Joint Entity and Relation Extraction Model
Abstract: Extracting entity-relation triplets from unstructured natural language text is the most critical step in building a large-scale knowledge graph, but existing research still has three problems: 1) existing models ignore the entity-relation overlap that arises when multiple triplets in a text share the same entity; 2) current joint extraction models based on the encoder-decoder architecture do not fully consider the dependency relationships among the words of a sentence; 3) some triplet sequences are excessively long, so errors accumulate and propagate, which harms the precision and efficiency of entity and relation extraction. To address these problems, a graph convolution-enhanced multi-channel decoding joint entity and relation extraction model (GMCD-JERE) is proposed. First, a BiLSTM is used as the model encoder to strengthen the bidirectional feature fusion of the words in the text. Second, the dependency relationships between the words of a sentence are fused through multi-hop graph convolution to improve the accuracy of relation extraction. Third, the conventional mechanism that decodes triplets one after another in sequence is replaced by a multi-channel triplet decoding mechanism, which addresses entity-relation overlap and alleviates the error accumulation and propagation caused by overly long triplet sequences. Finally, three current mainstream models are selected for performance comparison; results on the NYT (New York Times) dataset show that precision, recall, and F1 improve by 4.3, 5.1, and 4.8 percentage points, respectively. The extraction order that starts with the relation is also verified on the WebNLG (Web Natural Language Generation) dataset.
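To make the idea of multi-hop graph convolution over word dependencies concrete, the following is a minimal sketch in plain Python. It is not the GMCD-JERE formulation: it assumes a binary, undirected dependency adjacency matrix with self-loops, row normalization, a ReLU activation, and identity weights, whereas the model itself uses attention scores and dependency weights (see Table 1) whose exact formulas are not given here. The helper names `matmul`, `normalized_adjacency`, and `gcn_layer` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(edges: List[Tuple[int, int]], n: int) -> Matrix:
    """Undirected dependency adjacency with self-loops, row-normalized (assumption)."""
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for head, dep in edges:
        adj[head][dep] = 1.0
        adj[dep][head] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def gcn_layer(adj: Matrix, h: Matrix, w: Matrix) -> Matrix:
    """One hop of graph convolution: H' = ReLU(A_norm @ H @ W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(adj, h), w)]


# Toy sentence of three words; word 1 is the syntactic head of words 0 and 2.
edges = [(1, 0), (1, 2)]
adj = normalized_adjacency(edges, 3)
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

h0 = identity                     # one-hot word states, so mixing is easy to read off
h1 = gcn_layer(adj, h0, identity)  # after one hop
h2 = gcn_layer(adj, h1, identity)  # after two hops
```

With one-hot inputs, `h1[0][2]` stays at zero because words 0 and 2 are not directly connected, while `h2[0][2]` becomes positive: the second hop lets word 0 receive information from word 2 through their shared head. Stacking layers is what gives each word access to more distant dependency neighbors.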
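Table 1 Variable Definitions for the GMCD-JERE Model
Symbol                                                  Description
$\boldsymbol{h}_i^{\mathrm{E}}$                         Contextual feature of the sentence
$s_{ij}$                                                Attention between the $i$-th and $j$-th words of the sentence sequence
$M_{ij}$                                                Dependency weight between the $i$-th and $j$-th words of the sentence sequence
$\boldsymbol{h}_i^{l}$                                  Hidden state of the $i$-th word after the $l$-th convolution layer
$\boldsymbol{o}^{(1)}$                                  Output of decoder stage 1
$\boldsymbol{o}_{\mathrm{relation}}$                    Probability that each relation is present in the sentence
$\boldsymbol{o}^{(2)}$                                  Output of decoder stage 2
$\boldsymbol{h}_\lambda$                                Relation embedding vector
$\boldsymbol{o}_{\mathrm{head}}^{\mathrm{start}}$       Probability of each sentence position being the start of the head entity
$\boldsymbol{o}_{\mathrm{head}}^{\mathrm{end}}$         Probability of each sentence position being the end of the head entity
$\boldsymbol{o}^{(3)}$                                  Output of decoder stage 3
$\boldsymbol{o}_{\mathrm{tail}}^{\mathrm{start}}$       Probability of each sentence position being the start of the tail entity
$\boldsymbol{o}_{\mathrm{tail}}^{\mathrm{end}}$         Probability of each sentence position being the end of the tail entity
$\bar{\boldsymbol{o}}_{\mathrm{head}}^{\mathrm{start}}$ Ground truth for the start position of the head entity
$\bar{\boldsymbol{o}}_{\mathrm{head}}^{\mathrm{end}}$   Ground truth for the end position of the head entity
$\bar{\boldsymbol{o}}_{\mathrm{tail}}^{\mathrm{start}}$ Ground truth for the start position of the tail entity
$\bar{\boldsymbol{o}}_{\mathrm{tail}}^{\mathrm{end}}$   Ground truth for the end position of the tail entity
$\bar{\boldsymbol{o}}_{\mathrm{relation}}$              Ground truth for the corresponding relation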
Table 2 Related Information of NYT and WebNLG Datasets
Type      NYT                          WebNLG
          Train    Valid    Test       Train    Valid    Test
Normal    35703    3208     3136       1600     182      246
EPO       11991    1030     1168       227      16       26
SEO       8502     762      696        3192     302      431
Total     56196    5000     5000       5019     500      703
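The Normal, EPO, and SEO rows above follow the usual split by triplet overlap: EPO (entity pair overlap) means at least two triplets share the same entity pair, SEO (single entity overlap) means triplets share one entity but not a whole pair, and Normal means no entity is shared. The sketch below shows one way to label a sentence from its triplets. It rests on assumptions not stated in the source: the entity pair is compared irrespective of order, each sentence gets exactly one label with EPO taking precedence over SEO, and the function name `overlap_type` is hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def overlap_type(triples: List[Triple]) -> str:
    """Label a sentence as 'Normal', 'EPO', or 'SEO' from its distinct triplets.

    Assumption: EPO wins when both kinds of overlap occur in one sentence.
    """
    has_single_overlap = False
    for (h1, _, t1), (h2, _, t2) in combinations(triples, 2):
        if {h1, t1} == {h2, t2}:
            return "EPO"  # two triplets over the same entity pair
        if {h1, t1} & {h2, t2}:
            has_single_overlap = True  # exactly one entity in common
    return "SEO" if has_single_overlap else "Normal"


normal_case = [("Paris", "capital_of", "France"), ("Tokyo", "capital_of", "Japan")]
epo_case = [("Sudan", "capital", "Khartoum"), ("Sudan", "contains", "Khartoum")]
seo_case = [("Paris", "capital_of", "France"), ("Paris", "located_in", "Europe")]

labels = [overlap_type(case) for case in (normal_case, epo_case, seo_case)]
```

Because the three class counts in the table add up exactly to each total, a one-label-per-sentence rule of this kind seems plausible for this split, but the precedence actually used is not specified here.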
Table 3 Performance Comparison of the Models
(all values in %)
Model         NYT                            WebNLG
              Precision  Recall   F1         Precision  Recall   F1
CopyREOne     59.4       53.1     56.0       32.2       28.9     30.5
CopyREMul     61.0       56.6     58.7       37.7       36.4     37.1
GraphRel1p    62.9       57.3     60.0       42.3       39.2     40.7
GraphRel2p    63.9       60.0     61.9       44.7       41.1     42.9
CopyMTLOne    72.7       69.2     70.9       57.8       60.1     58.9
CopyMTLMul    75.7       68.7     72.0       58.0       54.9     56.4
GMCD-JERE     80.0       73.8     76.8       43.9       42.7     43.3
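As a quick consistency check on the table, the sketch below recomputes F1 as the harmonic mean of precision and recall for the GMCD-JERE rows and takes the NYT differences against the CopyMTLMul row. The gains quoted in the abstract appear to correspond to that comparison, although the source does not name the baseline explicitly. The variable names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, F1) in percent, copied from Table 3.
gmcd_nyt = (80.0, 73.8, 76.8)
gmcd_webnlg = (43.9, 42.7, 43.3)
copymtl_mul_nyt = (75.7, 68.7, 72.0)

# Recompute F1 from the reported precision and recall.
f1_nyt = round(f1_score(gmcd_nyt[0], gmcd_nyt[1]), 1)
f1_webnlg = round(f1_score(gmcd_webnlg[0], gmcd_webnlg[1]), 1)

# Absolute gains on NYT, in percentage points.
gains_nyt = tuple(round(a - b, 1) for a, b in zip(gmcd_nyt, copymtl_mul_nyt))
```

Here `f1_nyt` and `f1_webnlg` match the reported 76.8 and 43.3, and `gains_nyt` reproduces the 4.3, 5.1, and 4.8 figures. Recomputed F1 for other rows may differ in the last digit, since the published values were presumably derived from unrounded precision and recall.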