Graph Convolution-Enhanced Multi-Channel Decoding Joint Entity and Relation Extraction Model

Qiao Yongpeng (乔勇鹏); Yu Yaxin (于亚新); Liu Shuyue (刘树越); Wang Ziteng (王子腾); Xia Zifang (夏子芳); Qiao Jiaqi (乔佳琪)

doi:10.7544/issn1000-1239.202110767

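School of Computer Science and Engineering, Northeastern University, Shenyang 110169
Key Laboratory of Intelligent Computing in Medical Image (Northeastern University), Ministry of Education, Shenyang 110169

Funds: This work was supported by the National Natural Science Foundation of China (61871106, 61973059) and the National Key Research and Development Program of China (2016YFC0101500).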

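Details

Author biographies:
Qiao Yongpeng: born in 1997. Master. Main research interests include knowledge graphs and natural language processing.

Yu Yaxin: born in 1971. PhD, associate professor, master's supervisor. Member of IEEE, ACM, and CCF. Main research interests include data mining and social networks.

Liu Shuyue: born in 1997. Master. Main research interests include reinforcement learning, recommender systems, and knowledge graphs.

Wang Ziteng: born in 1998. Master's candidate. Main research interests include deep reinforcement learning and transfer learning.

Xia Zifang: born in 1998. Master's candidate. Main research interests include recommender systems, causal inference, and knowledge graphs.

Qiao Jiaqi: born in 1998. Master's candidate. Main research interests include natural language processing and computer vision.

Corresponding author:
Yu Yaxin (yuyx@mail.neu.edu.cn)

CLC number: TP311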
Publication history
- Received: 2021-07-14
- Revised: 2022-03-21
- Published online: 2023-02-10
- Issue date: 2022-12-31

Abstract:
Extracting relational triplets from unstructured natural language text is the most critical step in building a large-scale knowledge graph, but existing research still has three problems: 1) existing models ignore the entity-relation overlap that arises when multiple triplets in a text share the same entity; 2) current joint extraction models based on the encoder-decoder architecture do not fully consider the dependencies among the words of a sentence; 3) some triplet sequences are excessively long, which leads to error accumulation and propagation and reduces the precision and efficiency of entity and relation extraction. To address these problems, a graph convolution-enhanced multi-channel decoding joint entity and relation extraction model (GMCD-JERE) is proposed. First, a BiLSTM is used as the model encoder to strengthen the bidirectional feature fusion of the words in the text. Second, multi-hop graph convolution fuses the dependencies among the words of a sentence to improve the accuracy of relation extraction. Third, the traditional mechanism that decodes triplets one after another in a fixed order is replaced by a multi-channel triplet decoding mechanism, which resolves entity-relation overlap and alleviates the error accumulation and propagation caused by overly long triplet sequences. Finally, three current mainstream models are selected for performance comparison; results on the NYT (New York Times) dataset show that precision, recall, and F1 improve by 4.3%, 5.1%, and 4.8%, respectively, and the extraction order that starts with the relation is verified on the WebNLG (Web Natural Language Generation) dataset.
- relation extraction /
- encoder-decoder /
- multi-channel decoding /
- relation overlapping /
- graph convolutional neural network
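
The multi-channel decoding idea described in the abstract can be illustrated with a minimal Python sketch. It assumes a relation-first order (relation, then head entity, then tail entity) and that every relation whose probability passes a threshold opens its own decoding channel. The names `best_span`, `decode_triplets`, and `peaked`, the 0.5 threshold, the greedy span choice, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices
Probs = Tuple[Sequence[float], Sequence[float]]  # (start probabilities, end probabilities)


def best_span(start_probs: Sequence[float], end_probs: Sequence[float]) -> Span:
    """Greedy span choice: most probable start, then most probable end at or after it."""
    start = max(range(len(start_probs)), key=lambda i: start_probs[i])
    end = max(range(start, len(end_probs)), key=lambda i: end_probs[i])
    return start, end


def decode_triplets(
    tokens: Sequence[str],
    relation_probs: Dict[str, float],
    head_scorer: Callable[[str], Probs],
    tail_scorer: Callable[[str, Span], Probs],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> List[Tuple[str, str, str]]:
    """Decode one (head, relation, tail) triplet per detected relation."""
    triplets = []
    for relation, prob in relation_probs.items():
        if prob < threshold:
            continue  # stage 1: relation judged absent, no channel is opened
        head = best_span(*head_scorer(relation))        # stage 2: head entity span
        tail = best_span(*tail_scorer(relation, head))  # stage 3: tail entity span
        triplets.append((
            " ".join(tokens[head[0]:head[1] + 1]),
            relation,
            " ".join(tokens[tail[0]:tail[1] + 1]),
        ))
    return triplets


def peaked(n: int, idx: int) -> List[float]:
    """Toy distribution over n positions with most of the mass on idx."""
    return [0.9 if i == idx else 0.1 / (n - 1) for i in range(n)]


tokens = ["Paris", "is", "the", "capital", "of", "France"]
relation_probs = {"capital_of": 0.9, "located_in": 0.8, "born_in": 0.1}
n = len(tokens)

# Toy scorers standing in for the learned decoder stages.
result = decode_triplets(
    tokens,
    relation_probs,
    head_scorer=lambda rel: (peaked(n, 0), peaked(n, 0)),
    tail_scorer=lambda rel, head: (peaked(n, 5), peaked(n, 5)),
)
print(result)
```

In this toy run the two detected relations share the same entity pair, yet each channel produces its own short triplet, so no single long output sequence has to carry both. The sketch decodes only one span per relation, which is a simplification.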

HTML full text

Figure 1. Types of entity-relation overlap

Figure 2. Comparison of related model techniques

Figure 3. Overall architecture of the GMCD-JERE model

Figure 4. Comparison of baseline model architectures

Figure 5. Model performance under different extraction orders on NYT

Figure 6. Model performance under different extraction orders on WebNLG

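Table 1 Variable Definitions for the GMCD-JERE Model

Symbol	Description
$\boldsymbol{h}_i^{\mathrm{E}}$	Contextual feature of the sentence
$s_{ij}$	Attention between the $i$-th and $j$-th words of the sentence sequence
$M_{ij}$	Dependency weight between the $i$-th and $j$-th words of the sentence sequence
$\boldsymbol{h}_i^{l}$	Hidden state of the $i$-th word after the $l$-th convolution layer
$\boldsymbol{o}^{(1)}$	Output of the first decoder stage
$\boldsymbol{o}_{\mathrm{relation}}$	Probability that each relation is present in the sentence
$\boldsymbol{o}^{(2)}$	Output of the second decoder stage
$\boldsymbol{h}_\lambda$	Relation embedding vector
$\boldsymbol{o}_{\mathrm{head}}^{\mathrm{start}}$	Probability of each sentence position being the start position of the head entity
$\boldsymbol{o}_{\mathrm{head}}^{\mathrm{end}}$	Probability of each sentence position being the end position of the head entity
$\boldsymbol{o}^{(3)}$	Output of the third decoder stage
$\boldsymbol{o}_{\mathrm{tail}}^{\mathrm{start}}$	Probability of each sentence position being the start position of the tail entity
$\boldsymbol{o}_{\mathrm{tail}}^{\mathrm{end}}$	Probability of each sentence position being the end position of the tail entity
$\bar{\boldsymbol{o}}_{\mathrm{head}}^{\mathrm{start}}$	Ground truth of the head-entity start position
$\bar{\boldsymbol{o}}_{\mathrm{head}}^{\mathrm{end}}$	Ground truth of the head-entity end position
$\bar{\boldsymbol{o}}_{\mathrm{tail}}^{\mathrm{start}}$	Ground truth of the tail-entity start position
$\bar{\boldsymbol{o}}_{\mathrm{tail}}^{\mathrm{end}}$	Ground truth of the tail-entity end position
$\bar{\boldsymbol{o}}_{\mathrm{relation}}$	Ground truth of the corresponding relation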
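
To make the roles of $M_{ij}$ and $\boldsymbol{h}_i^{l}$ in Table 1 concrete, the following is a minimal pure-Python sketch of multi-hop graph convolution over a word-dependency matrix. It assumes the common update rule $\boldsymbol{h}_i^{l+1} = \mathrm{ReLU}(\sum_j M_{ij} W \boldsymbol{h}_j^{l} + \boldsymbol{b})$ with weights shared across layers; this rule, the function names `gcn_layer` and `multi_hop`, and the toy three-word chain are assumptions for illustration, not the paper's confirmed formulation.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def gcn_layer(h: Matrix, m: Matrix, w: Matrix, b: Vector) -> Matrix:
    """One propagation step: h_i' = ReLU(sum_j m[i][j] * (W h_j) + b)."""
    # Project every word vector with the shared weight matrix W (out_dim x in_dim).
    projected = [
        [sum(w_row[k] * h_j[k] for k in range(len(h_j))) for w_row in w]
        for h_j in h
    ]
    out = []
    for i in range(len(h)):
        row = []
        for d in range(len(b)):
            # Aggregate the projected neighbours of word i, weighted by m[i][j].
            agg = sum(m[i][j] * projected[j][d] for j in range(len(h))) + b[d]
            row.append(max(0.0, agg))  # ReLU
        out.append(row)
    return out


def multi_hop(h: Matrix, m: Matrix, w: Matrix, b: Vector, hops: int) -> Matrix:
    """Apply the layer `hops` times (requires in_dim == out_dim for shared W)."""
    for _ in range(hops):
        h = gcn_layer(h, m, w, b)
    return h


# Toy sentence of three words linked in a chain 0 - 1 - 2, with self-loops,
# row-normalised so that each row of M sums to 1.
M = [
    [0.5, 0.5, 0.0],
    [1 / 3, 1 / 3, 1 / 3],
    [0.0, 0.5, 0.5],
]
H0 = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]  # word 2 alone carries feature dimension 1
W = [[1.0, 0.0], [0.0, 1.0]]               # identity weights keep the arithmetic readable
B = [0.0, 0.0]

one_hop = multi_hop(H0, M, W, B, hops=1)
two_hop = multi_hop(H0, M, W, B, hops=2)
print(one_hop[0], two_hop[0])
```

After one hop, word 0 has received nothing from word 2 because the two are not directly connected; after two hops, the second feature dimension of word 0 becomes positive, which is the sense in which stacking layers fuses dependencies between words that are several edges apart.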

Table 2 Statistics of the NYT and WebNLG Datasets

Type	NYT Train	NYT Validation	NYT Test	WebNLG Train	WebNLG Validation	WebNLG Test
Normal	35703	3208	3136	1600	182	246
EPO	11991	1030	1168	227	16	26
SEO	8502	762	696	3192	302	431
Total	56196	5000	5000	5019	500	703

Table 3 Performance Comparison of the Models (%)

Model	NYT Precision	NYT Recall	NYT F1	WebNLG Precision	WebNLG Recall	WebNLG F1
CopyRE_One	59.4	53.1	56.0	32.2	28.9	30.5
CopyRE_Mul	61.0	56.6	58.7	37.7	36.4	37.1
GraphRel_1p	62.9	57.3	60.0	42.3	39.2	40.7
GraphRel_2p	63.9	60.0	61.9	44.7	41.1	42.9
CopyMTL_One	72.7	69.2	70.9	57.8	60.1	58.9
CopyMTL_Mul	75.7	68.7	72.0	58.0	54.9	56.4
GMCD-JERE	80.0	73.8	76.8	43.9	42.7	43.3
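
As a quick consistency check on Table 3, the short Python sketch below recomputes F1 from the reported precision and recall for two NYT rows, using the standard definition F1 = 2PR / (P + R), and takes the difference between GMCD-JERE and CopyMTL_Mul. That the abstract's reported gains are measured against CopyMTL_Mul is an inference from the matching numbers, not something the page states; the dictionary name `TABLE3_NYT` is hypothetical.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) on NYT, copied from Table 3, in percent.
TABLE3_NYT = {
    "CopyMTL_Mul": (75.7, 68.7, 72.0),
    "GMCD-JERE": (80.0, 73.8, 76.8),
}

for name, (p, r, reported) in TABLE3_NYT.items():
    # Compare the recomputed value, rounded to one decimal, with the reported one.
    print(name, round(f1(p, r), 1), reported)

# Absolute differences in percentage points (assumed baseline: CopyMTL_Mul).
gains = tuple(
    round(ours - base, 1)
    for ours, base in zip(TABLE3_NYT["GMCD-JERE"], TABLE3_NYT["CopyMTL_Mul"])
)
print(gains)
```

The recomputed F1 values agree with the table to one decimal place for these two rows, and the differences come out as absolute percentage points, which suggests the abstract's percentages are absolute gains and not relative ones.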

