
End-to-end Knowledge Triplet Extraction Combined with Adversarial Training

Huang Peixin, Zhao Xiang, Fang Yang, Zhu Huiming, Xiao Weidong

Citation: Huang Peixin, Zhao Xiang, Fang Yang, Zhu Huiming, Xiao Weidong. End-to-end Knowledge Triplet Extraction Combined with Adversarial Training[J]. Journal of Computer Research and Development, 2019, 56(12): 2536-2548. DOI: 10.7544/issn1000-1239.2019.20190640
CSTR: 32373.14.issn1000-1239.2019.20190640

Funding: National Natural Science Foundation of China (61402494, 61402498, 71690233, 61902417); Natural Science Foundation of Hunan Province (2015JJ4009)
  • CLC number: TP391

  • Abstract: Knowledge graphs represent real-world facts in a structured form and have attracted wide attention from academia and industry. Because they represent knowledge precisely, they underpin applications such as information services, intelligent search, and automatic question answering. The basic unit of a knowledge graph is a fact expressed as a triplet (head_entity, relation, tail_entity). Existing knowledge graphs are far from sufficient to describe the real world, so extracting entities and relations to complete existing graphs or to construct new ones is crucial. Traditional pipeline methods for entity and relation extraction suffer from error propagation, while existing joint extraction methods do not fully exploit the link between the two subtasks, named entity recognition and relation extraction, which degrades extraction quality. To address these problems, this paper proposes an end-to-end knowledge triplet joint extraction method combined with adversarial training. First, we adopt a joint tagging strategy for entities and relations and use an end-to-end neural network to extract semantic features from the text and tag it automatically. Second, a self-attention mechanism is added to the network to strengthen text encoding, and an objective function with a bias term is introduced to improve the recognition of related entities. Finally, adversarial training is incorporated to improve the robustness of the model and its extraction results. In the experiments, we evaluate the model with three evaluation metrics and analyze the results from four aspects. The experimental results show that the model outperforms existing state-of-the-art methods on knowledge triplet extraction.
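    As a rough illustration of the joint tagging strategy and the biased objective mentioned in the abstract, the following minimal sketch decodes triplets from a tagged sentence and computes a weighted negative log-likelihood. The abstract does not give the tag format or the form of the bias term, so everything here is an assumption made for illustration: the tag layout `<position>-<relation>-<role>`, the nearest-match pairing rule, the helper names `decode_triplets` and `biased_nll`, and the weight `alpha`.

```python
import math

# Hypothetical joint tag format (an assumption, not taken from the paper):
#   "<position>-<relation>-<role>", or "O" for words outside any triplet.
#   position: B/I/E/S (begin, inside, end, single); role: 1 = head, 2 = tail.

def decode_triplets(tokens, tags):
    """Group tagged spans into (head, relation, tail) triplets."""
    spans = []    # (relation, role, entity text) in sentence order
    current = []  # tokens of the entity currently being read
    for token, tag in zip(tokens, tags):
        if tag == "O":
            current = []
            continue
        position, rest = tag.split("-", 1)
        relation, role = rest.rsplit("-", 1)
        if position in ("B", "S"):
            current = [token]
        else:
            current.append(token)
        if position in ("E", "S"):
            spans.append((relation, role, " ".join(current)))
            current = []
    triplets, used = [], set()
    for i, (relation, role, head) in enumerate(spans):
        if role != "1":
            continue
        # Pair each head with the nearest unused tail of the same relation.
        candidates = [j for j, (r, ro, _) in enumerate(spans)
                      if r == relation and ro == "2" and j not in used]
        if candidates:
            j = min(candidates, key=lambda k: abs(k - i))
            used.add(j)
            triplets.append((head, relation, spans[j][2]))
    return triplets

def biased_nll(tag_probs, gold_tags, alpha=10.0):
    """Negative log-likelihood in which non-"O" tags are weighted by alpha."""
    loss = 0.0
    for probs, gold in zip(tag_probs, gold_tags):
        weight = 1.0 if gold == "O" else alpha
        loss -= weight * math.log(probs[gold])
    return loss

tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
tags = ["B-BornIn-1", "E-BornIn-1", "O", "O", "O", "S-BornIn-2"]
print(decode_triplets(tokens, tags))
```

    In this sketch a larger `alpha` makes errors on entity-bearing tags cost more than errors on "O" tags, which is one plausible way to read "an objective function with a bias term"; the paper's actual formulation may differ.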
Publication history
  • Published: 2019-11-30
