A Method for Alleviating Exposure Bias in Joint Extraction of Entities and Relations
Abstract: Joint extraction of entities and relations aims to extract entity mentions and relational facts simultaneously from unstructured text. It is a critical step in knowledge graph construction and a basis for many high-level natural language processing tasks. Joint extraction models have attracted growing attention because they model the correlation between entity recognition and relation extraction more effectively. Most existing work adopts a staged joint extraction approach to handle text that contains multiple triples and overlapping entities. Although these methods achieve reasonable performance gains, they suffer from a serious exposure bias problem. In this paper, we propose a novel method called fusional relation expression embedding (FREE), which alleviates exposure bias by fusing relation expression embeddings. In addition, we propose a novel feature fusion layer, called the conditional layer normalization layer, to fuse prior information more effectively. We conduct extensive comparative experiments on two widely used datasets. In-depth analysis of the results shows that the proposed method has significant advantages over current state-of-the-art baselines, handles various situations more effectively, and, without sacrificing efficiency, achieves performance comparable to current advanced methods that target the exposure bias problem.