
    Causal Relation Extraction Based on Graph Attention Networks

    Abstract: Causality is a relation between a cause and an effect in which the occurrence of the cause leads to the occurrence of the effect. As an important type of relation between entities, causality plays a vital role in fields such as automatic reasoning and scenario generation, so extracting causal relations is a basic task in natural language processing and text mining. Unlike traditional text classification or relation extraction methods, this paper proposes a sequence labeling method that extracts the causal entities in a text and identifies the direction of causality without relying on feature engineering or causal background knowledge. The main contributions are: 1) we extend the syntactic dependency tree to a syntactic dependency graph, apply graph attention networks to natural language processing, and introduce the concept of S-GAT (graph attention network based on the syntactic dependency graph); 2) we propose the Bi-LSTM+CRF+S-GAT model for causal relation extraction, which generates a causal label for each word in a sentence from the input word vectors; 3) we revise and extend the SemEval dataset, defining rules to relabel the experimental data in order to overcome defects in the original labeling. Extensive experiments on the extended SemEval dataset show that our model improves prediction accuracy by 0.064 over the state-of-the-art model Bi-LSTM+CRF+self-ATT.
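    The sequence labeling formulation can be illustrated with a minimal sketch. The abstract does not specify the paper's label set, so the tags used here (B-C/I-C for cause words, B-E/I-E for effect words, O for everything else) and the function name decode_causal_spans are hypothetical, chosen only to show how per-word labels yield both the causal entities and the direction of the relation.

```python
from typing import List, Tuple


def decode_causal_spans(tokens: List[str], tags: List[str]) -> Tuple[List[str], List[str]]:
    """Group per-word tags into cause phrases and effect phrases.

    Assumed (hypothetical) tag scheme: B-C / I-C mark cause words,
    B-E / I-E mark effect words, and O marks all other words.
    """
    spans = {"C": [], "E": []}
    current_role = None          # role of the phrase being built: "C", "E", or None
    current_words: List[str] = []

    def flush() -> None:
        # Close the phrase under construction, if any, and reset the state.
        nonlocal current_role, current_words
        if current_role is not None:
            spans[current_role].append(" ".join(current_words))
        current_role, current_words = None, []

    for token, tag in zip(tokens, tags):
        if tag == "O":
            flush()
            continue
        prefix, role = tag.split("-", 1)
        # A B- tag, or a change of role, starts a new phrase.
        if prefix == "B" or role != current_role:
            flush()
            current_role = role
        current_words.append(token)
    flush()
    return spans["C"], spans["E"]


tokens = ["The", "earthquake", "caused", "a", "tsunami", "."]
tags = ["O", "B-C", "O", "O", "B-E", "O"]
causes, effects = decode_causal_spans(tokens, tags)
```

    Because each word carries a role, the direction of causality (cause to effect) can be read directly from the labels, with no separate classification step over entity pairs.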

       
