
    Knowledge-Enhanced Graph Encoding Method for Metaphor Detection in Text

      Abstract: Metaphor detection is one of the essential semantic-understanding tasks in natural language processing; its goal is to identify whether a concept, as it is used, borrows the properties and characteristics of another concept. Because purely neural methods are constrained by the limited scale of datasets and the sparsity of annotations, recent research on metaphor detection has explored combining knowledge transferred from other tasks and coarse-grained syntactic knowledge with neural network models to obtain more effective feature vectors for encoding and modeling text sequences. However, existing methods ignore word-sense knowledge and fine-grained syntactic knowledge, which results in low utilization of external knowledge and difficulty in modeling complex contexts. To address these issues, a knowledge-enhanced graph encoding method (KEG) for metaphor detection in text is proposed. The method consists of three parts. In the encoding layer, sense vectors are trained from word-sense knowledge and combined with the contextual vectors produced by a pre-trained language model to enhance the semantic representation. In the graph layer, an information graph is constructed from fine-grained syntactic knowledge and used to compute fine-grained contexts; a graph recurrent neural network then passes states iteratively to obtain node vectors representing the words and a global vector representing the sentence, enabling efficient modeling of complex contexts. In the decoding layer, a conditional random field decodes the sequence of tags under a sequence-labeling architecture. Experimental results show that the method effectively improves performance on four international public datasets.
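To make the three-layer pipeline described in the abstract concrete, the following is a minimal PyTorch sketch of that kind of architecture: sense vectors are concatenated with contextual vectors, per-word node states and a global sentence state are updated by iterative state passing over a syntax-derived graph, and per-token emission scores are produced for a downstream CRF decoder (omitted here). All module names, dimensions, the adjacency construction, and the toy inputs are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a knowledge-enhanced graph encoder for metaphor detection.
# Hypothetical names and dimensions; not the paper's actual implementation.
import torch
import torch.nn as nn


class KEGSketch(nn.Module):
    """Sense + context fusion -> graph recurrent state passing -> tag scores."""

    def __init__(self, ctx_dim=768, sense_dim=300, hidden=256, n_tags=2, steps=3):
        super().__init__()
        self.proj = nn.Linear(ctx_dim + sense_dim, hidden)   # fuse context and sense vectors
        self.node_cell = nn.GRUCell(2 * hidden, hidden)      # updates word-node states
        self.global_cell = nn.GRUCell(hidden, hidden)        # updates the sentence-level state
        self.emit = nn.Linear(2 * hidden, n_tags)            # emission scores for a CRF decoder
        self.steps = steps

    def forward(self, ctx_vecs, sense_vecs, adj):
        # ctx_vecs:   (n, ctx_dim)   contextual vectors from a pre-trained encoder (e.g. BERT)
        # sense_vecs: (n, sense_dim) vectors trained from word-sense knowledge
        # adj:        (n, n)         information graph built from fine-grained syntax
        h = torch.tanh(self.proj(torch.cat([ctx_vecs, sense_vecs], dim=-1)))  # node states
        g = h.mean(dim=0)                                                     # global sentence state

        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        for _ in range(self.steps):                      # iterative state passing
            neighbor = adj @ h / deg                     # fine-grained context from graph neighbors
            msg = torch.cat([neighbor, g.expand_as(h)], dim=-1)
            h = self.node_cell(msg, h)                   # update word-node vectors
            g = self.global_cell(h.mean(dim=0, keepdim=True), g.unsqueeze(0)).squeeze(0)

        # Per-token emission scores; the paper decodes these with a conditional
        # random field under a sequence-labeling scheme, which is omitted here.
        return self.emit(torch.cat([h, g.expand_as(h)], dim=-1))


# Toy usage with random tensors standing in for the features of a 5-token sentence.
n = 5
model = KEGSketch()
scores = model(torch.randn(n, 768), torch.randn(n, 300), torch.eye(n))
print(scores.shape)  # torch.Size([5, 2])
```

The split mirrors the abstract's division of labor: external knowledge enters only through the input vectors and the graph structure, while the recurrent cells and the decoder remain generic sequence-labeling components.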

       
