    Zhang Xiangwen, Lu Ziyao, Yang Jing, Lin Qian, Lu Yu, Wang Hongji, Su Jinsong. Weighted Lattice Based Recurrent Neural Networks for Sentence Semantic Representation Modeling[J]. Journal of Computer Research and Development, 2019, 56(4): 854-865. DOI: 10.7544/issn1000-1239.2019.20170917

    Weighted Lattice Based Recurrent Neural Networks for Sentence Semantic Representation Modeling


      Abstract: Currently, recurrent neural networks (RNNs) have been widely used for semantic representation modeling of text sequences in natural language processing. For languages without natural word delimiters (e.g., Chinese), RNNs typically take a pre-segmented word sequence as input. However, sub-optimal segmentation granularity and segmentation errors can negatively affect sentence semantic modeling and, in turn, subsequent natural language processing tasks. To address these issues, the proposed weighted word lattice based RNN takes a weighted word lattice as input and produces the current hidden state at each time step by fusing arbitrarily many input vectors and their corresponding previous hidden states. A weighted word lattice is a compressed data structure that encodes an exponential number of word segmentation results, and its edge weights reflect, to a certain extent, the consistency of different segmentations. In particular, the lattice weights are further exploited as supervision to regularize the weight modeling of the semantic composition operation, leading to better sentence semantic representation learning. Compared with traditional RNNs, the proposed model not only alleviates the negative impact of segmentation errors on sentence semantic modeling, but is also more expressive and flexible for sentence representation learning. Experimental results on sentiment classification and question classification tasks demonstrate the superiority of the proposed model.
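
      The abstract describes two mechanisms: at each lattice node, the RNN fuses several candidate (input vector, predecessor state) pairs into one hidden state, and the lattice edge weights supervise the learned fusion weights. The following is a minimal Python sketch of that idea; the class name LatticeRNN, the tanh cell, the softmax-based fusion, and the squared-error regularizer are all illustrative assumptions, not the paper's exact architecture.

      import numpy as np

      def softmax(x):
          e = np.exp(x - np.max(x))
          return e / e.sum()

      class LatticeRNN:
          """Illustrative lattice RNN cell (assumed form, not the paper's exact one)."""
          def __init__(self, input_dim, hidden_dim, seed=0):
              rng = np.random.default_rng(seed)
              self.Wx = rng.normal(0.0, 0.1, (hidden_dim, input_dim))   # input projection
              self.Wh = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))  # recurrent projection
              self.v = rng.normal(0.0, 0.1, hidden_dim)                 # edge-scoring vector

          def step(self, inputs, prev_states):
              """Fuse all incoming edges of one lattice node into a new hidden state.

              inputs:      word-embedding vectors on the incoming edges
              prev_states: hidden states at the source nodes of those edges
              Returns the fused hidden state and the learned fusion weights.
              """
              candidates = [np.tanh(self.Wx @ x + self.Wh @ h)
                            for x, h in zip(inputs, prev_states)]
              scores = np.array([self.v @ c for c in candidates])
              alphas = softmax(scores)                     # one weight per incoming edge
              h = sum(a * c for a, c in zip(alphas, candidates))
              return h, alphas

      def lattice_weight_regularizer(alphas, edge_weights):
          """Assumed squared-error regularizer: pull the learned fusion weights
          toward the normalized lattice edge weights (the supervision described
          in the abstract; the paper's exact form may differ)."""
          target = np.asarray(edge_weights, dtype=float)
          target /= target.sum()                           # normalize to a distribution
          return float(np.sum((alphas - target) ** 2))

      # Toy usage: a node with two incoming edges, i.e. two segmentation hypotheses.
      rnn = LatticeRNN(input_dim=4, hidden_dim=8)
      x1, x2 = np.ones(4), np.zeros(4)            # embeddings of two candidate words
      h1, h2 = np.zeros(8), 0.5 * np.ones(8)      # states at the edges' source nodes
      h, alphas = rnn.step([x1, x2], [h1, h2])
      reg = lattice_weight_regularizer(alphas, edge_weights=[3.0, 1.0])
      print(h.shape, alphas, reg)                 # (8,), fusion weights summing to 1, penalty

      In training, such a penalty would be added to the task loss (e.g., the classification cross-entropy) with a tradeoff coefficient, so that the fusion weights stay consistent with the segmentation-consistency scores encoded in the lattice.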

       
