ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2019, Vol. 56 ›› Issue (4): 854-865. doi: 10.7544/issn1000-1239.2019.20170917


Weighted Lattice Based Recurrent Neural Networks for Sentence Semantic Representation Modeling

Zhang Xiangwen1,2, Lu Ziyao1, Yang Jing1, Lin Qian1, Lu Yu1, Wang Hongji1, Su Jinsong1,2   

  1 (Xiamen University, Xiamen, Fujian 361000); 2 (Jiangsu Provincial Key Laboratory for Computer Information Processing Technology (Soochow University), Suzhou, Jiangsu 215006)
  • Online:2019-04-01

Abstract: Recurrent neural networks (RNNs) are widely used for semantic representation modeling of text sequences in natural language processing. For languages without natural word delimiters (e.g., Chinese), RNNs generally take a segmented word sequence as input. However, sub-optimal segmentation granularity and segmentation errors can harm sentence semantic modeling and, in turn, subsequent natural language processing tasks. To address these issues, the proposed weighted word lattice based RNNs take a weighted word lattice as input and produce the current hidden state at each time step by integrating arbitrarily many input vectors and their corresponding previous hidden states. A weighted word lattice is a compact data structure that encodes exponentially many word segmentation results. To a certain extent, the weighted word lattice reflects the consistency among different word segmentation results. The lattice weights are further exploited as a supervised regularizer to refine the weight modeling of the semantic composition operation in the model, leading to better sentence semantic representation learning. Compared with traditional RNNs, the proposed model not only alleviates the negative impact of segmentation errors but is also more expressive and flexible for sentence representation learning. Experimental results on sentiment classification and question classification tasks demonstrate the superiority of the proposed model.
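
To make the lattice idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It builds a toy weighted word lattice, counts how many segmentations it encodes, and normalizes the weights of the words ending at a given position. The string "ABCD", the edge weights, and the names `EDGES`, `count_segmentations`, and `incoming_weights` are all hypothetical and chosen only for illustration; the abstract does not specify how the model combines hidden states, so that step is not shown.

```python
# Hypothetical toy lattice over the 4-character string "ABCD".
# Each edge (start, end, word, weight) is a candidate word spanning
# character positions start..end; the weights are made-up numbers.
EDGES = [
    (0, 1, "A", 0.4), (0, 2, "AB", 0.6),
    (1, 2, "B", 0.4),
    (2, 3, "C", 0.7), (2, 4, "CD", 0.3),
    (3, 4, "D", 0.7),
]


def count_segmentations(edges, n):
    """Count paths from position 0 to position n.

    Each path is one complete segmentation of the string.
    """
    paths = [0] * (n + 1)
    paths[0] = 1
    # Sorting by start position guarantees paths[start] is final
    # before any edge leaving that position is processed.
    for start, end, _word, _weight in sorted(edges):
        paths[end] += paths[start]
    return paths[n]


def incoming_weights(edges, node):
    """Normalize the weights of all candidate words ending at `node`."""
    incoming = [(word, w) for _start, end, word, w in edges if end == node]
    total = sum(w for _word, w in incoming)
    return {word: w / total for word, w in incoming}


if __name__ == "__main__":
    print(count_segmentations(EDGES, 4))
    print(incoming_weights(EDGES, 2))
```

Six edges are enough to encode four distinct segmentations here, and the count grows multiplicatively with each ambiguous span, which is the sense in which a lattice compactly holds exponentially many results. A position with several incoming words (position 2 has both "AB" and "B") is where a lattice-based RNN would need to merge more than one input and predecessor state.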

Key words: weighted word lattice, recurrent neural network, sentence semantics modeling, sentiment classification, question classification
