ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2014, Vol. 51 ›› Issue (9): 1936-1944. doi: 10.7544/issn1000-1239.2014.20140211

Special Topic: Deep Learning 2014

• Artificial Intelligence •

A Study of Speech Recognition Based on RNN-RBM Language Model

Li Yaxiong1, Zhang Jianqiang2, Pan Deng3, Hu Dan4

  1(Network Management Center, Hubei University of Science and Technology, Xianning, Hubei 437100); 2(Information Technology Center, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA); 3(School of Foreign Languages, Hubei University of Science and Technology, Xianning, Hubei 437100); 4(School of Foreign Languages, Zhongnan University of Economics and Law, Wuhan 430073) (yaxiong_li@live.cn)
  • Online: 2014-09-01


Abstract: In recent years, deep learning has risen to prominence and achieved good results in language modeling, for example with the restricted Boltzmann machine (RBM) language model. Unlike N-gram language models, these neural-network-based language models map word sequences into a continuous space to estimate the probability of the next word, thereby alleviating the data-sparsity problem. In addition, some researchers have built language models with recurrent neural networks, using recurrence to exploit the full preceding context when predicting the next word and thus to handle long-distance linguistic constraints effectively. This paper captures long-distance information on the basis of the recurrent neural network-restricted Boltzmann machine (RNN-RBM); it also explores dynamically adapting the language model according to the characteristics of the sentences in the language. Experimental results show that the RNN-RBM language model yields a considerable improvement in the performance of large-vocabulary continuous speech recognition.
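The core idea can be illustrated with a minimal NumPy sketch of an RNN-RBM-style next-word model: an RNN state summarizes the history, that state shifts the biases of an RBM over one-hot word vectors, and candidate next words are scored by negative RBM free energy. All parameter names, shapes, and the random initialization below are illustrative assumptions, not the paper's actual configuration or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H, U = 10, 8, 6   # toy vocabulary size, RBM hidden units, RNN state size

# RBM parameters (visible layer = one-hot word, hidden layer = binary features)
W  = rng.normal(0, 0.1, (H, V))   # RBM weights
bv = np.zeros(V)                  # static visible bias
bh = np.zeros(H)                  # static hidden bias

# RNN parameters that condition the RBM biases on the word history
Wuu = rng.normal(0, 0.1, (U, U))  # state -> state
Wvu = rng.normal(0, 0.1, (U, V))  # word  -> state
Wuv = rng.normal(0, 0.1, (V, U))  # state -> visible-bias offset
Wuh = rng.normal(0, 0.1, (H, U))  # state -> hidden-bias offset

def free_energy(v, bv_t, bh_t):
    """RBM free energy of a one-hot visible vector v (binary hidden units)."""
    return -bv_t @ v - np.sum(np.logaddexp(0.0, bh_t + W @ v))

def next_word_distribution(history):
    """P(w_t | w_1 .. w_{t-1}) under the history-conditioned RBM."""
    u = np.zeros(U)
    for w in history:                 # roll the RNN over the preceding words
        u = np.tanh(Wuu @ u + Wvu @ np.eye(V)[w])
    bv_t = bv + Wuv @ u               # history-dependent RBM biases
    bh_t = bh + Wuh @ u
    scores = np.array([-free_energy(np.eye(V)[w], bv_t, bh_t)
                       for w in range(V)])
    scores -= scores.max()            # softmax over the vocabulary
    p = np.exp(scores)
    return p / p.sum()

p = next_word_distribution([3, 1, 4])
print(p.sum())   # probabilities over the vocabulary sum to 1
```

In a real system the parameters would be trained (e.g. by contrastive divergence plus backpropagation through time) and the word space would be far larger; the sketch only shows how recurrent state lets the RBM's next-word distribution depend on the whole preceding context rather than a fixed N-gram window.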

Key words: speech recognition, language model, neural network, recurrent neural network-restricted Boltzmann machine, relevance information


