Abstract:
Recently, dynamic Bayesian network (DBN) based speech recognition has attracted increasing interest because of its interpretability, factorization and extensibility, which hidden Markov models (HMMs) lack. Although DBNs have been introduced into speech recognition with considerable success in many areas and have shown promising potential to overcome the inherent limitations of HMMs, previous work on DBN based speech recognition mainly focuses on isolated word recognition, and the frameworks and recognition algorithms for DBN based continuous speech recognition are not as mature and flexible as those for HMM based continuous speech recognition. This paper addresses the problems of flexibility and extensibility in DBN based continuous speech recognition. To this end, the token passing model, which addresses these problems very well for HMM based continuous speech recognition, is adapted to the DBN based case, and a general framework based on it is proposed. This framework combines the advantages of the token passing model and DBNs. A novel recognition algorithm that is independent of the upper-layer language model is proposed under this framework, and a toolkit, DTK, for building DBN based speech recognition systems under this framework is developed.