Abstract:
Perceptron is one type of neural networks (NN) which can acquire the ability of pattern recognition by supervised learning. In this paper, two perceptron training rules for language modeling (LM) are introduced as an alternative to the traditional training method such as maximum likelihood estimation (MLE). Variants of perceptron learning algorithms are presented and the impact of different training parameters on performance is discussed. Since there is a strict restriction on the language model size, feature selection is conducted based on the empirical risk minimization (ERM) principle before modeling. The model performance is evaluated in the task of Japanese kana-kanji conversion which converts phonetic strings into the appropriate word strings. An empirical study on the variants of perceptron learning algorithms is conducted based on the two training rules, and the results also show that perceptron methods outperform substantially the traditional methods for LM.