Abstract:
Spoken language understanding (SLU) is one of the key components of a spoken dialogue system. One challenge for SLU is robustness, since the speech recognizer inevitably makes errors and spoken language is plagued by a wide range of spontaneous speech phenomena. Another challenge is portability: traditional rule-based SLU approaches require linguistic experts to handcraft a domain-specific grammar for parsing, which is time-consuming and laborious. A new SLU approach based on two-stage classification is proposed. First, a topic classifier is used to identify the topic of an input utterance. Then, restricted to the recognized target topic, semantic slot classifiers are trained to extract the corresponding slot-value pairs. The advantage of the proposed approach is that it is mainly data-driven and requires only a minimally annotated corpus for training, whilst retaining robustness and depth of understanding for spoken language. Experiments have been conducted in the Chinese public transportation information inquiry domain and the English DARPA Communicator domain. The good performance demonstrates the viability of the proposed approach.
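The two-stage idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy corpus, the topic names (`ShowRoute`, `ShowFare`), the slot names, the add-one-smoothed naive Bayes topic classifier, and the previous-word slot tagger are simple stand-ins for whatever classifiers and features the authors actually use. The sketch only shows the control flow: the topic is predicted first, and slot extraction then uses only the slot model trained for that topic.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (utterance, topic, {slot: single-token value}).
TRAIN = [
    ("how do i get from airport to museum", "ShowRoute",
     {"origin": "airport", "dest": "museum"}),
    ("route from station to zoo", "ShowRoute",
     {"origin": "station", "dest": "zoo"}),
    ("what is the fare to airport", "ShowFare", {"dest": "airport"}),
    ("how much is the fare to zoo", "ShowFare", {"dest": "zoo"}),
]


def train(data):
    """Collect counts for the topic classifier and the per-topic slot taggers."""
    topic_docs = Counter()                 # number of utterances per topic
    word_counts = defaultdict(Counter)     # topic -> word -> count
    # topic -> previous word -> label -> count ("O" means "not a slot value")
    slot_counts = defaultdict(lambda: defaultdict(Counter))
    for text, topic, slots in data:
        tokens = text.split()
        topic_docs[topic] += 1
        word_counts[topic].update(tokens)
        value_to_slot = {value: name for name, value in slots.items()}
        prev = "<s>"
        for tok in tokens:
            slot_counts[topic][prev][value_to_slot.get(tok, "O")] += 1
            prev = tok
    vocab = {w for counts in word_counts.values() for w in counts}
    return topic_docs, word_counts, slot_counts, vocab


def classify_topic(tokens, model):
    """Stage 1: naive Bayes over words with add-one smoothing (a stand-in)."""
    topic_docs, word_counts, _, vocab = model
    total_docs = sum(topic_docs.values())

    def score(topic):
        n = sum(word_counts[topic].values())
        s = math.log(topic_docs[topic] / total_docs)
        for tok in tokens:
            s += math.log((word_counts[topic][tok] + 1) / (n + len(vocab)))
        return s

    return max(topic_docs, key=score)


def extract_slots(tokens, topic, model):
    """Stage 2: tag each token using only the slot model of the given topic."""
    slot_counts = model[2]
    slots = {}
    prev = "<s>"
    for tok in tokens:
        seen = slot_counts[topic][prev]
        label = seen.most_common(1)[0][0] if seen else "O"
        if label != "O":
            slots[label] = tok
        prev = tok
    return slots


def understand(text, model):
    """Run both stages and return (topic, slot-value pairs)."""
    tokens = text.lower().split()
    topic = classify_topic(tokens, model)
    return topic, extract_slots(tokens, topic, model)


MODEL = train(TRAIN)

if __name__ == "__main__":
    print(understand("route from airport to zoo", MODEL))
```

Because the slot tagger is selected by the predicted topic, each tagger only has to distinguish the slots that are meaningful for that topic, which is the restriction the two-stage design relies on.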