Abstract:
A key problem in practical speech processing is locating the endpoints of an input utterance precisely, so that non-speech regions are excluded. Although many studies have addressed this problem, the performance of existing voice activity detection (VAD) algorithms is still far from ideal. This paper proposes a robust VAD feature for car environments based on sample entropy (SampEn), an improved variant of approximate entropy (ApEn). We use the fuzzy C-means clustering algorithm and the Bayesian information criterion to estimate the thresholds of the SampEn feature, and apply a dual-threshold method for VAD. Experiments on the TIMIT continuous speech database show that, in car noise environments, the detection accuracy of both SampEn and ApEn is much higher than that of spectral entropy (SE) and energy spectral entropy (ESE). SampEn also outperforms ApEn, especially when the SNR is 0 dB or lower, with detection performance nearly 10% better than that of ApEn. The SampEn method therefore has good application prospects in the automotive field and can provide accurate VAD for car navigation.
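For readers unfamiliar with the feature, the following is a minimal sketch of the standard sample entropy computation on a single frame of samples. It is an illustration only and not necessarily the implementation used in this work: the function name `sample_entropy`, the embedding dimension `m = 2`, and the tolerance `r = 0.2` times the frame's standard deviation are assumed, commonly used defaults, and the framing, threshold estimation, and dual-threshold decision stages are not shown.

```python
import numpy as np


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a 1-D frame (illustrative sketch).

    m and r_factor are assumed defaults, not values taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n <= m + 1:
        raise ValueError("frame is too short for the chosen m")

    # Tolerance scaled by the frame's standard deviation (assumption).
    r = r_factor * np.std(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev (maximum-coordinate) distance between all template pairs.
        diff = np.abs(templates[:, None, :] - templates[None, :, :])
        dist = diff.max(axis=2)
        # Count each unordered pair once and exclude self-matches.
        upper = np.triu_indices(n - m, k=1)
        return np.count_nonzero(dist[upper] <= r)

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        # Undefined in this case; reported here as infinity by convention.
        return float("inf")
    return -np.log(a / b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(200)
    regular = np.sin(2 * np.pi * t / 25)   # highly regular signal
    noisy = rng.standard_normal(200)       # irregular signal
    print("regular:", sample_entropy(regular))
    print("noisy:  ", sample_entropy(noisy))
```

Unlike ApEn, this computation excludes self-matches, which is the main difference between the two measures. A regular signal yields a lower value than an irregular one, and a VAD built on this feature would compare the per-frame values against the estimated thresholds.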