Lecture Notes in Computer Science, 2007, Volume 4570/2007, 915-924, DOI: 10.1007/978-3-540-73325-6_91

An Improved Voice Activity Detection Algorithm for GSM Adaptive Multi-Rate Speech Codec Based on Wavelet and Support Vector Machine

Shi-Huang Chen, Yaotsu Chang and T. K. Truong

Abstract

This paper proposes an improved voice activity detection (VAD) algorithm for controlling discontinuous transmission (DTX) of the GSM adaptive multi-rate (AMR) speech codec. First, the original IIR filter bank and the open-loop pitch detector are replaced by a wavelet filter bank and a wavelet-based pitch detection algorithm, respectively. The proposed wavelet filter bank divides the input speech signal into 9 frequency bands so that the signal level in each sub-band can be calculated. In addition, the background noise in each sub-band can be estimated using the wavelet de-noising method. The wavelet filter bank is also used to detect correlated complex signals such as music. A support vector machine (SVM) is then trained to obtain an optimized non-linear VAD decision rule based on the sub-band power, noise level, pitch period, tone flag, and complex-signal warning flag of the input speech signal. With the trained SVM, the proposed VAD algorithm produces more accurate detection results. Experiments on the Aurora speech database show that the proposed algorithm achieves VAD performance superior to AMR VAD Option 1 and comparable with AMR VAD Option 2.

Keywords: GSM AMR, VAD, Wavelet, Support Vector Machine
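
The sub-band level calculation described in the abstract can be illustrated with a minimal sketch. It is not the authors' filter bank: it assumes an orthonormal Haar wavelet, a plain 8-level dyadic decomposition (which happens to yield 9 sub-bands), and a hypothetical 256-sample frame at 8 kHz. The function names `haar_step` and `subband_energies` are invented for this illustration, and the band layout of the actual algorithm may differ.

```python
import math


def haar_step(signal):
    """One orthonormal Haar analysis step: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def subband_energies(frame, levels=8):
    """Split a frame into levels + 1 sub-bands and return the energy of each.

    Order: finest detail band (highest frequencies) first,
    final approximation band (lowest frequencies) last.
    """
    if len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be divisible by 2**levels")
    energies = []
    approx = list(frame)
    for _ in range(levels):
        # Keep splitting the low-pass branch; record the high-pass energy.
        approx, detail = haar_step(approx)
        energies.append(sum(c * c for c in detail))
    energies.append(sum(c * c for c in approx))
    return energies


# Hypothetical input: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(256)]
band_energy = subband_energies(frame)
```

Because the Haar transform is orthonormal, the sub-band energies sum to the energy of the frame, so no signal power is lost or counted twice. A VAD of the kind described would pass such per-band values, together with the noise, pitch, tone, and complex-signal features, to the trained classifier; that step is not sketched here.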
