Volume 61, Number 1, pp. 21-37, DOI: 10.1007/s11265-008-0274-7

Monaural Speech Separation Based on Gain Adapted Minimum Mean Square Error Estimation

M. H. Radfar, R. M. Dansereau and W.-Y. Chan

From the issue entitled "Special Issue: Machine Learning for Signal Processing; Guest Editors: Ioannis Pitas, Vince Calhoun and Konstantinos Diamantaras"

Abstract

We present a new model-based monaural speech separation technique for separating two speech signals from a single recording of their mixture. This work addresses a fundamental limitation of current model-based monaural speech separation techniques, which assume that the data used in the training and test phases of the separation model have the same energy level. To overcome this limitation, we derive a gain adapted minimum mean square error estimator that estimates the sources under different signal-to-signal ratios. Specifically, the speakers’ gains are incorporated into the separation model as unknown parameters, and the estimator is then derived in terms of the source distributions and the signal-to-signal ratio. Experimental results show that the proposed system significantly improves separation performance compared with a similar model without gain adaptation and with a maximum likelihood estimator with gain estimation.

Keywords: Source separation - Model-based monaural speech separation - Minimum mean square error estimation - Gain adaptation - Mixmax approximation
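
As a rough illustration of two notions named above, the signal-to-signal ratio and the mixmax approximation, the following Python sketch may help. It is not the authors' estimator. It assumes the signal-to-signal ratio is the energy ratio of the two sources expressed in dB, and it uses the mixmax approximation in its commonly stated form, in which the log spectrum of a mixture is approximated by the elementwise maximum of the source log spectra. All function names and the synthetic Gaussian "signals" are hypothetical stand-ins chosen for the example.

```python
import math
import random


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


def ssr_db(x1, x2):
    """Signal-to-signal ratio of x1 relative to x2, in dB (assumed definition)."""
    return 10.0 * math.log10(energy(x1) / energy(x2))


def mix_at_ssr(x1, x2, target_db):
    """Scale x2 so the pair has the requested SSR, then add sample by sample.

    Returns (mixture, g2), where g2 is the gain applied to x2; x1 keeps gain 1.
    """
    g2 = math.sqrt(energy(x1) / (energy(x2) * 10.0 ** (target_db / 10.0)))
    scaled = [g2 * v for v in x2]
    return [a + b for a, b in zip(x1, scaled)], g2


def mixmax(log_spec1, log_spec2):
    """Mixmax approximation: mixture log spectrum ~ elementwise maximum."""
    return [max(a, b) for a, b in zip(log_spec1, log_spec2)]


# Synthetic stand-ins for two speech signals (hypothetical data).
random.seed(0)
x1 = [random.gauss(0.0, 1.0) for _ in range(1000)]
x2 = [random.gauss(0.0, 0.3) for _ in range(1000)]

mixture, g2 = mix_at_ssr(x1, x2, target_db=6.0)
achieved = ssr_db(x1, [g2 * v for v in x2])
```

Here `achieved` recovers the requested 6 dB up to floating-point error. A model trained at one energy level would see a different effective gain `g2` for every such mixture, which is the mismatch the gain adapted estimator is meant to handle.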

A preliminary version of this paper was presented at the IEEE Workshop on Machine Learning for Signal Processing (MLSP) held in Thessaloniki, Greece in August 2007.
