Lecture Notes in Computer Science, 2008, Volume 5246/2008, 403-410, DOI: 10.1007/978-3-540-87391-4_52

Accent and Channel Adaptation for Use in a Telephone-Based Spoken Dialog System

Kinfe Tadesse Mengistu and Andreas Wendemuth

Abstract

An utterance conveys not only the intended message but also information about the speaker’s gender, accent, age group, etc. In a spoken dialog system, this information can be used to improve speech recognition for a target group of users who share common vocal characteristics. In this paper, we describe various approaches to adapting acoustic models trained on native English data to the vocal characteristics of German-accented English speakers. We show that a significant performance boost can be achieved by using speaker adaptation techniques such as Maximum Likelihood Linear Regression (MLLR), Maximum a Posteriori (MAP) adaptation, and a combination of the two for the purpose of accent adaptation. We also show that a promising performance gain can be obtained through cross-language accent adaptation, where native German speech from a different application domain is used as enrollment data. Moreover, we show the use of MLLR for telephone channel adaptation.

Keywords  Accent adaptation - accented speech - channel adaptation - cross-language accent adaptation - MAP - MLLR - native speech
