An utterance conveys not only the intended message but also information about the speaker’s gender, accent, age group, etc.
In a spoken dialog system, this information can be used to improve speech recognition for a target group of users that share
common vocal characteristics. In this paper, we describe various approaches to adapt acoustic models trained on native English
data to the vocal characteristics of German-accented English speakers. We show that a significant performance boost can be achieved
by using speaker adaptation techniques such as Maximum Likelihood Linear Regression (MLLR), Maximum a Posteriori (MAP) adaptation,
and a combination of the two for accent adaptation. We also show that promising performance gains can be obtained
through cross-language accent adaptation, where native German speech from a different application domain is used as enrollment
data. Moreover, we demonstrate the use of MLLR for telephone channel adaptation.
Keywords Accent adaptation - accented speech - channel adaptation - cross-language accent adaptation - MAP - MLLR - native speech