Lecture Notes in Computer Science, 2003, Volume 2705/2003, 38-77, DOI: 10.1007/978-3-540-45115-0_3

A Tutorial on Pronunciation Modeling for Large Vocabulary Speech Recognition

Eric Fosler-Lussier

Abstract

Over the past decade, automatic speech recognition (ASR) research has progressed from recognizing read speech to recognizing spontaneous conversational speech, prompting some in the field to re-evaluate ASR pronunciation models and their role in capturing the increased phonetic variability of unscripted speech. Two basic approaches to modeling pronunciation variation have emerged: encoding linguistic knowledge to pre-specify possible alternative pronunciations of words, and deriving alternatives directly from a pronunciation corpus. This tutorial is intended to ground the reader in the basic linguistic concepts of phonetics and phonology that guide both techniques and to outline several pronunciation modeling strategies that have been employed through the years. The chapter concludes with a summary of some promising recent research directions.
