Continuous audio-visual speech recognition

Juergen Luettin (1) and Stéphane Dupont (2, 1)

(1)  IDIAP - Dalle Molle Institute for Perceptual Artificial Intelligence, Rue du Simplon 4, CH-1920 Martigny, Switzerland
(2)  Faculté Polytechnique de Mons - TCTS 31, Bld. Dolez, B-7000 Mons, Belgium
Abstract
We address the problems of robust lip tracking, visual speech feature extraction, and sensor integration for audio-visual speech recognition applications. An appearance-based model of the articulators, which represents linguistically important features, is learned from example images and is used to locate, track, and recover visual speech information. We tackle the problem of joint temporal modelling of the acoustic and visual speech signals by applying multi-stream hidden Markov models. This approach allows the use of different temporal topologies and levels of stream integration and hence enables temporal dependencies to be modelled more accurately. The system has been evaluated on a continuously spoken digit recognition task with 37 subjects.
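
As an illustration of what stream integration can mean, the sketch below combines per-stream log-likelihoods for a single HMM state using a weighted sum, which is one common state-synchronous form of multi-stream modelling. It is not taken from the paper: the diagonal Gaussian emission model, the function names, the feature values, and the stream weights are all assumptions chosen for the example, and the authors' system may integrate the streams at a different level.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def combined_log_likelihood(stream_obs, stream_models, weights):
    """Weighted sum of per-stream log-likelihoods for one HMM state.

    stream_obs    : one feature vector per stream (e.g. audio, visual)
    stream_models : one (mean, var) pair per stream (hypothetical parameters)
    weights       : one non-negative weight per stream (assumed values)
    """
    return sum(
        w * gaussian_logpdf(obs, mean, var)
        for obs, (mean, var), w in zip(stream_obs, stream_models, weights)
    )


# Hypothetical one-dimensional audio and visual features for one frame.
audio_obs, visual_obs = [0.0], [0.0]
state_models = [([0.0], [1.0]), ([0.0], [1.0])]  # (mean, var) per stream
stream_weights = [0.7, 0.3]                      # assumed reliability weights

score = combined_log_likelihood(
    [audio_obs, visual_obs], state_models, stream_weights
)
```

Raising the audio weight makes the acoustic stream dominate the state score, and lowering it shifts reliance towards the visual stream, which is the usual motivation for weighting streams when the acoustic signal is noisy.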

Juergen Luettin
Email: luettin@idiap.ch

Stéphane Dupont
Email: dupont@tcts.fpms.ac.be