Continuous audio-visual speech recognition
Juergen Luettin (1) and Stéphane Dupont (2)
(1) IDIAP - Dalle Molle Institute for Perceptual Artificial Intelligence, Rue du Simplon 4, CH-1920 Martigny, Switzerland
(2) Faculté Polytechnique de Mons - TCTS, 31, Bld. Dolez, B-7000 Mons, Belgium
Abstract
We address the problem of robust lip tracking, visual speech feature extraction, and sensor integration for audio-visual speech
recognition applications. An appearance-based model of the articulators, which represents linguistically important features,
is learned from example images and is used to locate, track, and recover visual speech information. We tackle the problem
of joint temporal modelling of the acoustic and visual speech signals by applying Multi-Stream hidden Markov models. This
approach allows the use of different temporal topologies and levels of stream integration and hence allows temporal
dependencies to be modelled more accurately. The system has been evaluated on a continuously spoken digit recognition task with 37 subjects.
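
As a rough illustration of stream integration, the sketch below combines per-word log-likelihoods from an acoustic and a visual stream with a weighted sum and picks the best-scoring word. This is a minimal sketch under stated assumptions, not the authors' system: the function names, the single fixed stream weight, and the toy scores are all hypothetical, and the abstract does not specify how the streams are weighted or at which level they are recombined.

```python
def combine_stream_scores(audio_loglik, visual_loglik, audio_weight=0.7):
    """Weighted sum of per-word log-likelihoods from two streams.

    audio_weight is a hypothetical reliability weight in [0, 1];
    the visual stream receives the remainder (1 - audio_weight).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_loglik.keys() != visual_loglik.keys():
        raise ValueError("both streams must score the same set of words")
    visual_weight = 1.0 - audio_weight
    return {
        word: audio_weight * audio_loglik[word] + visual_weight * visual_loglik[word]
        for word in audio_loglik
    }


def recognize(audio_loglik, visual_loglik, audio_weight=0.7):
    """Return the word with the highest combined score."""
    combined = combine_stream_scores(audio_loglik, visual_loglik, audio_weight)
    return max(combined, key=combined.get)


# Toy scores (invented for illustration): the acoustic stream slightly
# prefers "nine", while the visual stream strongly prefers "one".
audio_scores = {"one": -10.0, "nine": -9.0}
visual_scores = {"one": -2.0, "nine": -8.0}

best_combined = recognize(audio_scores, visual_scores, audio_weight=0.7)
best_audio_only = recognize(audio_scores, visual_scores, audio_weight=1.0)
```

With these toy numbers the audio-only decision and the combined decision differ, which is the kind of case where a second stream can help, for example when the acoustic signal is noisy. In a Multi-Stream HMM the same kind of combination would be applied inside the decoder at chosen recombination points rather than once per word.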