Lecture Notes in Computer Science, 2007, Volume 4343/2007, 21-46, DOI: 10.1007/978-3-540-74200-5_2

Speaker Classification Concepts: Past, Present and Future

David R. Hill

Abstract

Speaker classification requires a sufficiently accurate functional description of speaker attributes and the resources used in speaking, to be able to produce new utterances mimicking the speaker’s current physical, emotional and cognitive state, with the correct dialect, social class markers and speech habits. We lack adequate functional knowledge of why and how speakers produce the utterances they do, as well as adequate theoretical frameworks embodying the kinds of knowledge, resources and intentions they use. Rhythm and intonation, intimately linked in most languages, provide a wealth of information relevant to speaker classification. Functional, as opposed to descriptive, models are needed. Segmental cues to speaker category, and markers for categories like fear, uncertainty, urgency, and confidence, are largely unresearched. What Ekman and Friesen did for facial expression must be done for verbal expression. The chapter examines some potentially profitable research possibilities in context.

Keywords  voice morphing - impersonation - mimicry - socio-phonetics - speech forensics - speech research tools - speaker classification - speech segments - speech prosody - intonation - rhythm - formant sensitivity analysis - face recognition - emotional intelligence - dialogue dynamics - gnuspeech
