Lecture Notes in Computer Science, 2001, Volume 2130/2001, 57-64, DOI: 10.1007/3-540-44668-0_9

Symbolic Prosody Modeling by Causal Retro-causal NNs with Variable Context Length

Achim F. Müller and Hans Georg Zimmermann


Abstract

In this paper the application of causal retro-causal neural networks (NN) to accent label prediction for speech synthesis is presented. Within the proposed NN architecture, gating clusters are applied, enabling dynamic adaptation of the network structure depending on the actual input to the NN. In the proposed causal retro-causal NN, gating clusters are used to adapt the network structure such that the network has a variable context length. This way, only the input feature vectors actually available in the current context window are processed. The proposed NN architecture has been successfully applied to accent label prediction within our text-to-speech (TTS) system. Prediction accuracy is about 83%. This is higher than results achieved with tree-based (CART) methods on a corpus of similar complexity.
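
The gating idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the function names, the scalar weights `w_state` and `w_in`, the tanh update, and the use of a 0/1 gate per window position are all assumptions made for illustration. The sketch runs one causal (left-to-right) and one retro-causal (right-to-left) pass over a context window and skips every position whose gate is closed, so padded slots outside the available context do not influence the result.

```python
import math


def gated_step(state, features, gate, w_state, w_in):
    """One recurrent update (hypothetical form).

    A closed gate (0) means the window slot holds no real input,
    so the state is passed through unchanged.
    """
    if gate == 0:
        return state
    return [math.tanh(w_state * s + w_in * x) for s, x in zip(state, features)]


def causal_retro_causal(window, gates, w_state=0.5, w_in=1.0):
    """Combine a forward and a backward pass over a gated context window.

    window: list of feature vectors, one per context position
    gates:  list of 0/1 flags, 1 where a real feature vector is available
    Weights are illustrative scalars, not trained parameters.
    """
    dim = len(window[0])

    # Causal pass: information flows from past positions to future ones.
    forward = [0.0] * dim
    for features, gate in zip(window, gates):
        forward = gated_step(forward, features, gate, w_state, w_in)

    # Retro-causal pass: information flows from future positions to past ones.
    backward = [0.0] * dim
    for features, gate in zip(reversed(window), reversed(gates)):
        backward = gated_step(backward, features, gate, w_state, w_in)

    # Concatenate both summaries; a classifier would consume this vector.
    return forward + backward


# A window padded at both ends, with the padding gated off.
padded = causal_retro_causal(
    [[9.0, 9.0], [1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [0, 1, 1, 0]
)
# The same content without padding.
trimmed = causal_retro_causal([[1.0, 2.0], [3.0, 4.0]], [1, 1])
```

In this sketch `padded` and `trimmed` are identical, which is the sense in which the effective context length varies with the input: a fixed-size window behaves like a shorter one whenever some of its slots are gated off.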
