Publication

Audiovisual speech synthesis: An overview of the state-of-the-art

Journal contribution - Journal article

We live in a world where there are countless interactions with computer systems in everyday situations. Ideally, this interaction feels as familiar and as natural as the communication we experience with other humans. To this end, an ideal means of communication between a user and a computer system consists of audiovisual speech signals. Audiovisual text-to-speech technology allows the computer system to utter any spoken message towards its users. Over the last decades, a wide range of techniques for performing audiovisual speech synthesis has been developed. This paper gives a comprehensive overview of these approaches, using a categorization of the systems based on multiple important aspects that determine the properties of the synthesized speech signals. The paper makes a clear distinction between the techniques that are used to model the virtual speaker and the techniques that are used to generate the appropriate speech gestures. In addition, the paper discusses the evaluation of audiovisual speech synthesizers, elaborates on the hardware requirements for performing visual speech synthesis, and describes some important future directions that should stimulate the use of audiovisual speech synthesis technology in real-life applications.
Journal: Speech Commun
ISSN: 0167-6393
Volume: 66
Pages: 182-217
Year of publication: 2015
Keywords: Audiovisual speech synthesis, Visual speech synthesis, Speech synthesis
CSS-citation score: 2
Accessibility: Open