Publication

The VUB Blizzard Challenge 2010 Entry: Towards Automatic Voice Building

Book contribution - Book chapter, conference contribution

In this paper we describe the voices we submitted to the 2010 Blizzard Challenge, a yearly challenge to evaluate auditory speech synthesis on common data. One of the goals of a data-driven synthesizer, such as ours, is to generalize the speech database in such a way that it allows a realistic rendition of unseen input text. The two main changes to our system, compared to previous submissions, are the inclusion of an HMM-based acoustic prosody model and the automatic training of context-dependent target cost weights. These weights are estimated for each individual target during synthesis and depend on the linguistic features of that target, which encompass its broader linguistic context. Another new aspect of our synthesizer is the ability to synthesize Mandarin Chinese speech. Its evaluation helps us assess the quality of our synthesizer for languages unfamiliar to the voice developers. Evaluation results and possible improvements to our synthesizer are also discussed.

Book: Blizzard Challenge 2010, Kansai Science City, Japan

Year of publication: 2010

Keywords: speech synthesis, unit selection, weight training, evaluation

