DI-UMONS: Institutional Repository of the University of Mons

2021-11-25 - Article/In a peer-reviewed journal - English - page(s)

Tits Noé, El Haddad Kevin, Dutoit Thierry, "Analysis and Assessment of Controllability of an Expressive Deep Learning-Based TTS System" in Informatics (MDPI), 8, 4, 00084

  • CREF codes: Artificial intelligence (DI1180), Information and communication technologies (ICT) (DI4730)
  • UMONS research units: Information, Signal and Artificial Intelligence (F105)
  • UMONS institutes: NUMEDIART Institute for Digital Arts Technologies (Numédiart)

Abstract:

(English) In this paper, we study the controllability of an expressive TTS system trained on a dataset for continuous control. The dataset is the Blizzard 2013 dataset, based on audiobooks read by a female speaker and containing great variability in style and expressiveness. Controllability is evaluated with both an objective and a subjective experiment. The objective assessment is based on a measure of correlation between acoustic features and the dimensions of the latent space representing expressiveness. The subjective assessment is based on a perceptual experiment in which users are shown an interface for controllable expressive TTS and asked to retrieve a synthetic utterance whose expressiveness subjectively corresponds to that of a reference utterance.
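
The objective assessment described in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation measure is used, so Pearson correlation is an assumption here, and the latent dimensions ("dim_0", "dim_1"), the feature names ("mean_f0", "energy"), and the synthetic data are all hypothetical placeholders rather than values from the paper.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_table(latents, features):
    """Correlate every latent dimension with every acoustic feature.

    latents, features: dicts mapping a name to one value per utterance.
    Returns a dict keyed by (latent_name, feature_name).
    """
    return {
        (dim, feat): pearson(dim_values, feat_values)
        for dim, dim_values in latents.items()
        for feat, feat_values in features.items()
    }


# Synthetic stand-in data (hypothetical): 200 utterances, 2 latent dimensions.
rng = random.Random(0)
n_utterances = 200
latents = {
    "dim_0": [rng.gauss(0.0, 1.0) for _ in range(n_utterances)],
    "dim_1": [rng.gauss(0.0, 1.0) for _ in range(n_utterances)],
}
features = {
    # "mean_f0" is constructed to follow dim_0 closely, plus a little noise.
    "mean_f0": [3.0 * z + 0.1 * rng.gauss(0.0, 1.0) for z in latents["dim_0"]],
    # "energy" is independent noise, unrelated to either latent dimension.
    "energy": [rng.gauss(0.0, 1.0) for _ in range(n_utterances)],
}

table = correlation_table(latents, features)
for (dim, feat), r in sorted(table.items()):
    print(f"{dim} vs {feat}: r = {r:+.3f}")
```

In this toy setup, a latent dimension that strongly correlates with an acoustic feature would be read as a dimension along which that feature can be controlled; real use would replace the synthetic lists with features extracted from synthesized speech.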

Identifiers:
  • DOI: 10.3390/informatics8040084