DI-UMONS: Institutional Repository of the University of Mons

2014-06-10 - Conference/Article in peer-reviewed proceedings - English - 5 page(s)

Brognaux Sandrine, Picart Benjamin, Drugman Thomas, "Speech synthesis in various communicative situations: Impact of pronunciation variations" in Interspeech 2014, Singapore, Singapore, 2014

  • CREF codes: Engineering sciences (DI2000), Low-current electricity (DI2500)
  • UMONS research units: Information, Signal and Artificial Intelligence (F105)
  • UMONS institutes: NUMEDIART Institute for Digital Arts Technologies (Numédiart)

Abstract(s):

(English) While current research in speech synthesis focuses on the generation of various speaking styles or emotions, very few studies have addressed the possibility of including phonetic variations according to the communicative situation of the target speech (sports commentaries, TV news, etc.). However, significant phonetic variations have been observed, depending on various communicative factors (e.g., spontaneous or read speech, media broadcast or not). This study analyzes whether these alternative pronunciations contribute to the plausibility of the message and should therefore be considered in synthesis. To this end, subjective tests are performed on synthesized French sports commentaries. They compare HMM-based speech synthesis using genuine pronunciation with synthesis using neutral NLP-produced phonetization. Results show that integrating the phonetic variations significantly improves the perceived naturalness of the generated speech. They also highlight the relative importance of the various types of variations and show that schwa elisions, in particular, play a crucial role in that respect.


Keywords:
  • (English) Sports commentaries
  • (English) Phonetic variations
  • (English) HMM-based speech synthesis
  • (English) Communicative situation