DI-UMONS: Institutional Repository of the University of Mons

2007-04-16 - Conference/Article in peer-reviewed proceedings - English - 4 page(s)

Dutoit Thierry, Holzapfel A., Jottrand Matthieu, Moinet Alexis, Perez Javier, Stylianou Y., "Towards a Voice Conversion System Based on Frame Selection" in ICASSP 2007 - International Conference on Acoustics, Speech and Signal Processing, volume 4, pp. 513-516, Honolulu, Hawaii, USA, 2007

  • CREF codes: Low-current electrical engineering (DI2500)
  • UMONS research units: Circuit Theory and Signal Processing (F105)
  • UMONS institutes: Research Institute for Information Technology and Computer Science (InforTech)
Full text:

Abstract(s):

(English) The subject of this paper is the conversion of a given speaker's voice (the source speaker) into another identified voice (the target speaker). We assume that a large amount of speech samples from the source and target voices is available, with at least part of them being parallel. The proposed system is built on a mapping function between source and target spectral envelopes, followed by a frame selection algorithm that produces the final spectral envelopes. Converted speech is produced by a basic linear prediction (LP) analysis of the source and LP synthesis using the converted spectral envelopes. We compared three types of conversion: without mapping; with mapping, using the excitation of the source speaker; and with mapping, using the excitation of the target speaker. Results show that the combination of mapping and frame selection provides the best results, and underline the value of working on methods to convert the LP excitation.
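
The frame selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the cost functions or the search procedure, so everything below is an assumption made for illustration: the names `select_frames`, `mapped`, `database` and `concat_weight` are hypothetical, the target and concatenation costs are plain squared Euclidean distances, and the search is a standard Viterbi dynamic program. It is not the authors' implementation, and it builds dense cost matrices, so it only suits toy-sized data.

```python
import numpy as np


def select_frames(mapped, database, concat_weight=1.0):
    """Pick one database frame per mapped frame with a Viterbi search.

    mapped:   (T, D) array, source envelopes after the mapping function
              (hypothetical input; the mapping itself is not shown).
    database: (N, D) array, spectral envelopes of the target speaker.
    Returns a list of T indices into `database`.
    """
    mapped = np.asarray(mapped, dtype=float)
    database = np.asarray(database, dtype=float)
    T, N = len(mapped), len(database)

    # Assumed target cost: squared distance from each mapped frame
    # to each candidate database frame, shape (T, N).
    target = ((mapped[:, None, :] - database[None, :, :]) ** 2).sum(axis=2)
    # Assumed concatenation cost: squared distance between any two
    # candidate frames, shape (N, N).
    concat = ((database[:, None, :] - database[None, :, :]) ** 2).sum(axis=2)

    cost = target[0].copy()             # best cost ending in each candidate
    back = np.zeros((T, N), dtype=int)  # best predecessor per step
    for t in range(1, T):
        # total[i, j]: cost of being at candidate i, then moving to j.
        total = cost[:, None] + concat_weight * concat
        back[t] = total.argmin(axis=0)
        cost = total.min(axis=0) + target[t]

    # Trace the cheapest path backwards from the last frame.
    path = [int(cost.argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With `concat_weight=0` the search reduces to an independent nearest-neighbour lookup per frame; larger values trade closeness to the mapped envelope for smoother transitions between consecutive selected frames.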

Notes:
  • (English) See also: proceedings of eNTERFACE 2006: "Multimodal Speaker Conversion - his master's voice... and face -"
Identifiers:
  • DOI: 10.1109/ICASSP.2007.366962