DI-UMONS: Institutional Repository of the University of Mons

2017-10-23 - Conference/Article in peer-reviewed proceedings - English - 12 page(s)

Pironkov Gueorgui, Dupont Stéphane, Wood S. U. N., Dutoit Thierry, "Noise and Speech Estimation As Auxiliary Tasks for Robust Speech Recognition" in International Conference on Statistical Language and Speech Processing, 5, 181-192, Le Mans, France, 2017

  • CREF codes: Information theory (DI1161), Low-current electrical engineering (DI2500)
  • UMONS research units: Circuit Theory and Signal Processing (F105)
  • UMONS institutes: Research Institute for Information Technology and Computer Science (InforTech), NUMEDIART Institute for Digital Arts Technologies (Numédiart)

Abstract(s):

(English) Noise that degrades speech remains a major problem for automatic speech recognition. One interesting approach to this problem is multi-task learning, in which an efficient auxiliary task is clean-speech generation. This auxiliary task is trained in addition to the main speech recognition task, and its goal is to help improve the results of the main task. In this paper, we investigate this idea further by generating features extracted directly from the audio file containing only the noise, instead of the clean speech. After demonstrating that an improvement can be obtained through this auxiliary task, we also show that using both noise and clean-speech estimation auxiliary tasks leads to a 4% relative word error rate improvement in comparison to classic single-task learning on the CHiME4 dataset.
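
The multi-task setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of cross-entropy for the main task and mean squared error for the auxiliary tasks, and the weights `w_clean` and `w_noise` are all assumptions made for illustration. It shows one plausible way to combine a main recognition loss with clean-speech and noise estimation losses, and how a relative word error rate improvement is computed.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between row-wise logits and integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def mean_squared_error(pred, target):
    """Mean squared error over all frames and feature dimensions."""
    return ((pred - target) ** 2).mean()

def multitask_loss(main_logits, state_labels,
                   clean_pred, clean_target,
                   noise_pred, noise_target,
                   w_clean=0.1, w_noise=0.1):
    """Main recognition loss plus two weighted auxiliary regression losses.

    The weights are hypothetical; the abstract does not state how the
    tasks are balanced.
    """
    main = softmax_cross_entropy(main_logits, state_labels)
    aux_clean = mean_squared_error(clean_pred, clean_target)
    aux_noise = mean_squared_error(noise_pred, noise_target)
    return main + w_clean * aux_clean + w_noise * aux_noise

def relative_wer_improvement(baseline_wer, new_wer):
    """Relative improvement, e.g. 0.04 for a 4% relative gain."""
    return (baseline_wer - new_wer) / baseline_wer

# Illustrative numbers only, not results from the paper.
gain = relative_wer_improvement(25.0, 24.0)
```

In a setup of this kind the auxiliary outputs are typically used only during training and discarded at recognition time, so only the main output is evaluated.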