DI-UMONS: Institutional Repository of the University of Mons

2018-10-15 - Conference/Article in peer-reviewed proceedings - English - 9 page(s)

Christodoulides George, "Forced Alignment of the Phonologie du Français Contemporain Corpus" in International Conference on Statistical Language and Speech Processing, Mons, Belgium, 2018

  • CREF codes: Applied computing and software (DI2570), General linguistics (DI5310)
  • UMONS research units: Métrologie et Sciences du langage (P362)
  • UMONS institutes: Institut de recherche en sciences et technologies du langage (Langage)

Abstract:

(English) The Phonologie du Français Contemporain project is an international, collaborative research effort to create resources for the study of contemporary French phonology. It has produced a large, partially transcribed and annotated corpus of spoken French, consisting of approximately 300 hours of recordings and covering 48 geographical regions (including Metropolitan France, Belgium, Switzerland, Canada, and French-speaking countries of Africa). Following a detailed protocol, speakers read aloud a word list and a short text, and engage in guided and spontaneous conversation with an interviewer. The corpus presents several challenges: significant regional accent variation; variable recording quality and different types of environmental noise; variation in speaker characteristics (age, sex); and interspersed segments of overlapping speech. In this article, we describe the procedure followed to address these challenges and produce an automatic forced alignment of the corpus at the phone, syllable, and token levels, starting from the initial transcriptions.