1 |
Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages
|
|
|
|
In: Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014) ; https://halshs.archives-ouvertes.fr/halshs-00980431 ; May 2014, St Petersburg, Russia. pp.153-160 (2014)
|
|
Abstract:
Automatic speech processing technologies hold great potential to facilitate the urgent task of documenting the world's languages. The present research aims to explore the application of speech recognition tools to a little-documented language, with a view to facilitating processes of annotation, transcription and linguistic analysis. The target language is Yongning Na (a.k.a. Mosuo), an unwritten Sino-Tibetan language with fewer than 50,000 speakers. An acoustic model of Na was built using CMU Sphinx. In addition to this 'light' model, trained on a small data set (only 4 hours of speech from 1 speaker), 'heavyweight' models from five national languages (English, French, Chinese, Vietnamese and Khmer) were also applied to the same data. Preliminary results are reported, and perspectives for the long road ahead are outlined.
|
|
Keywords:
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; Acoustic models; automatic speech recognition (ASR); crosslingual acoustic modelling and adaptation; endangered languages; language portability; multilingual modelling; Naish languages; statistical language modeling; under-resourced languages; Yongning Na
|
|
URL: https://halshs.archives-ouvertes.fr/halshs-00980431 https://halshs.archives-ouvertes.fr/halshs-00980431v2/document https://halshs.archives-ouvertes.fr/halshs-00980431v2/file/SLTU2014_Do_Michaud_Castelli_FINAL.pdf
|
|
BASE
|
|
|
|
2 |
YAST : A scalable ASR toolkit especially designed for under-resourced languages
|
|
|
|
In: International Conference on Asian Language Processing (IALP) ; https://hal.archives-ouvertes.fr/hal-01315532 ; Nov 2012, Hanoi, Vietnam. ⟨10.1109/IALP.2012.65⟩ (2012)
|
|
BASE
|
|
|
|
3 |
Extraction a parallel corpus for machine translation from and to under-resourced languages ; Extraction de corpus parallèle pour la traduction automatique depuis et vers une langue peu dotée
|
|
|
|
In: https://tel.archives-ouvertes.fr/tel-00680046 ; Other [cs.OH]. Université de Grenoble; Université de Hanoi, Vietnam, 2011. In French. ⟨NNT : 2011GRENM065⟩ (2011)
|
|
BASE
|
|
|
|
4 |
Exploitation d'un corpus bilingue comparable pour la création d'un système de traduction probabiliste Vietnamien - Français
|
|
|
|
In: TALN 2009, Senlis, 24-26 June 2009 ; https://hal.archives-ouvertes.fr/hal-00959202 (2009)
|
|
BASE
|
|
|
|
5 |
Mining a comparable text corpus for a Vietnamese - French statistical machine translation system
|
|
|
|
In: Fourth Workshop on Statistical Machine Translation, 2009, Athens, Greece. pp.165-172, ⟨10.3115/1626431.1626466⟩ ; https://hal.archives-ouvertes.fr/hal-01393602 ; http://www.statmt.org/wmt09/ (2009)
|
|
BASE
|
|
|
|
6 |
TOWARDS THE AUTOMATIC PROCESSING OF YONGNING NA (SINO-TIBETAN): DEVELOPING A ‘LIGHT’ ACOUSTIC MODEL OF THE TARGET LANGUAGE AND TESTING ‘HEAVYWEIGHT’ MODELS FROM FIVE NATIONAL LANGUAGES
|
|
|
|
In: http://hal.univ-grenoble-alpes.fr/docs/00/99/59/04/PDF/SLTU2014_Do_Michaud_Castelli_FINAL.pdf
|
|
BASE
|
|
|
|
|
|