1. Parallel text dataset for Neural Machine Translation (French -> Fongbe, French -> Ewe) ...
2. Parallel text dataset for Neural Machine Translation (French -> Fongbe, French -> Ewe) ...
3. Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages
Nekoto, Wilhelmina; Marivate, Vukosi; Matsila, Tshinondiwa; Fasubaa, Timi; Kolawole, Tajudeen; Fagbohungbe, Taiwo; Akinola, Solomon Oluwole; Muhammad, Shamsuddeen Hassan; Kabongo, Salomon; Osei, Salomey; Freshia, Sackey; Niyongabo, Rubungo Andre; Macharm, Ricky; Ogayo, Perez; Ahia, Orevaoghene; Meressa, Musie; Adeyemi, Mofe; Mokgesi-Selinga, Masabata; Okegbemi, Lawrence; Martinus, Laura Jane; Tajudeen, Kolawole; Degila, Kevin; Ogueji, Kelechi; Siminyu, Kathleen; Kreutzer, Julia; Webster, Jason; Ali, Jamiil Toure; Abbott, Jade; Orife, Iroro; Ezeani, Ignatius; Dangana, Idris Abdulkabir; Kamper, Herman; Elsahar, Hady; Duru, Goodness; Kioko, Ghollah; Murhabazi, Espoir; Biljon, Elan van; Whitenack, Daniel; Onyefuluchi, Christopher; Emezue, Chris; Dossou, Bonaventure; Sibanda, Blessing; Bassey, Blessing Itoro; Olabiyi, Ayodele; Ramkilowan, Arshath; Öktem, Alp; Akinfaderin, Adewale; Bashir, Abdallah. - 2020
Abstract:
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a complex problem going beyond data availability and reflects systemic problems in society. In this paper, we focus on the task of Machine Translation (MT), which plays a crucial role for information accessibility and communication worldwide. Despite immense improvements in MT over the past decade, MT is centered around a few high-resourced languages. As MT researchers cannot solve the problem of low-resourcedness alone, we propose participatory research as a means to involve all necessary agents required in the MT development process. We demonstrate the feasibility and scalability of participatory research with a case study on MT for African languages. Its implementation leads to a collection of novel translation datasets, MT benchmarks for over 30 languages, with human evaluations for a third of them, and enables participants without formal training to make a unique scientific contribution. Benchmarks, models, data, code, and evaluation results are released under https://github.com/masakhane-io/masakhane-mt.
URL: https://eprints.lancs.ac.uk/id/eprint/150109/1/2010.02353v2.pdf https://eprints.lancs.ac.uk/id/eprint/150109/