2 | A Large-Scale Study of Machine Translation in the Turkic Languages ...
BASE
6 | A Prototype Free/Open-Source Morphological Analyser and Generator for Sakha ...
BASE
7 | Evaluating Multiway Multilingual NMT in the Turkic Languages ...
Mirzakhalov, Jamshidbek; Babu, Anoop; Kunafin, Aigiz; Wahab, Ahsan; Moydinboyev, Behzod; Ivanova, Sardana; Uzokova, Mokhiyakhon; Pulatova, Shaxnoza; Ataman, Duygu; Kreutzer, Julia; Tyers, Francis; Firat, Orhan; Licato, John; Chellappan, Sriram. arXiv, 2021
Abstract:
Despite the increasing number of large and comprehensive machine translation (MT) systems, evaluation of these methods in various languages has been restrained by the lack of high-quality parallel corpora as well as engagement with the people who speak these languages. In this study, we present an evaluation of state-of-the-art approaches to training and evaluating MT systems in 22 languages from the Turkic language family, most of which are extremely under-explored. First, we adopt the TIL Corpus with a few key improvements to the training and the evaluation sets. Then, we train 26 bilingual baselines as well as a multi-way neural MT (MNMT) model using the corpus and perform an extensive analysis using automatic metrics as well as human evaluations. We find that the MNMT model outperforms almost all bilingual baselines in the out-of-domain test sets, and that fine-tuning the model on a downstream task of a single pair also results in a huge performance boost in both low- and high-resource scenarios. Our ...
9 pages, 3 figures, 7 tables. To be presented at WMT 2021 ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2109.06262 https://arxiv.org/abs/2109.06262
BASE
8 | Do RNN States Encode Abstract Phonological Alternations? ...
BASE
10 | A morphological analyser for K’iche’ ; Un analizador morfológico para el idioma k’iche’
BASE
11 | Multi-script morphological transducers and transcribers for seven Turkic languages
In: Proceedings of the Workshop on Turkic and Languages in Contact with Turkic; Vol 5 (2020); 173-185 ; 2641-3485 (2021)
BASE
15 | SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection ...
BASE
16 | Dependency analysis of noun incorporation in polysynthetic languages ...
BASE
17 | Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection ...
BASE