4 |
Integrating Unsupervised Data Generation into Self-Supervised Neural Machine Translation for Low-Resource Languages ...
Abstract:
For most language combinations, parallel data is either scarce or simply unavailable. To address this, unsupervised machine translation (UMT) exploits large amounts of monolingual data by using synthetic data generation techniques such as back-translation and noising, while self-supervised NMT (SSNMT) identifies parallel sentences in smaller comparable data and trains on them. To date, the inclusion of UMT data generation techniques in SSNMT has not been investigated. We show that including UMT techniques in SSNMT significantly outperforms SSNMT and UMT on all tested language pairs, with improvements of up to +4.3 BLEU, +50.8 BLEU and +51.5 BLEU over SSNMT, statistical UMT and hybrid UMT, respectively, on Afrikaans to English. We further show that the combination of multilingual denoising autoencoding, SSNMT with back-translation and bilingual finetuning enables us to learn machine translation even for distant language pairs for which only small amounts of monolingual data are available, e.g. yielding BLEU scores ...
11 pages, 8 figures, accepted at MT-Summit 2021 (Research Track) ...
Keywords:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2107.08772 https://arxiv.org/abs/2107.08772
BASE
5 |
Comparing Feature-Engineering and Feature-Learning Approaches for Multilingual Translationese Classification ...
BASE
6 |
Investigating the Helpfulness of Word-Level Quality Estimation for Post-Editing Machine Translation Output ...
BASE
7 |
Multi-Head Highly Parallelized LSTM Decoder for Neural Machine Translation ...
BASE
8 |
Comparing Feature-Engineering and Feature-Learning Approaches for Multilingual Translationese Classification ...
BASE
9 |
Modeling Task-Aware MIMO Cardinality for Efficient Multilingual Neural Machine Translation ...
BASE
10 |
A Bidirectional Transformer Based Alignment Model for Unsupervised Word Alignment ...
BASE
11 |
Automatic classification of human translation and machine translation : a study from the perspective of lexical diversity
BASE
12 |
Transformer-based NMT : modeling, training and implementation
Xu, Hongfei. - : Saarländische Universitäts- und Landesbibliothek, 2021
BASE
13 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-02892154 ; Language Resources and Evaluation Conference, ELDA/ELRA, May 2020, Marseille, France ; https://lrec2020.lrec-conf.org/en/ (2020)
BASE
14 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe ...
BASE
15 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe ...
BASE
16 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe ...
BASE
17 |
Linguistically inspired morphological inflection with a sequence to sequence model ...
BASE
18 |
Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers ...
BASE
19 |
Language service provision in the 21st century: challenges, opportunities and educational perspectives for translation studies
In: ISBN: 9788869234934 ; Bologna Process beyond 2020: Fundamental values of the EHEA pp. 297-303 (2020)
BASE
20 |
Deep interactive text prediction and quality estimation in translation interfaces
In: Hokamp, Christopher M. (2018) Deep interactive text prediction and quality estimation in translation interfaces. PhD thesis, Dublin City University. (2018)
BASE