2. On the differences between human translations
In: Popović, Maja orcid:0000-0001-8234-8745 (2020) On the differences between human translations. In: 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020), 3-5 Nov 2020, Lisbon, Portugal (Online).
3. Neural machine translation between similar south-Slavic languages
In: Popović, Maja orcid:0000-0001-8234-8745 and Poncelas, Alberto orcid:0000-0002-5089-1687 (2020) Neural machine translation between similar south-Slavic languages. In: 2020 Fifth Conference on Machine Translation (WMT20), 19-20 Nov 2020, Dominican Republic (Online).
4. Rapid development of competitive translation engines for access to multilingual COVID-19 information
In: Way, Andy orcid:0000-0001-5736-5930, Haque, Rejwanul orcid:0000-0003-1680-0099, Xie, Guodong, Gaspari, Federico orcid:0000-0003-3808-8418, Popović, Maja orcid:0000-0001-8234-8745 and Poncelas, Alberto orcid:0000-0002-5089-1687 (2020) Rapid development of competitive translation engines for access to multilingual COVID-19 information. Informatics. ISSN 2227-9709
5. Machine translation of user-generated content
Lohar, Pintu. Dublin City University, School of Computing, 2020; Dublin City University, ADAPT, 2020
In: Lohar, Pintu (2020) Machine translation of user-generated content. PhD thesis, Dublin City University.
6. QRev: Machine translation of user reviews: what influences the translation quality?
In: Popović, Maja orcid:0000-0001-8234-8745 (2020) QRev: Machine translation of user reviews: what influences the translation quality? In: 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020), 3-5 Nov 2020, Lisbon, Portugal (Online).
7. Facilitating Access to Multilingual COVID-19 Information via Neural Machine Translation ...
8. Informative Manual Evaluation of Machine Translation Output ...
10. Evaluating conjunction disambiguation on English-to-German and French-to-German WMT 2019 translation hypotheses
In: Popović, Maja orcid:0000-0001-8234-8745 (2019) Evaluating conjunction disambiguation on English-to-German and French-to-German WMT 2019 translation hypotheses. In: 4th Conference on Machine Translation (WMT 2019), 1-2 August 2019, Florence, Italy.
11. Automated text simplification as a preprocessing step for machine translation into an under-resourced language
In: Štajner, Sanja orcid:0000-0002-7780-7035 and Popović, Maja orcid:0000-0001-8234-8745 (2019) Automated text simplification as a preprocessing step for machine translation into an under-resourced language. In: Recent Advances in Natural Language Processing (RANLP 2019), 2-4 Sept 2019, Varna, Bulgaria.
12. Building English-to-Serbian machine translation system for IMDb movie reviews
In: Way, Andy orcid:0000-0001-5736-5930, Lohar, Pintu and Popović, Maja orcid:0000-0001-8234-8745 (2019) Building English-to-Serbian machine translation system for IMDb movie reviews. In: Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing, 2 Aug 2019, Florence, Italy. ISBN 978-1-950737-41-3
13. On reducing translation shifts in translations intended for MT evaluation
In: Popović, Maja orcid:0000-0001-8234-8745 (2019) On reducing translation shifts in translations intended for MT evaluation. In: MT Summit XVII, 19-23 Aug 2019, Dublin, Ireland.
14. Are ambiguous conjunctions problematic for machine translation?
In: Popović, Maja orcid:0000-0001-8234-8745 and Castilho, Sheila orcid:0000-0002-8416-6555 (2019) Are ambiguous conjunctions problematic for machine translation? In: Recent Advances in Natural Language Processing (RANLP 2019), 2-4 Sept 2019, Varna, Bulgaria.
15. Automatic error classification with multiple error labels
In: Popović, Maja orcid:0000-0001-8234-8745 and Vilar, David (2019) Automatic error classification with multiple error labels. In: MT Summit XVII, 19-23 Aug 2019, Dublin, Ireland.
16. Combining SMT and NMT back-translated data for efficient NMT
In: Poncelas, Alberto orcid:0000-0002-5089-1687, Popović, Maja orcid:0000-0001-8234-8745, Shterionov, Dimitar orcid:0000-0001-6300-797X, Maillette de Buy Wenniger, Gideon and Way, Andy orcid:0000-0001-5736-5930 (2019) Combining SMT and NMT back-translated data for efficient NMT. In: Recent Advances in Natural Language Processing (RANLP 2019), 2-4 Sept 2019, Varna, Bulgaria.
17. A systematic comparison between SMT and NMT on translating user-generated content
In: Lohar, Pintu, Popović, Maja orcid:0000-0001-8234-8745, Alfi, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2019) A systematic comparison between SMT and NMT on translating user-generated content. In: 20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2019), 7-13 Apr 2019, La Rochelle, France.
Abstract:
Twitter has become an immensely popular platform where users can share information within a character limit (280 characters), which encourages them to write short and informal messages (tweets). In general, machine translation (MT) of tweets is a challenging task. However, for translating German tweets about football into English, it has been shown that moderate translation performance in terms of BLEU score can be achieved using phrase-based translation engines built on a tiny parallel Twitter data set [1]. In this work, we propose to further increase translation quality by using neural machine translation models and applying the following strategies: (i) we back-translate a set of out-of-domain English tweets released by “Harvard data set” in 2017 into German and add the synthetic parallel data to the tiny parallel data used in [1]; (ii) as tweets are generally short, we extract short text pairs from the large news-commentary parallel data and add them to the tiny Twitter parallel data set in order to restrict the length of the out-of-genre text segments. We build both phrase-based and neural MT systems (PBMT and NMT) using the above data combinations in order to perform a systematic comparison between the two approaches on translating tweets. Our experimental results reveal that the NMT system performs significantly worse than the PBMT system when only the tiny Twitter data set is used for MT training. In contrast, when additional data is used for training, the NMT system improves greatly and achieves BLEU scores very similar to those of the PBMT system, even with only a few hundred thousand additional synthetic parallel segments.
Keyword:
Machine translating
URL: http://doras.dcu.ie/23869/
18. Potential and limits of using post-edits as reference translations for MT evaluation
Popovic, Maja; Arcan, Mihael; Lommel, Arle. Vilnius University, University of Latvia, Latvia University of Agriculture, Institute of Mathematics and Informatics of University of Latvia, 2019
19. Language related issues for machine translation between closely related south Slavic languages
20. Poor man’s lemmatisation for automatic error classification