1 |
How much context span is enough? Examining context-related issues for document-level MT
|
|
|
|
In: Castilho, Sheila orcid:0000-0002-8416-6555 (2022) How much context span is enough? Examining context-related issues for document-level MT. In: 13th Language Resources and Evaluation Conference, 21-23 June 2022, Marseille, France. (In Press) (2022)
|
|
Abstract:
This paper analyses how much context span is necessary to solve different context-related issues, namely, reference, ellipsis, gender, number, lexical ambiguity, and terminology when translating from English into Portuguese. We use the DELA corpus, which consists of 60 documents and six different domains (subtitles, literary, news, reviews, medical, and legislation). We find that the shortest context span to disambiguate issues can appear in different positions in the document including preceding, following, global, world knowledge; and that the average length depends on the issue types as well as the domain. Additionally, we show that the standard approach of relying on only two preceding sentences as context might not be enough depending on the domain and issue types.
|
|
Keyword:
Computational linguistics; context span; document-level; Language; Linguistics; Machine translating; Translating and interpreting
|
|
URL: http://doras.dcu.ie/27009/
|
|
BASE
|
|
2 |
An investigation into multi-word expressions in machine translation
|
|
Han, Lifeng. - : Dublin City University. School of Computing, 2022. : Dublin City University. ADAPT, 2022
|
|
In: Han, Lifeng orcid:0000-0002-3221-2185 (2022) An investigation into multi-word expressions in machine translation. PhD thesis, Dublin City University. (2022)
|
|
BASE
|
|
3 |
An investigation of English-Irish machine translation and associated resources
|
|
Dowling, Meghan. - : Dublin City University. School of Computing, 2022. : Dublin City University. ADAPT, 2022
|
|
In: Dowling, Meghan orcid:0000-0003-1637-4923 (2022) An investigation of English-Irish machine translation and associated resources. PhD thesis, Dublin City University. (2022)
|
|
BASE
|
|
4 |
The role of machine translation in translation education: A thematic analysis of translator educators’ beliefs
|
|
|
|
In: Translation and Interpreting : the International Journal of Translation and Interpreting Research, Vol 14, Iss 1, Pp 177-197 (2022)
|
|
BASE
|
|
5 |
DELA Corpus - A Document-Level Corpus Annotated with Context-Related Issues
|
|
|
|
In: Castilho, Sheila orcid:0000-0002-8416-6555 , Cavalheiro Camargo, João Lucas orcid:0000-0003-3746-1225 , Menezes, Miguel and Way, Andy orcid:0000-0001-5736-5930 (2021) DELA Corpus - A Document-Level Corpus Annotated with Context-Related Issues. In: Sixth Conference on Machine Translation (WMT21), 10-11 Nov 2021, Punta Cana, Dominican Republic (Online). ISBN 978-1-954085-94-7 (2021)
|
|
BASE
|
|
6 |
Can Google Translate Rewire Your L2 English Processing?
|
|
|
|
In: Resende, Natália orcid:0000-0002-5248-2457 and Way, Andy orcid:0000-0001-5736-5930 (2021) Can Google Translate Rewire Your L2 English Processing? Digital, 1 (1). pp. 66-85. ISSN 2673-6470 (2021)
|
|
BASE
|
|
7 |
Chinese character decomposition for neural MT with multi-word expressions
|
|
|
|
In: Han, Lifeng orcid:0000-0002-3221-2185 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 , Smeaton, Alan F. orcid:0000-0003-1028-8389 and Bolzoni, Paolo (2021) Chinese character decomposition for neural MT with multi-word expressions. In: 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021), 31 May - 2 June 2021, Reykjavik, Iceland (Online). (In Press) (2021)
|
|
BASE
|
|
8 |
Translation quality assessment: a brief survey on manual and automatic methods
|
|
|
|
In: Han, Lifeng orcid:0000-0002-3221-2185 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2021) Translation quality assessment: a brief survey on manual and automatic methods. In: MoTra21: Workshop on Modelling Translation: Translatology in the Digital Age, 31 May - 2 June 2021, Reykjavik, Iceland (Online). (In Press) (2021)
|
|
BASE
|
|
9 |
Towards document-level human MT evaluation: On the Issues of annotator agreement, effort and misevaluation
|
|
|
|
In: Castilho, Sheila orcid:0000-0002-8416-6555 (2021) Towards document-level human MT evaluation: On the Issues of annotator agreement, effort and misevaluation. In: 16th Conference of the European Chapter of the Association for Computational Linguistics - EACL 2021, 19-23 April 2021, Online. (In Press) (2021)
|
|
BASE
|
|
10 |
Meta-evaluation of machine translation evaluation methods
|
|
|
|
In: Han, Lifeng orcid:0000-0002-3221-2185 (2021) Meta-evaluation of machine translation evaluation methods. In: Workshop on Informetric and Scientometric Research (SIG-MET), 23-24 Oct 2021, Online. (2021)
|
|
BASE
|
|
11 |
Unsupervised Morphological Segmentation and Part-of-Speech Tagging for Low-Resource Scenarios
|
|
|
|
BASE
|
|
14 |
Unsupervised Morphological Segmentation and Part-of-Speech Tagging for Low-Resource Scenarios ...
|
|
|
|
BASE
|
|
15 |
Innovations in machine learning: a case study of the Fabricius Workbench
|
|
Kelly, Bree. - : Sydney, Australia : Macquarie University, 2021
|
|
BASE
|
|
16 |
Meaning and translation : theory and practice of machine translation as exemplified by applicative and cognitive grammars
|
|
|
|
BASE
|
|
17 |
Machine Translation: Linguistic challenges that arise in the translation of journalistic texts ; Traducción Automática: Retos Lingüisticos que se Presentan en la Traducción de Textos Periodísticos
|
|
|
|
BASE
|
|
18 |
Real-time New Zealand sign language translator using convolution neural network
|
|
|
|
BASE
|
|
19 |
Modelling source- and target-language syntactic Information as conditional context in interactive neural machine translation
|
|
|
|
In: Gupta, Kamal Kumar, Haque, Rejwanul orcid:0000-0003-1680-0099 , Ekbal, Asif, Bhattacharyya, Pushpak and Way, Andy orcid:0000-0001-5736-5930 (2020) Modelling source- and target-language syntactic Information as conditional context in interactive neural machine translation. In: Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 2-6 Nov 2020, Lisboa, Portugal. (2020)
|
|
BASE
|
|
20 |
AlphaMWE: construction of multilingual parallel corpora with MWE annotations
|
|
|
|
In: Han, Lifeng orcid:0000-0002-3221-2185 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2020) AlphaMWE: construction of multilingual parallel corpora with MWE annotations. In: Joint Workshop on Multiword Expressions and Electronic Lexicons (MWE-LEX 2020), 13 Dec 2020, Barcelona, Spain (Online). (2020)
|
|
BASE
|
|