1 |
Deep interactive text prediction and quality estimation in translation interfaces
|
|
|
|
In: Hokamp, Christopher M. (2018) Deep interactive text prediction and quality estimation in translation interfaces. PhD thesis, Dublin City University. (2018)
|
|
Abstract:
The output of automatic translation systems is usually destined for human consumption. In most cases, translators use machine translation (MT) as the first step in the process of creating a fluent translation in a target language given a text in a source language. However, there are many possible ways for translators to interact with MT. The goal of this thesis is to investigate new interactive designs and interfaces for translation. In the first part of the thesis, we present pilot studies which investigate aspects of the interactive translation process, building upon insights from Human-Computer Interaction (HCI) and Translation Studies. We developed HandyCAT, an open-source platform for translation process research, which was used to conduct two user studies: an investigation into interactive machine translation and evaluation of a novel component for post-editing. We then propose new models for quality estimation (QE) of MT, and new models for estimating the confidence of prefix-based neural interactive MT (IMT) systems. We present a series of experiments using neural sequence models for QE and IMT. We focus upon token-level QE models, which can be used as standalone components or integrated into post-editing pipelines, guiding users in selecting phrases to edit. We introduce a strong recurrent baseline for neural QE, and show how state-of-the-art automatic post-editing (APE) models can be re-purposed for word-level QE. We also propose an auxiliary confidence model, which can be attached to (I)-MT systems to use the model’s internal state to estimate confidence about the model’s predictions. The third part of the thesis introduces lexically constrained decoding using grid beam search (GBS), a means of expanding prefix-based interactive translation to general lexical constraints.
By integrating lexically constrained decoding with word-level QE, we then suggest a novel interactive design for translation interfaces, and test our hypotheses using simulated editing. The final section focuses upon designing an interface for interactive post-editing, incorporating both GBS and QE. We design components which introduce a new way of interacting with translation models, and test these components in a user study.
|
|
Keywords:
Computational linguistics; Machine learning; Machine translating
|
|
URL: http://doras.dcu.ie/22664/
|
|
BASE
|
|
2 |
Statistical post-editing and quality estimation for machine translation systems
|
|
|
|
In: Béchara, Hanna (2014) Statistical post-editing and quality estimation for machine translation systems. Master of Science thesis, Dublin City University. (2014)
|
|
BASE
|
|
3 |
Predicting sentence translation quality using extrinsic and language independent features
|
|
|
|
In: Bicici, Ergun, Groves, Declan and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) Predicting sentence translation quality using extrinsic and language independent features. Machine Translation, 27 (3-4). pp. 171-192. ISSN 0922-6567 (2013)
|
|
BASE
|
|
4 |
Working with a small dataset - semi-supervised dependency parsing for Irish
|
|
|
|
In: Lynn, Teresa, Foster, Jennifer orcid:0000-0002-7789-4853, Dras, Mark orcid:0000-0001-9908-7182 and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) Working with a small dataset - semi-supervised dependency parsing for Irish. In: Fourth Workshop on Statistical Parsing of Morphologically Rich Languages, 18 Oct 2013, Seattle, WA, USA. (2013)
|
|
BASE
|
|
5 |
Computer assisted (language) learning (CA(L)L) for the inclusive classroom
|
|
Greene, Cara N.. - : Dublin City University. Centre for Next Generation Localisation (CNGL), 2013. : Dublin City University. National Centre for Language Technology (NCLT), 2013. : Dublin City University. School of Computing, 2013
|
|
In: Greene, Cara N. (2013) Computer assisted (language) learning (CA(L)L) for the inclusive classroom. PhD thesis, Dublin City University. (2013)
|
|
BASE
|
|
6 |
Domain adaptation for statistical machine translation of corporate and user-generated content
|
|
|
|
In: Banerjee, Pratyush (2013) Domain adaptation for statistical machine translation of corporate and user-generated content. PhD thesis, Dublin City University. (2013)
|
|
BASE
|
|
7 |
CNGL: Grading student answers by acts of translation
|
|
|
|
In: Bicici, Ergun orcid:0000-0002-2293-2031 and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) CNGL: Grading student answers by acts of translation. In: SEMEVAL, 14-15 Jun 2013, Atlanta, Georgia. (2013)
|
|
BASE
|
|
8 |
Definition of interfaces
|
|
|
|
In: Almaghout, Hala, Bicici, Ergun, Doherty, Stephen orcid:0000-0003-0887-1049, Gaspari, Federico, Groves, Declan, Toral, Antonio orcid:0000-0003-2357-2960, van Genabith, Josef orcid:0000-0003-1322-7944, Popović, Maja orcid:0000-0001-8234-8745 and Piperidis, Stelios (2013) Definition of interfaces. Project Report. QTLaunchPad. (2013)
|
|
BASE
|
|
9 |
Quality metrics for human and machine translation.
|
|
|
|
In: Doherty, Stephen orcid:0000-0003-4864-5986, Gaspari, Federico, Groves, Declan, Srivastava, Ankit Kumar and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) Quality metrics for human and machine translation. Project Report. UNSPECIFIED. (2013)
|
|
BASE
|
|
10 |
CNGL-CORE: Referential translation machines for measuring semantic similarity
|
|
|
|
In: Bicici, Ergun orcid:0000-0002-2293-2031 and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) CNGL-CORE: Referential translation machines for measuring semantic similarity. In: *SEM, 13-14 Jun 2013, Atlanta, Georgia. (2013)
|
|
BASE
|
|
11 |
Detecting grammatical errors with treebank-induced, probabilistic parsers
|
|
|
|
In: Wagner, Joachim orcid:0000-0002-8290-3849 (2012) Detecting grammatical errors with treebank-induced, probabilistic parsers. PhD thesis, Dublin City University. (2012)
|
|
BASE
|
|
12 |
Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification
|
|
|
|
In: Tu, Zhaopeng, He, Yifan, Foster, Jennifer orcid:0000-0002-7789-4853, van Genabith, Josef orcid:0000-0003-1322-7944, Liu, Qun and Shouxun, Lin (2012) Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification. In: Annual Meeting of the Association for Computational Linguistics (ACL 2012), 9-11 Jul 2012, Jeju, Korea. (2012)
|
|
BASE
|
|
13 |
Irish treebanking and parsing: a preliminary evaluation
|
|
|
|
In: Lynn, Teresa, Cetinoglu, Ozlem, Foster, Jennifer orcid:0000-0002-7789-4853, Uí Dhonnchadha, Elaine orcid:0000-0003-3448-4288, Dras, Mark orcid:0000-0001-9908-7182 and van Genabith, Josef orcid:0000-0003-1322-7944 (2012) Irish treebanking and parsing: a preliminary evaluation. In: International Conference on Linguistic Resources and Evaluation, 21-27 May 2012, Istanbul, Turkey. (2012)
|
|
BASE
|
|
14 |
Treebank-Based Deep Grammar Acquisition for French Probabilistic Parsing Resources
|
|
Schluter, Natalie. - : Dublin City University. National Centre for Language Technology (NCLT), 2011. : Dublin City University. School of Computing, 2011
|
|
In: Schluter, Natalie (2011) Treebank-Based Deep Grammar Acquisition for French Probabilistic Parsing Resources. PhD thesis, Dublin City University. (2011)
|
|
BASE
|
|
15 |
Comparing the use of edited and unedited text in parser self-training
|
|
|
|
In: Foster, Jennifer orcid:0000-0002-7789-4853, Cetinoglu, Ozlem, Wagner, Joachim orcid:0000-0002-8290-3849 and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) Comparing the use of edited and unedited text in parser self-training. In: The 12th International Conference on Parsing Technologies (IWPT 2011), 05-07 Oct 2011, Dublin, Ireland. ISBN 978-1-932432-04-6 (2011)
|
|
BASE
|
|
16 |
From news to comment: Resources and benchmarks for parsing the language of web 2.0
|
|
|
|
In: Foster, Jennifer orcid:0000-0002-7789-4853, Cetinoglu, Ozlem, Wagner, Joachim orcid:0000-0002-8290-3849, Le Roux, Joseph, Nivre, Joakim, Hogan, Deirdre and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) From news to comment: Resources and benchmarks for parsing the language of web 2.0. In: The 5th International Joint Conference on Natural Language Processing (IJCNLP), 08-13 Nov 2011, Chiang Mai, Thailand. ISBN 978-974-466-564-5 (2011)
|
|
BASE
|
|
17 |
#hardtoparse: POS tagging and parsing the twitterverse
|
|
|
|
In: Foster, Jennifer orcid:0000-0002-7789-4853, Cetinoglu, Ozlem, Wagner, Joachim orcid:0000-0002-8290-3849, Le Roux, Joseph, Hogan, Stephen, Nivre, Joakim, Hogan, Deirdre and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) #hardtoparse: POS tagging and parsing the twitterverse. In: The AAAI-11 Workshop on Analyzing Microtext, 8 Aug 2011, San Francisco, CA. (2011)
|
|
BASE
|
|
18 |
The integration of machine translation and translation memory
|
|
He, Yifan. - : Dublin City University. Centre for Next Generation Localisation (CNGL), 2011. : Dublin City University. School of Computing, 2011
|
|
In: He, Yifan (2011) The integration of machine translation and translation memory. PhD thesis, Dublin City University. (2011)
|
|
BASE
|
|
19 |
Treebank-based automatic acquisition of wide coverage, deep linguistic resources for Japanese
|
|
Oya, Masanori. - : Dublin City University. National Centre for Language Technology (NCLT), 2010. : Dublin City University. School of Computing, 2010
|
|
In: Oya, Masanori (2010) Treebank-based automatic acquisition of wide coverage, deep linguistic resources for Japanese. Master of Science thesis, Dublin City University. (2010)
|
|
BASE
|
|
20 |
Hard constraints for grammatical function labelling
|
|
|
|
In: Seeker, Wolfgang, Rehbein, Ines, Kuhn, Jonas and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) Hard constraints for grammatical function labelling. In: ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, 11-16 July 2010, Uppsala, Sweden. (2010)
|
|
BASE
|
|