1 |
An investigation of English-Irish machine translation and associated resources
Dowling, Meghan. Dublin City University, School of Computing, 2022; Dublin City University, ADAPT, 2022
In: Dowling, Meghan orcid:0000-0003-1637-4923 (2022) An investigation of English-Irish machine translation and associated resources. PhD thesis, Dublin City University. (2022)
BASE
7 |
gaBERT -- an Irish Language Model ...
Abstract:
The BERT family of neural language models has become highly popular due to its ability to provide sequences of text with rich, context-sensitive token encodings that generalise well to many Natural Language Processing tasks. Over 120 monolingual BERT models covering over 50 languages have been released, as well as a multilingual model trained on 104 languages. We introduce gaBERT, a monolingual BERT model for the Irish language. We compare our gaBERT model to multilingual BERT and show that gaBERT provides better representations for a downstream parsing task. We also show how different filtering criteria, vocabulary size and the choice of subword tokenisation model affect downstream performance. We release gaBERT and related code to the community. ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2107.12930 https://arxiv.org/abs/2107.12930
10 |
Annotating verbal MWEs in Irish for the PARSEME Shared Task 1.2
In: Walsh, Abigail, Lynn, Teresa and Foster, Jennifer orcid:0000-0002-7789-4853 (2020) Annotating verbal MWEs in Irish for the PARSEME Shared Task 1.2. In: Joint Workshop on Multiword Expressions and Electronic Lexicons, 13 Dec 2020, Barcelona, Spain (Online). (2020)
11 |
A human evaluation of English-Irish statistical and neural machine translation
In: Dowling, Meghan orcid:0000-0003-1637-4923, Castilho, Sheila orcid:0000-0002-8416-6555, Moorkens, Joss orcid:0000-0003-4864-5986, Lynn, Teresa and Way, Andy orcid:0000-0001-5736-5930 (2020) A human evaluation of English-Irish statistical and neural machine translation. In: Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 6 Nov 2020, Lisbon, Portugal. (2020)
12 |
Treebanking user-generated content: a proposal for a unified representation in universal dependencies
In: Sanguinetti, Manuela orcid:0000-0002-0147-2208, Bosco, Cristina, Cassidy, Lauren, Çetinoglu, Özlem, Cignarella, Alessandra Teresa orcid:0000-0002-4409-6679, Lynn, Teresa, Rehbein, Ines, Ruppenhofer, Josef, Seddah, Djamé and Zeldes, Amir orcid:0000-0001-8016-6753 (2020) Treebanking user-generated content: a proposal for a unified representation in universal dependencies. In: 12th Language Resources and Evaluation Conference (LREC 2020), 11-16 May 2020, Marseille, France. (2020)
13 |
Treebanking user-generated content: a proposal for a unified representation in universal dependencies
In: Sanguinetti, Manuela orcid:0000-0002-0147-2208, Bosco, Cristina, Cassidy, Lauren, Çetinoglu, Özlem, Cignarella, Alessandra Teresa orcid:0000-0002-4409-6679, Lynn, Teresa, Rehbein, Ines, Ruppenhofer, Josef, Seddah, Djamé and Zeldes, Amir orcid:0000-0001-8016-6753 (2020) Treebanking user-generated content: a proposal for a unified representation in universal dependencies. In: 12th Language Resources and Evaluation Conference (LREC 2020), 11-16 May 2020, Marseille, France (Virtual). (2020)
15 |
Annotated corpora and tools of the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2)
18 |
Annotating Verbal MWEs in Irish for the PARSEME Shared Task 1.2 ...
19 |
Treebanking User-Generated Content: a UD Based Overview of Guidelines, Corpora and Unified Recommendations ...
20 |
Treebanking user-generated content: A proposal for a unified representation in universal dependencies