2 |
Hands-on natural language processing with PyTorch 1.x : build smart, AI-driven linguistic applications using deep learning and NLP techniques
|
|
|
|
BLLDB
|
|
UB Frankfurt Linguistik
|
|
|
|
9 |
Determining Tone of a Body of Text
|
|
|
|
In: Senior Projects Spring 2020 (2020)
|
|
BASE
|
|
|
|
11 |
Modelling source- and target-language syntactic Information as conditional context in interactive neural machine translation
|
|
|
|
In: Gupta, Kamal Kumar, Haque, Rejwanul orcid:0000-0003-1680-0099 , Ekbal, Asif, Bhattacharyya, Pushpak and Way, Andy orcid:0000-0001-5736-5930 (2020) Modelling source- and target-language syntactic Information as conditional context in interactive neural machine translation. In: Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 2-6 Nov 2020, Lisboa, Portugal. (2020)
|
|
BASE
|
|
|
|
12 |
AlphaMWE: construction of multilingual parallel corpora with MWE annotations
|
|
|
|
In: Han, Lifeng orcid:0000-0002-3221-2185 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2020) AlphaMWE: construction of multilingual parallel corpora with MWE annotations. In: Joint Workshop on Multiword Expressions and Electronic Lexicons (MWE-LEX 2020), 13 Dec 2020, Barcelona, Spain (Online). (2020)
|
|
BASE
|
|
|
|
13 |
DIACR-Ita @ EVALITA2020: overview of the EVALITA2020 Diachronic Lexical Semantics (DIACR-Ita) task
|
|
|
|
In: Basile, Pierpaolo, Caputo, Annalina orcid:0000-0002-7144-8545 , Caselli, Tommaso orcid:0000-0003-2936-0256 , Cassotti, Pierluigi and Varvara, Rossella orcid:0000-0001-9957-2807 (2020) DIACR-Ita @ EVALITA2020: overview of the EVALITA2020 Diachronic Lexical Semantics (DIACR-Ita) task. In: Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, 17 Dec 2020, Online. (2020)
|
|
BASE
|
|
|
|
14 |
On the differences between human translations
|
|
|
|
In: Popović, Maja orcid:0000-0001-8234-8745 (2020) On the differences between human translations. In: 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020), 3-5 Nov 2020, Lisbon, Portugal (Online). (2020)
|
|
BASE
|
|
|
|
15 |
A diachronic Italian corpus based on “L’Unità”
|
|
|
|
In: Basile, Pierpaolo, Caputo, Annalina orcid:0000-0002-7144-8545 , Caselli, Tommaso orcid:0000-0003-2936-0256 , Cassotti, Pierluigi and Varvara, Rossella orcid:0000-0001-9957-2807 (2020) A diachronic Italian corpus based on “L’Unità”. In: Seventh Italian Conference on Computational Linguistics, 1-3 Mar 2021, Bologna (Online). (2020)
|
|
BASE
|
|
|
|
16 |
Neural machine translation between similar south-Slavic languages
|
|
|
|
In: Popović, Maja orcid:0000-0001-8234-8745 and Poncelas, Alberto orcid:0000-0002-5089-1687 (2020) Neural machine translation between similar south-Slavic languages. In: 2020 Fifth Conference on Machine Translation (WMT20), 19-20 Nov 2020, Dominican Republic (Online). (2020)
|
|
BASE
|
|
|
|
17 |
Annotating verbal MWEs in Irish for the PARSEME Shared Task 1.2
|
|
|
|
In: Walsh, Abigail, Lynn, Teresa and Foster, Jennifer orcid:0000-0002-7789-4853 (2020) Annotating verbal MWEs in Irish for the PARSEME Shared Task 1.2. In: Joint Workshop on Multiword Expressions and Electronic Lexicons, 13 Dec 2020, Barcelona, Spain (Online). (2020)
|
|
BASE
|
|
|
|
18 |
GM-CTSC at SemEval-2020 Task 1: Gaussian mixtures cross temporal similarity clustering
|
|
|
|
In: Cassotti, Pierluigi, Caputo, Annalina orcid:0000-0002-7144-8545 , Polignano, Marco orcid:0000-0002-3939-0136 and Basile, Pierpaolo orcid:0000-0002-0545-1105 (2020) GM-CTSC at SemEval-2020 Task 1: Gaussian mixtures cross temporal similarity clustering. In: Fourteenth Workshop on Semantic Evaluation, Dec 2020, Barcelona (Online). (2020)
|
|
BASE
|
|
|
|
19 |
Syntax-informed interactive neural machine translation
|
|
|
|
In: Gupta, Kamal Kumar, Haque, Rejwanul orcid:0000-0003-1680-0099 , Ekbal, Asif, Bhattacharyya, Pushpak and Way, Andy orcid:0000-0001-5736-5930 (2020) Syntax-informed interactive neural machine translation. In: The International Joint Conference on Neural Networks (IJCNN), 19-24 July 2020, Glasgow, UK (Online). (2020)
|
|
BASE
|
|
|
|
20 |
Bilingual lexicon induction across orthographically-distinct under-resourced Dravidian languages
|
|
|
|
In: Chakravarthi, Bharathi Raja orcid:0000-0002-4575-7934 , Rajasekaran, Navaneethan, Arcan, Mihael orcid:0000-0002-3116-621X , McGuinness, Kevin orcid:0000-0003-1336-6477 , O'Connor, Noel E. orcid:0000-0002-4033-9135 and McCrae, John P. orcid:0000-0002-7227-1331 (2020) Bilingual lexicon induction across orthographically-distinct under-resourced Dravidian languages. In: 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 13 Dec 2020, Barcelona, Spain (Online). (2020)
|
|
Abstract:
Bilingual lexicons are a vital tool for under-resourced languages and recent state-of-the-art approaches to this leverage pretrained monolingual word embeddings using supervised or semi-supervised approaches. However, these approaches require cross-lingual information such as seed dictionaries to train the model and find a linear transformation between the word embedding spaces. Especially in the case of low-resourced languages, seed dictionaries are not readily available, and as such, these methods produce extremely weak results on these languages. In this work, we focus on the Dravidian languages, namely Tamil, Telugu, Kannada, and Malayalam, which are even more challenging as they are written in unique scripts. To take advantage of orthographic information and cognates in these languages, we bring the related languages into a single script. Previous approaches have used linguistically sub-optimal measures such as the Levenshtein edit distance to detect cognates, whereby we demonstrate that the longest common sub-sequence is linguistically more sound and improves the performance of bilingual lexicon induction. We show that our approach can increase the accuracy of bilingual lexicon induction methods on these languages many times, making bilingual lexicon induction approaches feasible for such under-resourced languages.
|
|
Keyword:
Computational linguistics; Information retrieval; Machine translating
|
|
URL: http://doras.dcu.ie/25223/
|
|
BASE
|
|
|
|
|
|