1. Utilising knowledge graph embeddings for data-to-text generation
BASE
3. Enhancing multiple-choice question answering with causal knowledge
4. NUIG-DSI at the WebNLG+ challenge: Leveraging transfer learning for RDF-to-text generation
5. Bilingual lexicon induction across orthographically-distinct under-resourced Dravidian languages
In: Chakravarthi, Bharathi Raja (orcid:0000-0002-4575-7934), Rajasekaran, Navaneethan, Arcan, Mihael (orcid:0000-0002-3116-621X), McGuinness, Kevin (orcid:0000-0003-1336-6477), O'Connor, Noel E. (orcid:0000-0002-4033-9135) and McCrae, John P. (orcid:0000-0002-7227-1331) (2020) Bilingual lexicon induction across orthographically-distinct under-resourced Dravidian languages. In: 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 13 Dec 2020, Barcelona, Spain (Online).
6. Leveraging orthographic information to improve machine translation of under-resourced languages
Abstract:
This thesis describes our improvements to word sense translation for under-resourced languages using orthographic information, with a particular focus on creating resources through machine translation. The first goal is to clean noisy corpora by removing code-mixed content at the word level, based on orthographic information, in order to improve machine translation quality. Our results indicate that the proposed removal of code-mixed text based on orthography improves translation quality for Dravidian languages. We then turn to the use of training data from closely related languages. While languages within the same language family share many properties, many under-resourced languages are written in their own native script, which makes it difficult to exploit these similarities. We propose to alleviate the problem of differing scripts by transcribing the native script into a common representation such as the Latin script or the International Phonetic Alphabet (IPA). We also show that our method could aid the creation or improvement of wordnets for under-resourced languages using machine translation. Further, we investigate bilingual lexicon induction using pre-trained monolingual word embeddings and orthographic information. We use existing resources such as IndoWordNet entries as a seed dictionary and test set for the under-resourced Dravidian languages. To take advantage of orthographic information, we propose to bring the related languages into a single script before creating word embeddings, and to use the longest common subsequence to exploit cognate information. Our methods for under-resourced word sense translation of Dravidian languages outperformed state-of-the-art systems in both automatic and manual evaluation.
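The longest-common-subsequence step mentioned in the abstract can be pictured with the sketch below. It is an illustration only, not the thesis implementation: the function names `lcs_length` and `lcs_ratio`, the normalisation by the longer word, and the sample strings are all assumptions, and the inputs are assumed to have already been transliterated into a single script.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the subsequence found for the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the best result seen so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_ratio(a: str, b: str) -> float:
    """LCS length divided by the longer word's length (hypothetical scoring).

    Returns 0.0 for an empty input and 1.0 for identical non-empty strings.
    """
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    # Made-up Latin-script strings standing in for transliterated word pairs.
    print(lcs_ratio("talai", "tala"))
    print(lcs_ratio("talai", "mugam"))
```

A score of this kind could be combined with embedding similarity when ranking candidate translations; how the thesis actually combines the two signals is not stated in the abstract.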
Keywords:
Data science; Dravidian Languages; Engineering and Informatics; Machine Translation; Orthography; Tamil; Under-resourced Languages
URL: http://hdl.handle.net/10379/16100
7. NUIG at TIAD: Combining unsupervised NLP and graph metrics for translation inference
8. Aspects of Terminological and Named Entity Knowledge within Rule-Based Machine Translation Models for Under-Resourced Neural Machine Translation Scenarios ...
9. Bilingual Lexicon Induction across Orthographically-distinct Under-Resourced Dravidian Languages ...
10. Bilingual Lexicon Induction across Orthographically-distinct Under-Resourced Dravidian Languages ...
11. Comparison of Different Orthographies for Machine Translation of Under-Resourced Dravidian Languages
12. TIAD 2019 Shared Task: Leveraging knowledge graphs with neural machine translation for automatic multilingual dictionary generation
13. The ESSOT system goes wild: an easy way for translating ontologies
17. Generating linked-data based domain-specific sentiment lexicons from legacy language and semantic resources
18. Automatic enrichment of terminological resources: the IATE RDF example
19. Avtomatsko pridobivanje besednih zvez iz korpusa z uporabo leksikona SSJ [Automatic extraction of phrases from a corpus using the SSJ lexicon]
20. Inferring translation candidates for multilingual dictionary generation with multi-way neural machine translation