
Search in the Catalogues and Directories

Hits 1 – 8 of 8

1
Semantic Relatedness and Taxonomic Word Embeddings ...
BASE
2
English WordNet Taxonomic Random Walk Pseudo-Corpora
In: Conference papers (2020)
BASE
3
Language related issues for machine translation between closely related south Slavic languages
Arcan, Mihael; Klubicka, Filip; Popovic, Maja. - : The COLING 2016 Organizing Committee, 2019
BASE
4
Synthetic, Yet Natural: Properties of WordNet Random Walk Corpora and the impact of rare words on embedding performance
In: Conference papers (2019)
Abstract: Creating word embeddings that reflect semantic relationships encoded in lexical knowledge resources is an open challenge. One approach is to use a random walk over a knowledge graph to generate a pseudo-corpus and use this corpus to train embeddings. However, the effect of the shape of the knowledge graph on the generated pseudo-corpora, and on the resulting word embeddings, has not been studied. To explore this, we use English WordNet, constrained to the taxonomic (tree-like) portion of the graph, as a case study. We investigate the properties of the generated pseudo-corpora, and their impact on the resulting embeddings. We find that the distributions in the pseudo-corpora exhibit properties found in natural corpora, such as Zipf’s and Heaps’ law, and also observe that the proportion of rare words in a pseudo-corpus affects the performance of its embeddings on word similarity.
Keyword: Artificial Intelligence and Robotics; Computational Linguistics; corpus; evaluation; Numerical Analysis and Scientific Computing; random walk; representations; Software Engineering; taxonomy; word embeddings; word similarity; WordNet
URL: https://arrow.tudublin.ie/scschcomcon/271
https://arrow.tudublin.ie/cgi/viewcontent.cgi?article=1283&context=scschcomcon
BASE
5
Size Matters: The Impact of Training Size in Taxonomically-Enriched Word Embeddings
In: Articles (2019)
BASE
6
Quantitative Fine-grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian
In: Articles (2018)
BASE
7
Is it worth it? Budget-related evaluation metrics for model selection
In: Conference papers (2018)
BASE
8
hr500k – A Reference Training Corpus of Croatian.
In: Conference papers (2018)
BASE

© 2013 - 2024 Lin|gu|is|tik | Imprint | Privacy Policy | Change privacy settings