1 |
Shapley Idioms: Analysing BERT Sentence Embeddings for General Idiom Token Identification
In: Front Artif Intell (2022)
BASE
3 |
English WordNet Taxonomic Random Walk Pseudo-Corpora
In: Conference papers (2020)
BASE
4 |
Language related issues for machine translation between closely related south Slavic languages
BASE
5 |
Synthetic, Yet Natural: Properties of WordNet Random Walk Corpora and the impact of rare words on embedding performance
In: Conference papers (2019)
BASE
6 |
Size Matters: The Impact of Training Size in Taxonomically-Enriched Word Embeddings
In: Articles (2019)
BASE
8 |
Quantitative Fine-Grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian ...
BASE
9 |
Is it worth it? Budget-related evaluation metrics for model selection ...
BASE
10 |
Quantitative Fine-grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian
In: Articles (2018)
BASE
11 |
Is it worth it? Budget-related evaluation metrics for model selection
In: Conference papers (2018)
BASE
12 |
hr500k – A Reference Training Corpus of Croatian.
In: Conference papers (2018)
BASE
13 |
Croatian Twitter training corpus ReLDI-NormTag-hr 1.1
Abstract:
ReLDI-NormTag-hr 1.1 is a manually annotated corpus of Croatian tweets. It is meant as a gold-standard training and testing dataset for tokenisation, sentence segmentation, word normalisation, morphosyntactic tagging and lemmatisation of non-standard Croatian. Each tweet is also annotated for its automatically assigned standardness levels (T = technical standardness, L = linguistic standardness). As an update to version 1.0, 1.1 corrects some minor errors. The corpus construction is (partially) described in: MILIČEVIĆ, Maja, LJUBEŠIĆ, Nikola. Tviterasi, tviteraši or twitteraši? Producing and analysing a normalised dataset of Croatian and Serbian tweets. Slovenščina 2.0: empirical, applied and interdisciplinary research, 4/2, 2016. ISSN 2335-2736. http://dx.doi.org/10.4312/slo2.0.2016.2.156-188
Keyword:
computer-mediated communication; lemmatisation; manual annotation; tagging; TEI; tokenisation; word normalisation
URL: http://hdl.handle.net/11356/1121
BASE
17 |
Fine-grained human evaluation of neural versus phrase-based machine translation ...
BASE
18 |
Fine-Grained Human Evaluation of Neural Versus Phrase-Based Machine Translation
In: Prague Bulletin of Mathematical Linguistics, Vol 108, Iss 1, Pp 121-132 (2017)
BASE