
Search in the Catalogues and Directories

Hits 1 – 9 of 9

1
RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models ...
BASE
2
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models ...
Read paper: https://www.aclanthology.org/2021.acl-long.243
Abstract: In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance. We study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual downstream tasks. We first aim to establish, via fair and controlled comparisons, if a gap between the multilingual and the corresponding monolingual representation of that language exists, and subsequently investigate the reason for any performance difference. To disentangle conflating factors, we train new monolingual models on the same data, with monolingually and multilingually trained tokenizers. We find that while the pretraining data size is an important factor, a designated monolingual tokenizer plays an equally important role in the downstream performance. Our results show that ...
Keyword: Computational Linguistics; Condensed Matter Physics; Deep Learning; Electromagnetism; FOS Physical sciences; Information and Knowledge Engineering; Neural Network; Semantics
URL: https://underline.io/lecture/25575-how-good-is-your-tokenizerquestion-on-the-monolingual-performance-of-multilingual-language-models
https://dx.doi.org/10.48448/bdjq-kw21
BASE
3
Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking ...
BASE
4
LexFit: Lexical Fine-Tuning of Pretrained Language Models ...
BASE
5
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters ...
BASE
6
Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity
In: Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2020, 46 (4), pp. 847-897 ; ISSN: 0891-2017 ; EISSN: 1530-9312 ; https://hal.archives-ouvertes.fr/hal-02975786 ; https://direct.mit.edu/coli/article/46/4/847/97326/Multi-SimLex-A-Large-Scale-Evaluation-of
BASE
7
A deep learning approach to bilingual lexicon induction in the biomedical domain. ...
Heyman, Geert; Vulić, Ivan; Moens, Marie-Francine. - : Apollo - University of Cambridge Repository, 2018
BASE
8
A deep learning approach to bilingual lexicon induction in the biomedical domain.
Heyman, Geert; Vulić, Ivan; Moens, Marie-Francine. - : Springer Science and Business Media LLC, 2018. : BMC Bioinformatics, 2018
BASE
9
Bio-SimVerb and Bio-SimLex: wide-coverage evaluation sets of word similarity in biomedicine.
Chiu, Billy; Pyysalo, Sampo; Vulić, Ivan. - : BioMed Central, 2018. : BMC bioinformatics, 2018
BASE

Open access documents: 9