Source: BASE

2. Data for paper: "Evaluating Resource-Lean Cross-Lingual Embedding Models in Unsupervised Retrieval"
3. Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems
4. On Cross-Lingual Retrieval with Multilingual Text Encoders
5. Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval

Abstract: Pretrained multilingual text encoders based on neural Transformer architectures, such as multilingual BERT (mBERT) and XLM, have achieved strong performance on a myriad of language understanding tasks. Consequently, they have been adopted as a go-to paradigm for multilingual and cross-lingual representation learning and transfer, rendering cross-lingual word embeddings (CLWEs) effectively obsolete. However, questions remain as to what extent this finding generalizes 1) to unsupervised settings and 2) to ad-hoc cross-lingual IR (CLIR) tasks. Therefore, in this work we present a systematic empirical study focused on the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a large number of language pairs. In contrast to supervised language understanding, our results indicate that for unsupervised document-level CLIR -- a setup with no relevance judgments for IR-specific fine-tuning -- pretrained encoders fail to significantly outperform models ... (Comment: accepted at ECIR'21, preprint)

Keywords: Computation and Language (cs.CL); Information Retrieval (cs.IR); FOS: Computer and information sciences; H.3.3; I.2.7

URL: https://dx.doi.org/10.48550/arxiv.2101.08370 | https://arxiv.org/abs/2101.08370
6. AraWEAT: Multidimensional Analysis of Biases in Arabic Word Embeddings
7. XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
8. On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation
9. Orthogonal Language and Task Adapters in Zero-Shot Cross-Lingual Transfer
10. From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers
11. Probing Pretrained Language Models for Lexical Semantics
12. Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity
13. Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?
14. Informing unsupervised pretraining with external linguistic knowledge
15. Unsupervised Cross-Lingual Information Retrieval using Monolingual Data Only
16. Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources
17. A Resource-Light Method for Cross-Lingual Semantic Textual Similarity