1 | MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models

Abstract:
Recent work has indicated that pretrained language models (PLMs) such as BERT and RoBERTa can be transformed into effective sentence and word encoders even via simple self-supervised techniques. Inspired by this line of work, in this paper we propose a fully unsupervised approach to improving word-in-context (WiC) representations in PLMs, achieved via a simple and efficient WiC-targeted fine-tuning procedure: MirrorWiC. The proposed method leverages only raw texts sampled from Wikipedia, assumes no sense-annotated data, and learns context-aware word representations within a standard contrastive learning setup. We experiment with a series of standard and comprehensive WiC benchmarks across multiple languages. Our fully unsupervised MirrorWiC models obtain substantial gains over off-the-shelf PLMs across all monolingual, multilingual, and cross-lingual setups. Moreover, on some standard WiC benchmarks, MirrorWiC is even on par with supervised models fine-tuned with in-task data and sense labels.

Keyword: cs.CL
URL: https://dx.doi.org/10.17863/cam.78495 https://www.repository.cam.ac.uk/handle/1810/331050
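
The abstract describes the training objective only at a high level. As a rough, hypothetical illustration of what a "standard contrastive learning setup" over word-in-context embeddings can look like, the sketch below implements an InfoNCE-style loss in PyTorch, in which two encoder views of the same target word form the positive pair and the other in-batch words act as negatives. The function name, temperature value, and the dropout-perturbed second view are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal sketch of an InfoNCE-style contrastive objective of the kind the
# abstract describes. Hypothetical illustration only: the actual MirrorWiC
# training procedure (augmentation, pooling, hyperparameters) is defined in
# the paper, not here.
import torch
import torch.nn.functional as F

def info_nce_loss(view_a: torch.Tensor, view_b: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """Contrastive loss over two 'views' of the same target-word embeddings.

    view_a, view_b: (batch, dim) contextual embeddings of the same target
    words, e.g. from two dropout-perturbed encoder passes over identical
    sentences. Row i of view_a and row i of view_b form the positive pair;
    all other rows in the batch serve as in-batch negatives.
    """
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature            # (batch, batch) cosine similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)     # positives lie on the diagonal

# Usage sketch: in practice the embeddings would come from a PLM's hidden
# states at the target word's position; random tensors stand in for them here.
if __name__ == "__main__":
    emb_a = torch.randn(8, 768)
    emb_b = emb_a + 0.01 * torch.randn(8, 768)  # stand-in for a second, perturbed pass
    print(info_nce_loss(emb_a, emb_b).item())
```
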
2 | Context vs Target Word: Quantifying Biases in Lexical Semantic Datasets
3 | AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples
4 | Improving Machine Translation of Rare and Unseen Word Senses
5 | XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
6 | Investigating cross-lingual alignment methods for contextualized embeddings with Token-level evaluation
7 | Second-order contexts from lexical substitutes for few-shot learning of word representations