1 | MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models ... | BASE
2 | MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models ... | BASE
3 | AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples ... | BASE
4 | Improving Machine Translation of Rare and Unseen Word Senses ... | BASE
Abstract:
The performance of NMT systems has improved drastically in the past few years but the translation of multi-sense words still poses a challenge. Since word senses are not represented uniformly in the parallel corpora used for training, there is an excessive use of the most frequent sense in MT output. In this work, we propose CmBT (Contextually-mined Back-Translation), an approach for improving multi-sense word translation leveraging pre-trained cross-lingual contextual word representations (CCWRs). Because of their contextual sensitivity and their large pre-training data, CCWRs can easily capture word senses that are missing or very rare in parallel corpora used to train MT. Specifically, CmBT applies bilingual lexicon induction on CCWRs to mine sense-specific target sentences from a monolingual dataset, and then back-translates these sentences to generate a pseudo parallel corpus as additional training data for an MT system. We test the translation quality of ambiguous words on the MuCoW test suite, which ...
Keyword:
Computational Linguistics; Machine Learning; Machine Learning and Data Mining; Machine translation; Natural Language Processing; Neural Network
URL: https://dx.doi.org/10.48448/hhqm-5908 https://underline.io/lecture/39468-improving-machine-translation-of-rare-and-unseen-word-senses
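The abstract of entry 4 describes a two-step pipeline: mine target-side sentences whose contextual representation is close to a given source-word sense, then back-translate them into a pseudo-parallel corpus. The following is only a rough sketch of that idea, not the authors' implementation. The toy 3-dimensional vectors stand in for CCWRs, and the names `cosine`, `mine_target_sentences`, `build_pseudo_parallel`, and the stub back-translator are all hypothetical, introduced here for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mine_target_sentences(query_vec, candidates, k=2):
    """Return the k target sentences whose vectors are nearest to query_vec.

    candidates is a list of (sentence, vector) pairs; in a real setting the
    vectors would come from a cross-lingual contextual encoder (assumption).
    """
    scored = [(cosine(query_vec, vec), sent) for sent, vec in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sent for _, sent in scored[:k]]

def build_pseudo_parallel(mined, back_translate):
    """Pair each mined target sentence with its back-translated source side."""
    return [(back_translate(target), target) for target in mined]

# Made-up vectors: the query represents a source word used in a rare sense.
query = [1.0, 0.0, 0.0]
candidates = [
    ("target sentence A", [0.9, 0.1, 0.0]),
    ("target sentence B", [0.0, 1.0, 0.0]),
    ("target sentence C", [0.8, 0.0, 0.2]),
]

mined = mine_target_sentences(query, candidates, k=2)

# Stub standing in for a target-to-source MT system (hypothetical).
pairs = build_pseudo_parallel(mined, lambda t: "<src> " + t)
```

In this toy run the two sentences closest to the query are kept and paired with stub source sides; a real system would replace the stub with an actual target-to-source translation model.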
5 | XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning ... | BASE
6 | XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning | BASE
7 | XCOPA: A multilingual dataset for causal commonsense reasoning | BASE
8 | Investigating cross-lingual alignment methods for contextualized embeddings with Token-level evaluation ... | BASE
9 | Second-order contexts from lexical substitutes for few-shot learning of word representations ... | BASE
10 | Second-order contexts from lexical substitutes for few-shot learning of word representations | BASE
11 | Investigating cross-lingual alignment methods for contextualized embeddings with Token-level evaluation | BASE