1 | Does Corpus Quality Really Matter for Low-Resource Languages? ... | BASE
2 | Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining ... | BASE
4 | PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining ... | BASE
6 | Beyond Offline Mapping: Learning Cross-lingual Word Embeddings through Context Anchoring ... | BASE
7 | Multilingual Machine Translation: Closing the Gap between Shared and Language-specific Encoder-Decoders ... | BASE
8 | Training Multilingual Machine Translation by Alternately Freezing Language-Specific Encoders-Decoders ... | BASE
9 | A Call for More Rigor in Unsupervised Cross-lingual Learning ... | BASE
10 | Beyond Offline Mapping: Learning Cross-lingual Word Embeddings through Context Anchoring ... | BASE
Abstract: Recent research on cross-lingual word embeddings has been dominated by unsupervised mapping approaches that align monolingual embeddings. Such methods critically rely on those embeddings having a similar structure, but it was recently shown that the separate training in different languages causes departures from this assumption. In this paper, we propose an alternative approach that does not have this limitation, while requiring a weak seed dictionary (e.g., a list of identical words) as the only form of supervision. Rather than aligning two fixed embedding spaces, our method works by fixing the target language embeddings, and learning a new set of embeddings for the source language that are aligned with them. To that end, we use an extension of skip-gram that leverages translated context words as anchor points, and incorporates self-learning and iterative restarts to reduce the dependency on the initial dictionary. Our approach outperforms conventional mapping methods on bilingual lexicon induction, and ... : ACL 2021 ...
Keyword: Artificial Intelligence cs.AI; Computation and Language cs.CL; FOS Computer and information sciences; Machine Learning cs.LG
URL: https://arxiv.org/abs/2012.15715 https://dx.doi.org/10.48550/arxiv.2012.15715
11 | Translation Artifacts in Cross-lingual Transfer Learning ... | BASE
12 | Analyzing the Limitations of Cross-lingual Word Embedding Mappings ... | BASE
13 | On the Cross-lingual Transferability of Monolingual Representations ... | BASE
14 | Unsupervised Neural Machine Translation, a new paradigm solely based on monolingual text ; Traducción Automática Neuronal no Supervisada, un nuevo paradigma basado solo en textos monolingües | BASE
15 | Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond | In: Transactions of the Association for Computational Linguistics, Vol 7, Pp 597-610 (2019) | BASE
16 | Contextualized Translations of Phrasal Verbs with Distributional Compositional Semantics and Monolingual Corpora | In: Computational Linguistics, Vol 45, Iss 3, Pp 395-421 (2019) | BASE
17 | Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond ... | BASE
18 | Uncovering divergent linguistic information in word embeddings with lessons for intrinsic and extrinsic evaluation ... | BASE
19 | Lexical semantics, Basque and Spanish in QTLeap: Quality Translation by Deep Language Engineering Approaches ; QTLeap - Traducción de calidad mediante tratamientos profundos de ingeniería lingüística | BASE