5 | The Impact of Positional Encodings on Multilingual Compression
Abstract: In order to preserve word-order information in a non-autoregressive setting, transformer architectures tend to include positional knowledge, for instance by adding positional encodings to token embeddings. Several modifications have been proposed over the sinusoidal positional encodings used in the original transformer architecture; these include separating position encodings from token embeddings, or directly modifying attention weights based on the distance between word pairs. We first show that, surprisingly, while these modifications tend to improve monolingual language models, none of them result in better multilingual language models. We then answer why that is: sinusoidal encodings were explicitly designed to facilitate compositionality by allowing linear projections over arbitrary time steps. Higher variance in multilingual training distributions requires higher compression, in which case compositionality ...
Keywords: Computational Linguistics; Language Models; Machine Learning; Machine Learning and Data Mining; Natural Language Processing
URL: https://aclanthology.org/2021.emnlp-main.59/ ; https://underline.io/lecture/37780-the-impact-of-positional-encodings-on-multilingual-compression ; https://dx.doi.org/10.48448/gnbq-xb75
Source: BASE

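The abstract above hinges on the linear-projection property of sinusoidal encodings: for any fixed offset k, PE(pos + k) is a linear function of PE(pos), via a block-diagonal rotation that does not depend on pos. A minimal NumPy sketch of that property (not the paper's code; the function names here are my own):

```python
import numpy as np

def sinusoidal_encoding(num_positions, d_model):
    """Sinusoidal positional encodings from the original transformer:
    PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    """
    positions = np.arange(num_positions)[:, None]                   # (P, 1)
    freqs = 1.0 / (10000 ** (np.arange(0, d_model, 2) / d_model))   # (d/2,)
    angles = positions * freqs                                      # (P, d/2)
    pe = np.zeros((num_positions, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

def shift_matrix(k, d_model):
    """Linear map M with M @ PE[pos] = PE[pos + k] for every pos:
    one 2x2 rotation block per (sin, cos) frequency pair, using the
    angle-addition identities sin(a+b) and cos(a+b)."""
    freqs = 1.0 / (10000 ** (np.arange(0, d_model, 2) / d_model))
    M = np.zeros((d_model, d_model))
    for i, w in enumerate(freqs):
        c, s = np.cos(k * w), np.sin(k * w)
        M[2 * i, 2 * i], M[2 * i, 2 * i + 1] = c, s
        M[2 * i + 1, 2 * i], M[2 * i + 1, 2 * i + 1] = -s, c
    return M

pe = sinusoidal_encoding(64, 16)
M = shift_matrix(3, 16)
# The compositionality property: shifting by 3 positions is the same
# fixed linear projection at every position.
assert np.allclose(pe[:-3] @ M.T, pe[3:])
```

The rotation blocks depend only on the offset k, which is what lets attention express relative position from absolute encodings.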
6 | Attention Can Reflect Syntactic Structure (If You Let It)

9 | The Sensitivity of Language Models and Humans to Winograd Schema Perturbations

10 | From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers

12 | Do Neural Language Models Show Preferences for Syntactic Formalisms?

17 | Probing Multilingual Sentence Representations With X-Probe

18 | The WMT'18 Morpheval Test Suites for English-Czech, English-German, English-Finnish and Turkish-English
In: Proceedings of the Third Conference on Machine Translation (WMT 18), Oct 2018, Brussels, Belgium, pp. 550-564. https://hal.archives-ouvertes.fr/hal-01910244 ; ⟨10.18653/v1/W18-64060⟩ ; http://www.statmt.org/wmt18/ (2018)

19 | Universal Dependencies 2.2
In: https://hal.archives-ouvertes.fr/hal-01930733 (2018)