1.
Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models ...
Abstract:
Relations between words are governed by hierarchical structure rather than linear ordering. Sequence-to-sequence (seq2seq) models, despite their success in downstream NLP applications, often fail to generalize in a hierarchy-sensitive manner when performing syntactic transformations (for example, transforming declarative sentences into questions). However, syntactic evaluations of seq2seq models have only observed models that were not pre-trained on natural language data before being trained to perform syntactic transformations, in spite of the fact that pre-training has been found to induce hierarchical linguistic generalizations in language models; in other words, the syntactic capabilities of seq2seq models may have been greatly understated. We address this gap using the pre-trained seq2seq models T5 and BART, as well as their multilingual variants mT5 and mBART. We evaluate whether they generalize hierarchically on two transformations in two languages: question formation and passivization in English and ...
Accepted to Findings of ACL 2022 ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2203.09397 https://arxiv.org/abs/2203.09397
2.
Can language models capture syntactic associations without surface cues? A case study of reflexive anaphor licensing in English control constructions
In: Proceedings of the Society for Computation in Linguistics (2022)
6.
NOPE: A Corpus of Naturally-Occurring Presuppositions in English ...
8.
Predicting Scalar Inferences From "Or" to "Not Both" Using Neural Sentence Encoders
In: Proceedings of the Society for Computation in Linguistics (2021)
12.
Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection ...
15.
Harnessing the linguistic signal to predict scalar inferences ...
18.
Universal Dependencies 2.2
In: https://hal.archives-ouvertes.fr/hal-01930733 (2018)