41 |
Parameter space factorization for zero-shot learning across tasks and languages ...
|
|
|
|
BASE
|
|
42 |
Higher-order Derivatives of Weighted Finite-state Machines ...
|
|
|
|
BASE
|
|
44 |
On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs ...
|
|
|
|
BASE
|
|
45 |
On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs ...
|
|
|
|
BASE
|
|
47 |
Disambiguatory Signals are Stronger in Word-initial Positions ...
|
|
|
|
BASE
|
|
49 |
Multimodal pretraining unmasked: A meta-analysis and a unified framework of vision-and-language BERTs
|
|
|
|
In: Transactions of the Association for Computational Linguistics, 9 (2021)
|
|
BASE
|
|
50 |
Modeling the Unigram Distribution
|
|
|
|
In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (2021)
|
|
BASE
|
|
51 |
On Finding the K-best Non-projective Dependency Trees
|
|
|
|
In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (2021)
|
|
BASE
|
|
52 |
Higher-order Derivatives of Weighted Finite-state Machines
|
|
|
|
In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (2021)
|
|
BASE
|
|
53 |
Efficient computation of expectations under spanning tree distributions
|
|
|
|
In: Transactions of the Association for Computational Linguistics, 9 (2021)
|
|
BASE
|
|
54 |
Do Syntactic Probes Probe Syntax? Experiments with Jabberwocky Probing
|
|
|
|
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
|
|
Abstract:
Analysing whether neural language models encode linguistic information has become popular in NLP. One method of doing so, which is frequently cited to support the claim that models like BERT encode syntax, is called probing; probes are small supervised models trained to extract linguistic information from another model’s output. If a probe is able to predict a particular structure, it is argued that the model whose output it is trained on must have implicitly learnt to encode it. However, drawing a generalisation about a model’s linguistic knowledge about a specific phenomenon based on what a probe is able to learn may be problematic: in this work, we show that semantic cues in training data mean that syntactic probes do not properly isolate syntax. We generate a new corpus of semantically nonsensical but syntactically well-formed Jabberwocky sentences, which we use to evaluate two probes trained on normal data. We train the probes on several popular language models (BERT, GPT-2, and RoBERTa), and find that in all settings they perform worse when evaluated on these data, for one probe by an average of 15.4 UUAS points absolute. Although in most cases they still outperform the baselines, their lead is reduced substantially, e.g. by 53% in the case of BERT for one probe. This begs the question: what empirical scores constitute knowing syntax?
|
|
URL: https://hdl.handle.net/20.500.11850/518986 https://doi.org/10.3929/ethz-b-000518986
|
|
BASE
|
|
55 |
What About the Precedent: An Information-Theoretic Analysis of Common Law
|
|
|
|
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
|
|
BASE
|
|
56 |
Applying the Transformer to Character-level Transduction
|
|
|
|
In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (2021)
|
|
BASE
|
|
57 |
Classifying Dyads for Militarized Conflict Analysis
|
|
|
|
In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2021)
|
|
BASE
|
|
58 |
Finding Concept-specific Biases in Form–Meaning Associations
|
|
|
|
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
|
|
BASE
|
|
59 |
Efficient Sampling of Dependency Structure
|
|
|
|
In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2021)
|
|
BASE
|
|
60 |
A Non-Linear Structural Probe
|
|
|
|
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
|
|
BASE
|
|
|
|