2. Probing BERT's priors with serial reproduction chains ...
Abstract:
Sampling is a promising bottom-up method for exposing what generative models have learned about language, but it remains unclear how to generate representative samples from popular masked language models (MLMs) like BERT. The MLM objective yields a dependency network with no guarantee of consistent conditional distributions, posing a problem for naive approaches. Drawing from theories of iterated learning in cognitive science, we explore the use of serial reproduction chains to sample from BERT's priors. In particular, we observe that a unique and consistent estimator of the ground-truth joint distribution is given by a Generative Stochastic Network (GSN) sampler, which randomly selects which token to mask and reconstruct on each step. We show that the lexical and syntactic statistics of sentences from GSN chains closely match the ground-truth corpus distribution and perform better than other methods in a large corpus of naturalness judgments. Our findings establish a firmer theoretical foundation for ... : Findings of ACL 2022 ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2202.12226 https://arxiv.org/abs/2202.12226
BASE
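The sampling procedure described in the abstract above (on each step, pick a random position, mask it, and resample it from the model's conditional) can be illustrated with a minimal sketch. This is not the authors' implementation: the corpus, `toy_conditional`, `gsn_step`, and `run_chain` are hypothetical names, and a small neighbour-count model stands in for BERT's masked-token distribution so that the example runs without any external library.

```python
import random

# Toy corpus standing in for training data (hypothetical, not from the paper).
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat lay on the mat",
    "a dog lay on the rug",
]
SENTENCES = [s.split() for s in CORPUS]
VOCAB = sorted({w for s in SENTENCES for w in s})


def toy_conditional(tokens, i):
    """Stand-in for an MLM: unnormalised weights over VOCAB for position i,
    based on how often each word appears with the same left/right neighbour."""
    left = tokens[i - 1] if i > 0 else None
    right = tokens[i + 1] if i < len(tokens) - 1 else None
    weights = {w: 0.01 for w in VOCAB}  # small smoothing mass for every word
    for s in SENTENCES:
        for j, w in enumerate(s):
            l = s[j - 1] if j > 0 else None
            r = s[j + 1] if j < len(s) - 1 else None
            weights[w] += (l == left) + (r == right)
    return weights


def gsn_step(tokens, rng):
    """One chain step: choose a random position, 'mask' it, and resample it."""
    i = rng.randrange(len(tokens))
    weights = toy_conditional(tokens, i)
    words = list(weights)
    new = rng.choices(words, weights=[weights[w] for w in words], k=1)[0]
    return tokens[:i] + [new] + tokens[i + 1:]


def run_chain(init, n_steps, seed=0):
    """Run the chain for n_steps from an initial token list."""
    rng = random.Random(seed)
    tokens = list(init)
    for _ in range(n_steps):
        tokens = gsn_step(tokens, rng)
    return tokens


if __name__ == "__main__":
    start = "the cat sat on the mat".split()
    print(" ".join(run_chain(start, n_steps=50)))
```

With a real masked language model, `toy_conditional` would be replaced by a query for the model's predicted distribution at the masked position; the random choice of position in `gsn_step` is the part that corresponds to the sampler named in the abstract.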
3. From partners to populations: A hierarchical Bayesian account of coordination and convention ...
4. Shades of confusion: Lexical uncertainty modulates ad hoc coordination in an interactive communication task ...
5. Evaluating Models of Robust Word Recognition with Serial Reproduction ...
7. Generalizing meanings from partners to populations: Hierarchical inference supports convention formation on networks ...
9. Investigating representations of verb bias in neural language models ...
10. Intuitive Theories as Grammars for Causal Inference
In: MIT web domain (2019)
11. Learning Hierarchical Visual Representations in Deep Neural Networks Using Hierarchical Linguistic Labels ...
13. Word forms - not just their lengths - are optimized for efficient communication ...
14. The Hierarchical Cortical Organization of Human Speech Processing
15. Natural speech reveals the semantic maps that tile human cerebral cortex.
In: Nature, vol 532, iss 7600 (2016)
16. The Sapir-Whorf Hypothesis and Probabilistic Inference: Evidence from the Domain of Color
In: Cibelli, Emily; Xu, Yang; Austerweil, Joseph L; Griffiths, Thomas L; & Regier, Terry. (2016). The Sapir-Whorf Hypothesis and Probabilistic Inference: Evidence from the Domain of Color. PLOS ONE, 11(7), e0158725. doi:10.1371/journal.pone.0158725. UC Berkeley: UC Berkeley Library. Retrieved from: http://www.escholarship.org/uc/item/1pt8b5dj (2016)
17. The Sapir-Whorf Hypothesis and Probabilistic Inference: Evidence from the Domain of Color.
In: PloS one, vol 11, iss 7 (2016)
18. The Sapir-Whorf Hypothesis and Probabilistic Inference: Evidence from the Domain of Color
19. Natural speech reveals the semantic maps that tile human cerebral cortex