8 | A surprisal–duration trade-off across and within the world's languages

13 | Modeling the Unigram Distribution
In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (2021)

14 | What About the Precedent: An Information-Theoretic Analysis of Common Law
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)

15 | Finding Concept-specific Biases in Form–Meaning Associations
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)

16 | A Non-Linear Structural Probe
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)

17 | Disambiguatory Signals are Stronger in Word-initial Positions
In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (2021)

18 | How (Non-)Optimal is the Lexicon?
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)

19 | A Bayesian Framework for Information-Theoretic Probing
In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2021)

Abstract:
Pimentel et al. (2020) recently analysed probing from an information-theoretic perspective. They argue that probing should be seen as approximating a mutual information. This led to the rather unintuitive conclusion that representations encode exactly the same information about a target task as the original sentences. The mutual information, however, assumes the true probability distribution of a pair of random variables is known, leading to unintuitive results in settings where it is not. This paper proposes a new framework to measure what we term Bayesian mutual information, which analyses information from the perspective of Bayesian agents—allowing for more intuitive findings in scenarios with finite data. For instance, under Bayesian MI we have that data can add information, processing can help, and information can hurt, which makes it more intuitive for machine learning applications. Finally, we apply our framework to probing where we believe Bayesian mutual information naturally operationalises ease of extraction by explicitly limiting the available background knowledge to solve a task.
URL: https://doi.org/10.3929/ethz-b-000518995 https://hdl.handle.net/20.500.11850/518995
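
A schematic reading of the contrast described in the abstract (notation assumed here for illustration, not taken from the paper): classical mutual information scores the pair of variables against the true distribution p, whereas a Bayesian-agent variant scores cross-entropies under beliefs q_D that the agent has formed from finite data D:

\[
\mathrm{I}(Y;R) \;=\; \mathrm{H}(Y) - \mathrm{H}(Y \mid R),
\qquad
\mathrm{I}_{\mathcal{D}}(Y;R) \;=\; \mathrm{H}_{q_{\mathcal{D}}}(Y) - \mathrm{H}_{q_{\mathcal{D}}}(Y \mid R),
\]
\[
\text{where, e.g.,}\quad
\mathrm{H}_{q_{\mathcal{D}}}(Y \mid R) \;=\; -\,\mathbb{E}_{p(y,\,r)}\bigl[\log q_{\mathcal{D}}(y \mid r)\bigr]
\]

is the cross-entropy of the agent's beliefs against the true distribution. Because both terms depend on the agent's beliefs rather than on p alone, the quantity can grow as more data refines q_D, can change under further processing, and can even be negative ("information can hurt"), consistent with the behaviours the abstract lists; the paper's exact definitions may differ from this sketch.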