2
On Homophony and Rényi Entropy ...
Abstract:
Homophony's widespread presence in natural languages is a controversial topic. Recent theories of language optimality have tried to justify its prevalence, despite its negative effects on cognitive processing time; e.g., Piantadosi et al. (2012) argued homophony enables the reuse of efficient wordforms and is thus beneficial for languages. This hypothesis has recently been challenged by Trott and Bergen (2020), who posit that good wordforms are more often homophonous simply because they are more phonotactically probable. In this paper, we join in on the debate. We first propose a new information-theoretic quantification of a language's homophony: the sample Rényi entropy. Then, we use this quantification to revisit Trott and Bergen's claims. While their point is theoretically sound, a specific methodological issue in their experiments raises doubts about their results. After addressing this issue, we find no clear pressure either towards or against homophony -- a much more nuanced result than either ...
Accepted for publication in EMNLP 2021. Code available in https://github.com/rycolab/homophony-as-renyi-entropy ...
Keyword:
Computation and Language (cs.CL); FOS: Computer and information sciences
URL: https://arxiv.org/abs/2109.13766 https://dx.doi.org/10.48550/arxiv.2109.13766
BASE
3
Finding Concept-specific Biases in Form--Meaning Associations ...
BASE
4
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models ...
BASE
7
Disambiguatory Signals are Stronger in Word-initial Positions ...
BASE
8
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection ...
BASE