5 |
Finding Concept-specific Biases in Form–Meaning Associations ...
Abstract:
This work presents an information-theoretic operationalisation of cross-linguistic non-arbitrariness. It is not a new idea that there are small, cross-linguistic associations between the forms and meanings of words. For instance, it has been claimed (Blasi et al., 2016) that the word for "tongue" is more likely than chance to contain the phone [l]. By controlling for the influence of language family and geographic proximity within a very large concept-aligned, cross-lingual lexicon, we extend methods previously used to detect within-language non-arbitrariness (Pimentel et al., 2019) to measure cross-linguistic associations. We find that there is a significant effect of non-arbitrariness, but it is unsurprisingly small (less than 0.5% on average according to our information-theoretic estimate). We also provide a concept-level analysis which shows that a quarter of the concepts considered in our work exhibit a significant level of cross-linguistic non-arbitrariness. In sum, the paper provides new methods to ...
Accepted at NAACL 2021. This is the camera-ready version. Code is available at https://github.com/rycolab/form-meaning-associations ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2104.06325 https://arxiv.org/abs/2104.06325
BASE
6 |
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models ...
BASE
11 |
A surprisal–duration trade-off across and within the world's languages ...
BASE
13 |
What About the Precedent: An Information-Theoretic Analysis of Common Law ...
BASE
15 |
Finding Concept-specific Biases in Form–Meaning Associations ...
BASE
17 |
Disambiguatory Signals are Stronger in Word-initial Positions ...
BASE
18 |
Modeling the Unigram Distribution
In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (2021)
BASE
19 |
What About the Precedent: An Information-Theoretic Analysis of Common Law
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
BASE
20 |
Finding Concept-specific Biases in Form–Meaning Associations
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
BASE