1 |
Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers ...
BASE
2 |
Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer ...
4 |
Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers ...
5 |
Tuiteamos o pongamos un tuit? Investigating the Social Constraints of Loanword Integration in Spanish Social Media ...
6 |
Tuiteamos o pongamos un tuit? Investigating the Social Constraints of Loanword Integration in Spanish Social Media
In: Proceedings of the Society for Computation in Linguistics (2021)
7 |
Will it Unblend?
In: Proceedings of the Society for Computation in Linguistics (2021)
Abstract:
Natural language processing systems often struggle with out-of-vocabulary (OOV) terms, which do not appear in training data. Blends, such as *innoventor*, are one particularly challenging class of OOV, as they are formed by fusing together two or more bases that relate to the intended meaning in unpredictable manners and degrees. In this work, we run experiments on a novel dataset of English OOV blends to quantify the difficulty of interpreting the meanings of blends by large-scale contextual language models such as BERT. We first show that BERT's processing of these blends does not fully access the component meanings, leaving their contextual representations semantically impoverished. We find this is mostly due to the loss of characters resulting from blend formation. Then, we assess how easily different models can recognize the structure and recover the origin of blends, and find that context-aware embedding systems outperform character-level and context-free embeddings, although their results are still far from satisfactory.
Keywords:
blends; compounds; Computational Linguistics; contextual-models; oov; out-of-vocabulary; portmanteaux; segmentation
URL: https://scholarworks.umass.edu/scil/vol4/iss1/62 https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1189&context=scil
8 |
Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers ...
10 |
How We Do Things With Words: Analyzing Text as Social and Cultural Data
In: Front Artif Intell (2020)
14 |
The Referential Reader: A Recurrent Entity Network for Anaphora Resolution ...
15 |
Discovering Sociolinguistic Associations with Structured Sparsity ...
16 |
Making "fetch" happen: The influence of social and linguistic context on nonstandard word growth and decline ...
17 |
Sí o no, què penses? Catalonian Independence and Linguistic Identity on Social Media ...
18 |
Mind Your POV: Convergence of Articles and Editors Towards Wikipedia's Neutrality Norm ...
19 |
Making "fetch" happen: The influence of social and linguistic context on nonstandard word growth and decline ...
20 |
#anorexia, #anarexia, #anarexyia: Characterizing Online Community Practices with Orthographic Variation ...