1 |
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation ...
BASE
2 |
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection ...
Abstract:
The perceived toxicity of language can vary based on someone's identity and beliefs, but this variation is often ignored when collecting toxic language datasets, resulting in dataset and model biases. We seek to understand the who, why, and what behind biases in toxicity annotations. In two online studies with demographically and politically diverse participants, we investigate the effect of annotator identities (who) and beliefs (why), drawing from social psychology research about hate speech, free speech, racist beliefs, political leaning, and more. We disentangle what is annotated as toxic by considering posts with three characteristics: anti-Black language, African American English (AAE) dialect, and vulgarity. Our results show strong associations between annotator identity and beliefs and their ratings of toxicity. Notably, more conservative annotators and those who scored highly on our scale for racist beliefs were less likely to rate anti-Black language as toxic, but more likely to rate AAE as toxic. ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences; Human-Computer Interaction cs.HC
URL: https://arxiv.org/abs/2111.07997 https://dx.doi.org/10.48550/arxiv.2111.07997
4 |
Specializing Multilingual Language Models: An Empirical Study ...
5 |
Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? ...
6 |
Measuring Association Between Labels and Free-Text Rationales ...
7 |
Promoting Graph Awareness in Linearized Graph-to-Text Generation ...
8 |
Challenges in Automated Debiasing for Toxic Language Detection ...
9 |
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics ...
10 |
Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent ...
11 |
Competency Problems: On Finding and Removing Artifacts in Language Data ...
14 |
Semantic Comparisons for Natural Language Processing Applications
15 |
Challenges in Automated Debiasing for Toxic Language Detection
16 |
Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank ...
18 |
Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings ...
19 |
Evaluating Models' Local Decision Boundaries via Contrast Sets ...
20 |
Grounded Compositional Outputs for Adaptive Language Modeling ...