1. Cyberbullying Classifiers are Sensitive to Model-Agnostic Perturbations ...
Abstract:
Few studies have investigated the role of model-agnostic adversarial behavior in toxic content classification. Because toxicity classifiers rely predominantly on lexical cues, (deliberately) creative and evolving language use can undermine the utility of current corpora and state-of-the-art models when they are deployed for content moderation. The less training data is available, the more vulnerable models might become. This study is, to our knowledge, the first to investigate the effect of adversarial behavior and augmentation for cyberbullying detection. We demonstrate that model-agnostic lexical substitutions significantly hurt classifier performance. Moreover, we show that when these perturbed samples are used for augmentation, models become robust against word-level perturbations at a slight trade-off in overall task performance. Augmentations proposed in prior work on toxicity prove to be less effective. Our results underline the need for such evaluations in online harm areas with small ... : Submitted to LREC 2022 ...
Keywords:
Computation and Language cs.CL; Computers and Society cs.CY; FOS Computer and information sciences; Social and Information Networks cs.SI
URL: https://arxiv.org/abs/2201.06384 https://dx.doi.org/10.48550/arxiv.2201.06384
BASE

3. Mapping probability word problems to executable representations ...

4. Interlocutors’ Age Impacts Teenagers’ Online Writing Style: Accommodation in Intra- and Intergenerational Online Conversations
In: Front Artif Intell (2021)

6. The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-02892154 ; Language Resources and Evaluation Conference, ELDA/ELRA, May 2020, Marseille, France ; https://lrec2020.lrec-conf.org/en/ (2020)

11. Neural Machine Translation of Artwork Titles Using Iconclass Codes ...

12. A deep generative approach to native language identification ...

13. Effective weakly supervised semantic frame induction using expression sharing in hierarchical hidden Markov models ...

14. What makes a distributional context useful? Lexical diversity is more important than frequency ...

15. Children Probably Store Short Rather Than Frequent or Predictable Chunks: Quantitative Evidence From a Corpus Study

16. Overview of PAN 2019: Bots and Gender Profiling, Celebrity Profiling, Cross-domain Authorship Attribution and Style Change Detection

17. Multilingual Cross-domain Perspectives on Online Hate Speech ...

18. Patient representation learning and interpretable evaluation using clinical notes ...

19. Lexical category acquisition is facilitated by uncertainty in distributional co-occurrences