1 |
EMBEDDIA tools output example corpus of Estonian, Croatian and Latvian news articles 1.0
BASE
2 |
Retweet communities reveal the main sources of hate speech
In: PLoS One (2022)
3 |
Slovenian Twitter dataset 2018-2020 1.0
Abstract:
The dataset represents Slovenian-language Twitter content from 2018 to 2020. It consists of tweet IDs, retweet IDs, pseudo-anonymized user IDs, publication dates, and hate speech labels (acceptable, inappropriate, offensive, violent) assigned automatically using the model at https://huggingface.co/IMSyPP/hate_speech_slo. The dataset is the basis for the following two papers:
- "Retweet communities reveal the main sources of hate speech": https://arxiv.org/pdf/2105.14898.pdf
- "Community evolution in retweet networks": https://arxiv.org/pdf/2105.06214.pdf
Keywords:
hate speech; retweet networks; Twitter
URL: http://hdl.handle.net/11356/1423
9 |
Investigating cross-lingual training for offensive language detection
In: PeerJ Comput Sci (2021)