
Search in the Catalogues and Directories

Hits 1 – 20 of 38

1
A simple language-agnostic yet very strong baseline system for hate speech and offensive content identification ...
Bestgen, Yves. - : arXiv, 2022
Abstract: For automatically identifying hate speech and offensive content in tweets, the SATLab team proposes a system based on a classical supervised algorithm fed only with character n-grams, and thus completely language-agnostic. After optimization of the feature weighting and the classifier parameters, it reached, in the multilingual HASOC 2021 challenge, a medium performance level in English, the language for which it is easy to develop deep learning approaches relying on many external linguistic resources, but a far better level for the two less-resourced languages, Hindi and Marathi. It even ranks first when performance is averaged over the three tasks in these languages, outperforming many deep learning approaches. These results suggest that it is a useful reference level for evaluating the benefits of more complex approaches, such as deep learning, or of taking complementary resources into account. ... : A slightly modified version of the paper: "A simple language-agnostic yet strong baseline system for hate speech and offensive content identification". In Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation (10 p.). ceur-ws.org ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2202.02511
https://dx.doi.org/10.48550/arxiv.2202.02511
BASE
2
Using Fisher's Exact Test to Evaluate Association Measures for N-grams ...
Bestgen, Yves. - : arXiv, 2021
BASE
3
LAST at CMCL 2021 Shared Task: Predicting Gaze Data During Reading with a Gradient Boosting Decision Tree Approach ...
Bestgen, Yves. - : arXiv, 2021
BASE
4
LAST at SemEval-2021 Task 1: Improving Multi-Word Complexity Prediction Using Bigram Association Measures ...
Bestgen, Yves. - : arXiv, 2021
BASE
5
Tracking L2 writers' phraseological development using collgrams: evidence from a longitudinal EFL corpus
In: Corpora and lexis. - Leiden : Brill Rodopi (2018), 277-301
BLLDB
6
Evaluating the frequency threshold for selecting lexical bundles by means of an extension of the Fisher's exact test
In: Corpora. - Edinburgh : Univ. Press 13 (2018) 2, 205-228
BLLDB
7
Getting rid of the Chi-square and Log-likelihood tests for analysing vocabulary differences between corpora
In: Quaderns de filología. Estudis lingüístics 22 (2017), 33-56
BASE
8
Using n-grams to map registers across languages and uncover cross-linguistic contrasts: Insights from Correspondence Analysis
In: CBL (Cercle Belge de Linguistique) 2016, May 2016, Louvain-la-Neuve, Belgium (2016) ; https://hal.archives-ouvertes.fr/hal-01426811
BASE
9
Vers une analyse des différences interlinguistiques entre les genres textuels : étude de cas basée sur les n-grammes et l'analyse factorielle des correspondances
In: TALN 2016: Traitement Automatique des Langues Naturelles, Jul 2016, Paris, France (2016) ; https://hal.archives-ouvertes.fr/hal-01426820
BASE
10
Exact Expected Average Precision of the Random Baseline for System Evaluation
In: Prague Bulletin of Mathematical Linguistics, Vol 103, Iss 1, Pp 131-138 (2015)
BASE
11
Quantifying the development of phraseological competence in L2 English writing: An automated approach
In: Journal of second language writing. - Amsterdam [u.a.] : Elsevier 26 (2014), 28-41
OLC Linguistik
12
Inadequacy of the chi-squared test to examine vocabulary differences between corpora
In: LLC. - Oxford : Oxford Univ. Press 29 (2014) 2, 164
OLC Linguistik
13
The use of collocations by intermediate vs. advanced non-native writers: a bigram-based study
In: International review of applied linguistics in language teaching. - Berlin : de Gruyter 52 (2014) 3, 229-252
BLLDB
14
Construction automatique de ressources lexicales pour la fouille d'opinion. ...
Bestgen, Yves. - : ARIA, 2013
BASE
15
Toward automatic determination of the semantics of connectives in large newspaper corpora
In: Discourse processes. - London [u.a.] : Routledge, Taylor and Francis Group 41 (2006) 2, 175-193
BLLDB
OLC Linguistik
16
Improving text segmentation using latent semantic analysis : a reanalysis of Choi, Wiemer-Hastings, and Moore (2001)
In: Computational linguistics. - Cambridge, Mass. : MIT Press 32 (2006) 1, 5-12
BLLDB
OLC Linguistik
17
Improving Text Segmentation Using Latent Semantic Analysis: A Reanalysis of Choi, Wiemer-Hastings, and Moore
In: Computational linguistics. - Cambridge, Mass. : MIT Press 32 (2006) 3, 455
OLC Linguistik
18
Validation d'une méthodologie pour l'étude des marqueurs de la segmentation dans un grand corpus de textes
In: Traitement automatique des langues. - Paris : ATALA 47 (2006) 2, 89-110
BLLDB
19
Identification automatique des marqueurs globaux du discours par l'analyse des expressions récurrentes
In: Université catholique de Louvain / Institut de linguistique. Cahiers de l'Institut de Linguistique de Louvain. - Louvain 31 (2005) 2-4, 301-307
OLC Linguistik
20
How to determine the meaning and use of (causal) connectives in (large) corpora : from hand-based to automatic analyses
In: Electronic Document Week (SDN 2004), Workshop ATALA "Modelling and describing discourse organisation in the age of the digital document", La Rochelle ; https://archivesic.ccsd.cnrs.fr/sic_00001224 ; Jun 2004 (2004)
BASE

