2. Character Alignment in Morphologically Complex Translation Sets for Related Languages
3. Composing Byte-Pair Encodings for Morphological Sequence Classification
4. Variation in Universal Dependencies annotation: A token based typological case study on adpossessive constructions
5. Corpus evidence for word order freezing in Russian and German
7. Noise Isn't Always Negative: Countering Exposure Bias in Sequence-to-Sequence Inflection Models
8. Exhaustive Entity Recognition for Coptic - Challenges and Solutions
9. Imagining Grounded Conceptual Representations from Perceptual Information in Situated Guessing Games
10. Attentively Embracing Noise for Robust Latent Representation in BERT
13. Classifier Probes May Just Learn from Linear Context Features
14. Seeing the world through text: Evaluating image descriptions for commonsense reasoning in machine reading comprehension
16. Manifold Learning-based Word Representation Refinement Incorporating Global and Local Information
17. HMSid and HMSid2 at PARSEME Shared Task 2020: Computational Corpus Linguistics and unseen-in-training MWEs
18. Multi-dialect Arabic BERT for Country-level Dialect Identification
Abstract:
Arabic dialect identification is a complex problem due to a number of inherent properties of the language itself. In this paper, we present the experiments conducted and the models developed by our competing team, Mawdoo3 AI, along the way to achieving our winning solution to subtask 1 of the Nuanced Arabic Dialect Identification (NADI) shared task. The dialect identification subtask provides 21,000 country-level labeled tweets covering all 21 Arab countries. An unlabeled corpus of 10M tweets from the same domain is also provided by the competition organizers for optional use. Our winning solution came in the form of an ensemble of different training iterations of our pre-trained BERT model, which achieved a micro-averaged F1-score of 26.78% on the subtask at hand. We publicly release the pre-trained language model component of our winning solution under the name Multi-dialect-Arabic-BERT, for any interested researcher.
Keyword: Natural Language Processing
URL: https://underline.io/lecture/6530-multi-dialect-arabic-bert-for-country-level-dialect-identification
DOI: https://dx.doi.org/10.48448/xm5v-rh49
20. Exploring End-to-End Differentiable Natural Logic Modeling