1 |
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
In: https://hal.inria.fr/hal-03161685 ; 2021 (2021)
2 |
Can Multilingual Language Models Transfer to an Unseen Dialect? A Case Study on North African Arabizi
In: https://hal.inria.fr/hal-03161677 ; 2021 (2021)
3 |
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
In: EACL 2021 - The 16th Conference of the European Chapter of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03239087 ; EACL 2021 - The 16th Conference of the European Chapter of the Association for Computational Linguistics, Apr 2021, Kyiv / Virtual, Ukraine ; https://2021.eacl.org/ (2021)
4 |
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
In: NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies ; https://hal.inria.fr/hal-03251105 ; NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jun 2021, Mexico City, Mexico (2021)
5 |
Cross-Lingual GenQA: A Language-Agnostic Generative Question Answering Approach for Open-Domain Question Answering ...
6 |
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT ...
7 |
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models ...
8 |
Establishing a New State-of-the-Art for French Named Entity Recognition
In: LREC 2020 - 12th Language Resources and Evaluation Conference ; https://hal.inria.fr/hal-02617950 ; LREC 2020 - 12th Language Resources and Evaluation Conference, May 2020, Marseille, France ; http://www.lrec-conf.org (2020)
9 |
Building a User-Generated Content North-African Arabizi Treebank: Tackling Hell
In: ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-02889804 ; ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle / Virtual, United States. ⟨10.18653/v1/2020.acl-main.107⟩ (2020)
Abstract:
We introduce the first treebank for a romanized user-generated content variety of Algerian, a North-African Arabic dialect known for its frequent use of code-switching. Made up of 1500 sentences, fully annotated in morpho-syntax and Universal Dependency syntax, with full translation at both the word and the sentence levels, this treebank is made freely available. It is supplemented with 50k unlabeled sentences collected from Common Crawl and web-crawled data using intensive data-mining techniques. Preliminary experiments demonstrate its usefulness for POS tagging and dependency parsing. We believe that what we present in this paper is useful beyond the low-resource language community. This is the first time that enough unlabeled and annotated data has been provided for an emerging user-generated content dialectal language with rich morphology and code-switching, making it a challenging test-bed for the most recent NLP approaches.
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
URL: https://hal.inria.fr/hal-02889804/file/Building_an_Arabizi_Treebank__Tackling_Hell.pdf https://doi.org/10.18653/v1/2020.acl-main.107 https://hal.inria.fr/hal-02889804/document https://hal.inria.fr/hal-02889804
10 |
CamemBERT: a Tasty French Language Model
In: ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-02889805 ; ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle / Virtual, United States. ⟨10.18653/v1/2020.acl-main.645⟩ (2020)
11 |
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
In: https://hal.inria.fr/hal-03109106 ; 2020 (2020)
12 |
Can Multilingual Language Models Transfer to an Unseen Dialect? A Case Study on North African Arabizi ...
13 |
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models ...
14 |
Unsupervised Learning for Handling Code-Mixed Data: A Case Study on POS Tagging of North-African Arabizi Dialect
In: EurNLP - First annual EurNLP ; https://hal.archives-ouvertes.fr/hal-02270527 ; EurNLP - First annual EurNLP, Oct 2019, London, United Kingdom (2019)
15 |
CamemBERT: a Tasty French Language Model
In: https://hal.inria.fr/hal-02445946 ; 2019 (2019)
16 |
Enhancing BERT for Lexical Normalization
In: The 5th Workshop on Noisy User-generated Text (W-NUT) ; https://hal.inria.fr/hal-02294316 ; The 5th Workshop on Noisy User-generated Text (W-NUT), Nov 2019, Hong Kong, China (2019)
17 |
ELMoLex: Connecting ELMo and Lexicon features for Dependency Parsing
In: CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies ; https://hal.inria.fr/hal-01959045 ; CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Oct 2018, Brussels, Belgium. ⟨10.18653/v1/K18-2023⟩ (2018)