41 |
Morphologically Annotated Corpora and Morphological Analyzers for Moroccan and Sanaani Yemeni Arabic
|
|
|
|
In: 10th Language Resources and Evaluation Conference (LREC 2016) ; https://hal.archives-ouvertes.fr/hal-01349201 ; 10th Language Resources and Evaluation Conference (LREC 2016), May 2016, Portoroz, Slovenia (2016)
|
|
BASE
|
|
42 |
A Large Scale Corpus of Gulf Arabic
|
|
|
|
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-01349204 ; Language Resources and Evaluation Conference, 2016, Portoroz, Slovenia (2016)
|
|
Abstract:
Most Arabic natural language processing tools and resources are developed to serve Modern Standard Arabic (MSA), which is the official written language in the Arab World. Some Dialectal Arabic varieties, notably Egyptian Arabic, have received some attention lately and have a growing collection of resources that include annotated corpora and morphological analyzers and taggers. Gulf Arabic, however, lags behind in that respect. In this paper, we present the Gumar Corpus, a large-scale corpus of Gulf Arabic consisting of 110 million words from 1,200 forum novels. We annotate the corpus for sub-dialect information at the document level. We also present results of a preliminary study in the morphological annotation of Gulf Arabic which includes developing guidelines for a conventional orthography. The text of the corpus is publicly browsable through a web interface we developed for it.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO]Computer Science [cs]; Arabic Dialects; Corpus; Gulf Arabic; Large-Scale
|
|
URL: https://hal.archives-ouvertes.fr/hal-01349204/file/GulfArabicCorpus-LREC2016.pdf https://hal.archives-ouvertes.fr/hal-01349204 https://hal.archives-ouvertes.fr/hal-01349204/document
|
|
BASE
|
|
43 |
Exploiting Arabic Diacritization for High Quality Automatic Annotation
|
|
|
|
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-01349206 ; Language Resources and Evaluation Conference, 2016, Portoroz, Slovenia (2016)
|
|
BASE
|
|
44 |
DALILA: The Dialectal Arabic Linguistic Learning Assistant
|
|
|
|
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-01349203 ; Language Resources and Evaluation Conference, 2016, Portoroz, Slovenia (2016)
|
|
BASE
|
|
45 |
Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT'2015 ...
|
|
|
|
BASE
|
|
47 |
A Conventional Orthography for Algerian Arabic
|
|
|
|
In: Proceedings of the Second Workshop on Arabic Natural Language Processing ; the Second Workshop on Arabic Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-02012254 ; the Second Workshop on Arabic Natural Language Processing, 2015, Beijing, China. pp.69 - 79 (2015)
|
|
BASE
|
|
48 |
POS-tagging of Tunisian Dialect Using Standard Arabic Resources and Tools
|
|
|
|
In: Proceedings of the Second Workshop on Arabic Natural Language Processing ; Workshop on Arabic Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-01464860 ; Workshop on Arabic Natural Language Processing, Jul 2015, Beijing, China. pp.59 - 68, ⟨10.18653/v1/W15-3207⟩ (2015)
|
|
BASE
|
|
49 |
Conventional Orthography for Dialectal Arabic (CODA): Principles and Guidelines -- Egyptian Arabic - Version 0.7 - March 2012
|
|
|
|
BASE
|
|
50 |
A Corpus and Phonetic Dictionary for Tunisian Arabic Speech Recognition
|
|
|
|
In: The 9th edition of the Language Resources and Evaluation Conference (LREC 2014) ; https://hal.archives-ouvertes.fr/hal-01433247 ; The 9th edition of the Language Resources and Evaluation Conference (LREC 2014), 2014, Reykjavik, Iceland (2014)
|
|
BASE
|
|
51 |
Conventional Orthography for Dialectal Arabic (CODA): Principles and Guidelines -- Egyptian Arabic - Version 0.7 - March 2012 ...
|
|
|
|
BASE
|
|
52 |
Domain and Dialect Adaptation for Machine Translation into Egyptian Arabic ...
|
|
|
|
BASE
|
|
53 |
Domain and Dialect Adaptation for Machine Translation into Egyptian Arabic ...
|
|
|
|
BASE
|
|
56 |
Un système de traduction de verbes entre arabe standard et arabe dialectal par analyse morphologique profonde
|
|
|
|
In: Traitement Automatique des Langues Naturelles ; https://hal.archives-ouvertes.fr/hal-00908795 ; Traitement Automatique des Langues Naturelles, Jun 2013, France. pp.396 - 406 (2013)
|
|
BASE
|
|
57 |
Overview of the SPMRL 2013 shared task: cross-framework evaluation of parsing morphologically rich languages
|
|
|
|
In: Seddah, Djamé, Tsarfaty, Reut, Kübler, Sandra, Candito, Marie, Choi, Jinho, Farkas, Richard, Foster, Jennifer (orcid:0000-0002-7789-4853), Goenaga, Iakes, Gojenola, Koldo, Goldberg, Yoav, Green, Spence, Habash, Nizar, Kuhlmann, Marco, Maier, Wolfgang, Nivre, Joakim, Przepiórkowski, Adam, Roth, Ryan, Seeker, Wolfgang, Versley, Yannick, Vincze, Veronika, Wolinski, Marcin, Wróblewska, Alina and Villemonte de la Clergerie, Eric (2013) Overview of the SPMRL 2013 shared task: cross-framework evaluation of parsing morphologically rich languages. In: Fourth Workshop on Statistical Parsing of Morphologically Rich Languages, 18 Oct 2013, Seattle, WA. (2013)
|
|
BASE
|
|
58 |
Annotation Guidelines for Arabic Nominal Gender, Number, and Rationality
|
|
|
|
BASE
|
|
59 |
LDC Arabic Treebanks and Associated Corpora: Data Divisions Manual
|
|
|
|
BASE
|
|
60 |
The Effects of Factorizing Root and Pattern Mapping in Bidirectional Tunisian - Standard Arabic Machine Translation
|
|
|
|
In: MT Summit 2013 ; https://hal.archives-ouvertes.fr/hal-00908761 ; MT Summit 2013, Sep 2013, France. No print edition (2013)
|
|
BASE
|
|