22 |
Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling ...
BASE
25 |
Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging ...
BASE
26 |
CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
In: 11th Language Resources and Evaluation Conference, May 2018, Miyazaki, Japan ; https://hal.inria.fr/hal-01786125 ; http://lrec2018.lrec-conf.org (2018)
BASE
27 |
Universal Dependencies 2.2
In: https://hal.archives-ouvertes.fr/hal-01930733 (2018)
BASE
28 |
MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction ...
BASE
31 |
Utilizing Character and Word Embeddings for Text Normalization with Sequence-to-Sequence Models ...
Abstract:
Text normalization is an important enabling technology for several NLP tasks. Recently, neural-network-based approaches have outperformed well-established models in this task. However, in languages other than English, there has been little exploration in this direction. Both the scarcity of annotated data and the complexity of the language increase the difficulty of the problem. To address these challenges, we use a sequence-to-sequence model with character-based attention, which in addition to its self-learned character embeddings, uses word embeddings pre-trained with an approach that also models subword information. This provides the neural model with access to more linguistic information especially suitable for text normalization, without large parallel corpora. We show that providing the model with word-level features bridges the gap for the neural network approach to achieve a state-of-the-art F1 score on a standard Arabic language correction shared task dataset. ... : Accepted in EMNLP 2018 ...
Keywords:
Computation and Language cs.CL; FOS Computer and information sciences; I.2.6; Machine Learning cs.LG; Machine Learning stat.ML
URL: https://arxiv.org/abs/1809.01534 ; https://dx.doi.org/10.48550/arxiv.1809.01534
BASE
32 |
Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search
In: Khwileh, Ahmad; Afli, Haithem (orcid:0000-0002-7449-4707); Jones, Gareth J.F. (orcid:0000-0003-2923-8365); Way, Andy (orcid:0000-0001-5736-5930). Third Arabic Natural Language Processing Workshop (WANLP), 3 Apr 2017, Valencia, Spain (2017)
BASE
33 |
Universal Dependencies 2.1
In: https://hal.inria.fr/hal-01682188 (2017)
BASE
36 |
Universal Dependencies 2.0 – CoNLL 2017 Shared Task Development and Test Data
BASE
38 |
Low Resourced Machine Translation via Morpho-syntactic Modeling: The Case of Dialectal Arabic ...
BASE
39 |
CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
BASE
40 |
Optimizing Tokenization Choice for Machine Translation across Multiple Target Languages
In: Prague Bulletin of Mathematical Linguistics, Vol 108, Iss 1, Pp 257-269 (2017)
BASE