22 |
Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling ...
|
|
|
|
Abstract:
Morphological tagging is challenging for morphologically rich languages due to the large target space and the need for more training data to minimize model sparsity. Dialectal variants of morphologically rich languages suffer more, as they tend to be noisier and have fewer resources. In this paper, we explore the use of multitask learning and adversarial training to address morphological richness and dialectal variation in the context of full morphological tagging. We use multitask learning for joint morphological modeling of the features within two dialects, and as a knowledge-transfer scheme for cross-dialectal modeling. We use adversarial training to learn dialect-invariant features that can help the knowledge-transfer scheme from the high-resource to the low-resource variant. We work with two dialectal variants as a case study: Modern Standard Arabic (high-resource "dialect") and Egyptian Arabic (low-resource dialect). Our models achieve state-of-the-art results for both. Furthermore, adversarial training provides ... : Accepted to ACL 2019 ...
|
|
Keywords:
Computation and Language cs.CL; FOS Computer and information sciences; Machine Learning cs.LG
|
|
URL: https://arxiv.org/abs/1910.12702 https://dx.doi.org/10.48550/arxiv.1910.12702
|
|
BASE
|
|
25 |
Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging ...
|
|
|
|
BASE
|
|
26 |
CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
|
|
|
|
In: 11th Language Resources and Evaluation Conference, May 2018, Miyazaki, Japan ; https://hal.inria.fr/hal-01786125 ; http://lrec2018.lrec-conf.org (2018)
|
|
BASE
|
|
27 |
Universal Dependencies 2.2
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-01930733 (2018)
|
|
BASE
|
|
28 |
MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction ...
|
|
|
|
BASE
|
|
31 |
Utilizing Character and Word Embeddings for Text Normalization with Sequence-to-Sequence Models ...
|
|
|
|
BASE
|
|
32 |
Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search
|
|
|
|
In: Khwileh, Ahmad; Afli, Haithem (orcid:0000-0002-7449-4707); Jones, Gareth J.F. (orcid:0000-0003-2923-8365); Way, Andy (orcid:0000-0001-5736-5930). Third Arabic Natural Language Processing Workshop (WANLP), 3 Apr 2017, Valencia, Spain. (2017)
|
|
BASE
|
|
33 |
Universal Dependencies 2.1
|
|
|
|
In: https://hal.inria.fr/hal-01682188 (2017)
|
|
BASE
|
|
36 |
Universal Dependencies 2.0 – CoNLL 2017 Shared Task Development and Test Data
|
|
|
|
BASE
|
|
38 |
Low Resourced Machine Translation via Morpho-syntactic Modeling: The Case of Dialectal Arabic ...
|
|
|
|
BASE
|
|
39 |
CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
|
|
|
|
BASE
|
|
40 |
Optimizing Tokenization Choice for Machine Translation across Multiple Target Languages
|
|
|
|
In: Prague Bulletin of Mathematical Linguistics, Vol 108, Iss 1, Pp 257-269 (2017)
|
|
BASE
|
|