1 |
Learning the Ordering of Coordinate Compounds and Elaborate Expressions in Hmong, Lahu, and Chinese ...
2 |
AUTOLEX: An Automatic Framework for Linguistic Exploration ...
3 |
Evaluating the Morphosyntactic Well-formedness of Generated Texts ...
4 |
Tusom2021: A Phonetically Transcribed Speech Dataset from an Endangered Language for Universal Phone Recognition Experiments ...
5 |
Evaluating the Morphosyntactic Well-formedness of Generated Texts ...
6 |
Differentiable Allophone Graphs for Language-Universal Speech Recognition ...
8 |
Towards Zero-shot Learning for Automatic Phonemic Transcription ...
9 |
Automatic Extraction of Rules Governing Morphological Agreement ...
10 |
Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods ...
11 |
Universal Phone Recognition with a Multilingual Allophone System ...
12 |
Cross-Cultural Similarity Features for Cross-Lingual Transfer Learning of Pragmatically Motivated Tasks ...
13 |
Characterizing Sociolinguistic Variation in the Competing Vaccination Communities ...
14 |
Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods
In: Proceedings of the Society for Computation in Linguistics (2020)
15 |
Using Interlinear Glosses as Pivot in Low-Resource Multilingual Machine Translation ...
Abstract:
We demonstrate a new approach to Neural Machine Translation (NMT) for low-resource languages using a ubiquitous linguistic resource, Interlinear Glossed Text (IGT). IGT represents a non-English sentence as a sequence of English lemmas and morpheme labels. As such, it can serve as a pivot or interlingua for NMT. Our contribution is four-fold. Firstly, we pool IGT for 1,497 languages in ODIN (54,545 glosses) and 70,918 glosses in Arapaho and train a gloss-to-target NMT system from IGT to English, with a BLEU score of 25.94. We introduce a multilingual NMT model that tags all glossed text with gloss-source language tags and train a universal system with shared attention across 1,497 languages. Secondly, we use the IGT gloss-to-target translation as a key step in an English-Turkish MT system trained on only 865 lines from ODIN. Thirdly, we present five metrics for evaluating extremely low-resource translation when BLEU is no longer sufficient and evaluate the Turkish low-resource system using BLEU and also ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.1911.02709 https://arxiv.org/abs/1911.02709
16 |
Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations ...
17 |
Lexical Prefixes and Tibeto-Burman Laryngeal Contrasts
In: Mortensen, David R. (2013). Lexical Prefixes and Tibeto-Burman Laryngeal Contrasts. Proceedings of the 37th Annual Meeting of the Berkeley Linguistics Society, 37(37), 272 - 286. Retrieved from: http://www.escholarship.org/uc/item/1229x8bj (2013)
20 |
Lexical prefixes and Tibeto-Burman laryngeal contrasts
In: Annual Meeting of the Berkeley Linguistics Society; BLS 37: General Session and Parasession on Language, Gender, and Sexuality; 272-286 ; 2377-1666 ; 0363-2946 (2011)