1 |
SenZi: A sentiment analysis lexicon for the latinised Arabic (Arabizi)
BASE
3 |
Identifying the Authors’ National Variety of English in Social Media Text
BASE
4 |
Enabling Medical Translation for Low-Resource Languages ...
Abstract:
We present research towards bridging the language gap between migrant workers in Qatar and medical staff. In particular, we present the first steps towards the development of a real-world Hindi-English machine translation system for doctor-patient communication. As this is a low-resource language pair, especially for speech and for the medical domain, our initial focus has been on gathering suitable training data from various sources. We applied a variety of methods ranging from fully automatic extraction from the Web to manual annotation of test data. Moreover, we developed a method for automatically augmenting the training data with synthetically generated variants, which yielded a very sizable improvement of more than 3 BLEU points absolute. ...
CICLING-2016: 17th International Conference on Intelligent Text Processing and Computational Linguistics. Keywords: Machine Translation, medical translation, doctor-patient communication, resource-poor languages, Hindi ...
Keyword:
Computation and Language (cs.CL); FOS: Computer and information sciences; I.2.7
URL: https://arxiv.org/abs/1610.02633 https://dx.doi.org/10.48550/arxiv.1610.02633
BASE
7 |
Sublanguage corpus analysis toolkit: a tool for assessing the representativeness and sublanguage characteristics of corpora
BASE
8 |
Sublanguage Corpus Analysis Toolkit: A tool for assessing the representativeness and sublanguage characteristics of corpora
BASE
12 |
Multilingual person name recognition and transliteration ...
BASE