1 |
Machine translation of user-generated content
|
|
Lohar, Pintu. Dublin City University, School of Computing, 2020; Dublin City University, ADAPT, 2020
|
|
In: Lohar, Pintu (2020) Machine translation of user-generated content. PhD thesis, Dublin City University. (2020)
|
|
2 |
FooTweets: a bilingual parallel corpus of World Cup tweets
|
|
|
|
In: Sluyter-Gäthje, Henny, Lohar, Pintu, Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2018) FooTweets: a bilingual parallel corpus of World Cup tweets. In: LREC 2018 - 11th International Conference on Language Resources and Evaluation, 7-12 May 2018, Miyazaki, Japan. ISBN 979-10-95546-00-9 (2018)
|
|
3 |
Balancing translation quality and sentiment preservation
|
|
|
|
In: Lohar, Pintu, Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2018) Balancing translation quality and sentiment preservation. In: AMTA 2018, 17-21 Mar 2018, Boston, MA, USA. (2018)
|
|
Abstract:
Social media platforms such as Twitter and Facebook are hugely popular websites through which Internet users can communicate and spread information worldwide. On Twitter, messages (tweets) are generated by users from all over the world in many different languages. Tweets about different events almost always encode some degree of sentiment. As is often the case in the field of language processing, sentiment analysis tools exist primarily in English, so if we want to understand the sentiment of the original tweets, we are forced to translate them from the source language into English and push the English translations through a sentiment analysis tool. However, Lohar et al. (2017) demonstrated that using freely available translation tools often caused the sentiment encoded in the original tweet to be altered. As a consequence, they built a series of sentiment-specific translation engines and pushed tweets containing either positive, neutral or negative sentiment through the appropriate engine to improve sentiment preservation in the target language. For certain tasks, maintaining sentiment polarity in the target language during the translation process is arguably more important than the absolute translation quality obtained. In the work of Lohar et al. (2017), a small drop-off in translation quality per se was deemed tolerable. In this work, we focus on maintaining the level of sentiment preservation while trying to improve translation quality still further. We propose a nearest sentiment class combination method to extend the existing sentiment-specific translation systems by adding training data from the nearest-sentiment class. Our experimental results on German-to-English reveal that our approach is capable of achieving a proper balance between translation quality and sentiment preservation.
|
|
Keyword:
German-to-English; Machine translating; sentiment polarity; social media
|
|
URL: http://doras.dcu.ie/23206/
|
|
4 |
Sentiment translation for low resourced languages: experiments on Irish general election Tweets
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Maguire, Sorcha and Way, Andy orcid:0000-0001-5736-5930 (2017) Sentiment translation for low resourced languages: experiments on Irish general election Tweets. In: 18th International Conference on Computational Linguistics and Intelligent Text Processing, 17-21 Apr 2017, Budapest, Hungary. (2017)
|
|
5 |
MultiNews: a web collection of an aligned multimodal and multilingual corpus
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Lohar, Pintu and Way, Andy orcid:0000-0001-5736-5930 (2017) MultiNews: a web collection of an aligned multimodal and multilingual corpus. In: Workshop on Curation and Applications of Parallel and Comparable Corpora, 27 Nov-1 Dec 2017, Taipei, Taiwan. ISBN 978-1-948087-05-6 (2017)
|
|
6 |
Maintaining sentiment polarity in translation of user-generated content
|
|
|
|
In: Lohar, Pintu, Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2017) Maintaining sentiment polarity in translation of user-generated content. Prague Bulletin of Mathematical Linguistics (108). pp. 73-84. ISSN 1804-0462 (2017)
|
|
7 |
Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search
|
|
|
|
In: Khwileh, Ahmad, Afli, Haithem orcid:0000-0002-7449-4707 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 and Way, Andy orcid:0000-0001-5736-5930 (2017) Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search. In: Third Arabic Natural Language Processing Workshop (WANLP), 3 Apr 2017, Valencia, Spain. (2017)
|
|
8 |
Dublin City University participation in the VTT track at TRECVid 2017
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Hu, Feiyan orcid:0000-0001-7451-6438 , Du, Jinhua orcid:0000-0002-3267-4881 , Cosgrove, Daniel, McGuinness, Kevin orcid:0000-0003-1336-6477 , O'Connor, Noel E. orcid:0000-0002-4033-9135 , Arazo Sánchez, Eric, Zhou, Jiang orcid:0000-0002-3067-8512 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2017) Dublin City University participation in the VTT track at TRECVid 2017. In: TRECVid workshop, 13-15 Nov 2017, Gaithersburg, Md., USA. (2017)
|
|
9 |
Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search
|
|
|
|
In: Khwileh, Ahmad, Afli, Haithem orcid:0000-0002-7449-4707 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 and Way, Andy orcid:0000-0001-5736-5930 (2017) Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search. In: Proceedings of The Third Arabic Natural Language Processing Workshop (WANLP), 3-4 Apr 2017, Valencia, Spain. (2017)
|
|
10 |
Maintaining Sentiment Polarity in Translation of User-Generated Content
|
|
|
|
In: Prague Bulletin of Mathematical Linguistics, Vol 108, Iss 1, Pp 73-84 (2017)
|
|
11 |
The ADAPT bilingual document alignment system at WMT16
|
|
|
|
In: Lohar, Pintu, Afli, Haithem orcid:0000-0002-7449-4707 , Liu, Chao-Hong orcid:0000-0002-1235-6026 and Way, Andy orcid:0000-0001-5736-5930 (2016) The ADAPT bilingual document alignment system at WMT16. In: First Conference on Machine Translation (WMT16), 11-12 Aug 2016, Berlin, Germany. (2016)
|
|
12 |
FaDA: fast document aligner using word embedding
|
|
|
|
In: Lohar, Pintu, Ganguly, Debasis orcid:0000-0003-0050-7138 , Afli, Haithem orcid:0000-0002-7449-4707 , Way, Andy orcid:0000-0001-5736-5930 and Jones, Gareth J.F. orcid:0000-0003-2923-8365 (2016) FaDA: fast document aligner using word embedding. Prague Bulletin of Mathematical Linguistics (106). pp. 169-179. ISSN 1804-0462 (2016)
|
|
13 |
Using SMT for OCR error correction of historical texts
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Qui, Zhengwei, Way, Andy orcid:0000-0001-5736-5930 and Sheridan, Páraic (2016) Using SMT for OCR error correction of historical texts. In: Tenth International Conference on Language Resources and Evaluation (LREC 2016), 23-28 May 2016, Portorož, Slovenia. ISBN 978-2-9517408-9-1 (2016)
|
|
14 |
Integrating optical character recognition and machine translation of historical documents
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2016) Integrating optical character recognition and machine translation of historical documents. In: COLING, the 26th International Conference on Computational Linguistics, 13-16 Dec 2016, Osaka, Japan. (2016)
|
|
15 |
From Arabic user-generated content to machine translation: integrating automatic error correction
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Aransa, Walid, Lohar, Pintu and Way, Andy orcid:0000-0001-5736-5930 (2016) From Arabic user-generated content to machine translation: integrating automatic error correction. In: 17th International Conference on Intelligent Text Processing and Computational Linguistics, 3–9 Apr 2016, Konya, Turkey. (2016)
|
|
16 |
Dublin City University and partners’ participation in the INS and VTT tracks at TRECVid 2016
|
|
|
|
In: Marsden, Mark, Mohedano, Eva, McGuinness, Kevin orcid:0000-0003-1336-6477 , Calafell, Andrea, Giró-i-Nieto, Xavier orcid:0000-0002-9935-5332 , O'Connor, Noel E. orcid:0000-0002-4033-9135 , Zhou, Jiang orcid:0000-0002-3067-8512 , Azevedo, Lucas, Daudert, Tobias, Davis, Brian, Hurlimann, Manuela, Afli, Haithem orcid:0000-0002-7449-4707 , Du, Jinhua, Ganguly, Debasis orcid:0000-0003-0050-7138 , Li, Wei B. orcid:0000-0001-7347-3501 , Way, Andy orcid:0000-0001-5736-5930 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2016) Dublin City University and partners’ participation in the INS and VTT tracks at TRECVid 2016. In: TRECVid Conference, 14-16 Nov 2016, Gaithersburg, Md., USA. (2016)
|
|
17 |
Dublin City University and Partners' participation in the INS and VTT Tracks at TRECVid 2016
|
|
|
|
18 |
FaDA: Fast Document Aligner using Word Embedding
|
|
|
|
In: Prague Bulletin of Mathematical Linguistics, Vol 106, Iss 1, Pp 169-179 (2016)
|
|
19 |
OCR Error Correction Using Statistical Machine Translation
|
|
|
|
In: 16th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2015), 2015, Cairo, Egypt. https://hal.archives-ouvertes.fr/hal-01433200 (2015)
|
|