Page: 1 2 3 4 5 6 7 8 9... 13
81 |
Navegación de corpus a través de anotaciones lingüísticas automáticas obtenidas por Procesamiento del Lenguaje Natural: de anecdótico a ecdótico
|
|
|
|
In: Revista de Humanidades Digitales, vol. 4, pp. 136 (2019)
|
|
BASE
|
|
|
|
82 |
Identifying Semantic Divergences Across Languages
|
|
|
|
Abstract:
Cross-lingual resources such as parallel corpora and bilingual dictionaries are cornerstones of multilingual natural language processing (NLP). They have been used to study the nature of translation, to train automatic machine translation systems, and to transfer models across languages for an array of NLP tasks. However, the majority of work in cross-lingual and multilingual NLP assumes that translations recorded in these resources are semantically equivalent. This is often not the case: words and sentences that are considered to be translations of each other frequently diverge in meaning, often in systematic ways. In this thesis, we focus on such mismatches in meaning in text that we expect to be aligned across languages. We call such mismatches cross-lingual semantic divergences. The core claim of this thesis is that translation is not always meaning-preserving, which leads to cross-lingual semantic divergences that affect multilingual NLP tasks. Detecting such divergences requires ways of directly characterizing differences in meaning across languages through novel cross-lingual tasks, as well as models that account for translation ambiguity and do not rely on expensive, task-specific supervision. We support this claim through three main contributions. First, we show that a large fraction of data in multilingual resources (such as parallel corpora and bilingual dictionaries) is identified as semantically divergent by human annotators. Second, we introduce cross-lingual tasks that characterize differences in word meaning across languages by identifying the semantic relation between two words. We also develop methods to predict such semantic relations, as well as a model to predict whether sentences in different languages have the same meaning. Finally, we demonstrate the impact of divergences by applying the methods developed in the previous sections to two downstream tasks.
We first show that our model for identifying semantic relations between words helps to separate equivalent word translations from divergent translations in the context of bilingual dictionary induction, even when the two words are close in meaning. We also show that identifying and filtering semantic divergences in parallel data helps to train a neural machine translation system twice as fast without sacrificing quality.
|
|
Keywords:
Computer science; lexical semantics; linguistics; machine learning; machine translation; multilingual NLP; natural language processing
|
|
URL: http://hdl.handle.net/1903/25448 ; https://doi.org/10.13016/rymp-ymgo
|
|
BASE
|
|
|
|
83 |
A Combined CNN and LSTM Model for Arabic Sentiment Analysis
|
|
|
|
In: Lecture Notes in Computer Science ; 2nd International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE) ; https://hal.inria.fr/hal-02060041 ; 2nd International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE), Aug 2018, Hamburg, Germany. pp. 179-191, ⟨10.1007/978-3-319-99740-7_12⟩ (2018)
|
|
BASE
|
|
|
|
84 |
Dialog Acts Annotations for Online Chats ; Annotation en Actes de Dialogue pour les Conversations d’Assistance en Ligne
|
|
|
|
In: Actes TALN-RECITAL 2018 ; 25e conférence sur le Traitement Automatique des Langues Naturelles (TALN) ; https://hal.archives-ouvertes.fr/hal-01943345 ; 25e conférence sur le Traitement Automatique des Langues Naturelles (TALN), 2018, Rennes, France (2018)
|
|
BASE
|
|
|
|
85 |
Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2018, Avignon, France)
|
|
|
|
In: ISSN: 0302-9743 ; Lecture Notes in Computer Science ; 9th International Conference of the CLEF Association (CLEF 2018) ; https://hal.archives-ouvertes.fr/hal-03044243 ; Bellot, Patrice; Trabelsi, Chiraz; Mothe, Josiane; Murtagh, Fionn; Nie, Jian-Yun; Soulier, Laure; Sanjuan, Eric; Cappellato, Linda; Ferro, Nicola. 9th International Conference of the CLEF Association (CLEF 2018), Sep 2018, Avignon, France. Lecture Notes in Computer Science, Springer Berlin / Heidelberg; Springer, 2018, Experimental IR Meets Multilinguality, Multimodality, and Interaction, 978-3-319-98931-0. ⟨10.1007/978-3-319-98932-7⟩ ; https://link.springer.com/book/10.1007%2F978-3-319-98932-7 (2018)
|
|
BASE
|
|
|
|
86 |
From Emoji Usage to Categorical Emoji Prediction
|
|
|
|
In: 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLING 2018) ; https://hal-amu.archives-ouvertes.fr/hal-01871045 ; 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLING 2018), Mar 2018, Hanoï, Vietnam ; https://www.cicling.org/2018/ (2018)
|
|
BASE
|
|
|
|
88 |
Prediction of Psychosis Using Big Web Data in the United States
|
|
|
|
In: http://rave.ohiolink.edu/etdc/view?acc_num=kent1532962079970169 (2018)
|
|
BASE
|
|
|
|
89 |
An Empirical Study of Word Embedding Dimensionality Reduction ...
|
|
|
|
BASE
|
|
|
|
90 |
An Empirical Study of Word Embedding Dimensionality Reduction ...
|
|
|
|
BASE
|
|
|
|
93 |
Data-Driven Language Understanding for Spoken Dialogue Systems ...
|
|
|
|
BASE
|
|
|
|
94 |
ОБЛАЧНЫЕ СЕРВИСЫ ДЛЯ ОБРАБОТКИ ТЕКСТОВ НА ЕСТЕСТВЕННОМ ЯЗЫКЕ ... : CLOUD SERVICES FOR NATURAL LANGUAGE PROCESSING ...
|
|
|
|
BASE
|
|
|
|
96 |
Automatic Annotation And Retrieval System (Ilars) For Enhancing Organizational E-Learning ...
|
|
|
|
BASE
|
|
|
|
97 |
Automatic Annotation And Retrieval System (Ilars) For Enhancing Organizational E-Learning ...
|
|
|
|
BASE
|
|
|
|
98 |
Proposition-based summarization with a coherence-driven incremental model
|
|
Fang, Yimai. University of Cambridge, Computer Science and Technology, Hughes Hall, 2018
|
|
BASE
|
|
|
|
99 |
NLP Corpus Observatory – Looking for Constellations in Parallel Corpora to Improve Learners’ Collocational Skills
|
|
|
|
In: Schneider, Gerold; Graën, Johannes (2018). NLP Corpus Observatory – Looking for Constellations in Parallel Corpora to Improve Learners’ Collocational Skills. In: 7th Workshop on NLP for Computer Assisted Language Learning at SLTC 2018 (NLP4CALL 2018), Stockholm, 7 November 2018, pp. 69-78. (2018)
|
|
BASE
|
|
|
|
100 |
Simple Convolutional Neural Networks with Linguistically-Annotated Input for Answer Selection in Question Answering
|
|
|
|
BASE
|
|
|
|