Hits 1 – 20 of 106

1	SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding ...
	Madabushi, Harish Tayyar; Gow-Smith, Edward; Garcia, Marcos. - : arXiv, 2022
	BASE

2	Improving Tokenisation by Alternative Treatment of Spaces ...
	Gow-Smith, Edward; Madabushi, Harish Tayyar; Scarton, Carolina. - : arXiv, 2022
	BASE

3	Investigating alignment interpretability for low-resource NMT [<Journal>]
	Boito, Marcely Zanon [author]; Villavicencio, Aline [author]; Besacier, Laurent [author]
	DNB Subject Category Language

4	Investigating alignment interpretability for low-resource NMT
	Zanon Boito, Marcely; Villavicencio, Aline; Besacier, Laurent
	In: ISSN: 0922-6567 ; EISSN: 1573-0573 ; Machine Translation ; https://hal.archives-ouvertes.fr/hal-03139744 ; Machine Translation, Springer Verlag, 2021, ⟨10.1007/s10590-020-09254-w⟩ (2021)
	Abstract: The attention mechanism in Neural Machine Translation (NMT) models added flexibility to translation systems, and the possibility to visualize soft-alignments between source and target representations. While there is much debate about the relationship between attention and the yielded output for neural models [26, 35, 43, 38], in this paper we propose a different assessment, investigating soft-alignment interpretability in low-resource scenarios. We experimented with different architectures (RNN [5], 2D-CNN [15], and Transformer [39]), comparing them with regard to their ability to produce directly exploitable alignments. For evaluating exploitability, we replicated the Unsupervised Word Segmentation (UWS) task from Godard et al. [22]. There, source words are translated into unsegmented phone sequences. After training, the resulting soft-alignments are used for producing segmentation over the target side. Our results showed that an RNN-based NMT model produced the most exploitable alignments in this scenario. We then investigated methods for increasing its UWS scores by comparing the following methodologies: monolingual pre-training, input representation augmentation (hybrid model), and explicit word length optimization during training. We reached the best results by using the hybrid model, which uses an intermediate monolingual-rooted segmentation from a non-parametric Bayesian model [25] to enrich the input representation before training.
	Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO]Computer Science [cs]; attention mechanism; computational language documentation; low-resource languages; neural machine translation; sequence-to-sequence models; unsupervised word segmentation
	URL: https://doi.org/10.1007/s10590-020-09254-w https://hal.archives-ouvertes.fr/hal-03139744
	BASE

5	AStitchInLanguageModels: Dataset and Methods for the Exploration of Idiomaticity in Pre-Trained Language Models ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; ., Carolina; Gow-Smith, Edward. - : Underline Science Inc., 2021
	BASE

6	Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings ...
	Boito, Marcely Zanon; Yusuf, Bolaji; Ondel, Lucas. - : arXiv, 2021
	BASE

7	Assessing the Representations of Idiomaticity in Vector Models with a Noun Compound Dataset Labeled at Type and Token Levels ...
	The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing 2021; ., Carolina; Garcia, Marcos. - : Underline Science Inc., 2021
	BASE

8	The Role of negative information when learning dense word vectors ; O papel da informação negativa na aprendizagem de vetores palavra densos
	Salle, Alexandre Tadeu. - 2021
	BASE

9	Investigating Language Impact in Bilingual Approaches for Computational Language Documentation
	Zanon Boito, Marcely; Villavicencio, Aline; Besacier, Laurent
	In: Proceedings of the 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020), ; SLTU-CCURL workshop, LREC 2020 ; https://hal.archives-ouvertes.fr/hal-02895907 ; SLTU-CCURL workshop, LREC 2020, May 2020, Marseille, France (2020)
	BASE

10	Annotated corpora and tools of the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2)
	Ramisch, Carlos; Guillaume, Bruno; Savary, Agata. - : PARSEME, 2020
	BASE

11	Investigating Language Impact in Bilingual Approaches for Computational Language Documentation ...
	Boito, Marcely Zanon; Villavicencio, Aline; Besacier, Laurent. - : arXiv, 2020
	BASE

12	Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings
	Zanon Boito, Marcely; Villavicencio, Aline; Besacier, Laurent
	In: Interspeech 2019 ; https://hal.archives-ouvertes.fr/hal-02193867 ; Interspeech 2019, Sep 2019, Graz, Austria (2019)
	BASE

13	Unsupervised Compositionality Prediction of Nominal Compounds
	Cordeiro, Silvio; Villavicencio, Aline; Idiart, Marco...
	In: ISSN: 0891-2017 ; EISSN: 1530-9312 ; Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-02318196 ; Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2019, 45 (1), pp.1-57. ⟨10.1162/coli_a_00341⟩ (2019)
	BASE

14	How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages
	Zanon Boito, Marcely; Villavicencio, Aline; Besacier, Laurent
	In: Journées Scientifiques du Groupement de Recherche: Linguistique Informatique, Formelle et de Terrain (LIFT). ; https://hal.archives-ouvertes.fr/hal-02895895 ; Journées Scientifiques du Groupement de Recherche: Linguistique Informatique, Formelle et de Terrain (LIFT)., Nov 2019, Orléans, France (2019)
	BASE

15	How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages ...
	Boito, Marcely Zanon; Villavicencio, Aline; Besacier, Laurent. - : arXiv, 2019
	BASE

16	CogniVal: A Framework for Cognitive Word Embedding Evaluation
	de la Torre, Antonio; Langer, Nicolas; Zhang, Ce...
	In: Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL) (2019)
	BASE

17	Unsupervised Compositionality Prediction of Nominal Compounds
	Cordeiro, Silvio; Villavicencio, Aline; Idiart, Marco. - : MIT Press, 2019
	BASE

18	A small Griko-Italian speech translation corpus
	Zanon Boito, Marcely; Anastasopoulos, Antonios; Lekakou, Marika...
	In: 6th international workshop on spoken language technologies for under-resourced languages (SLTU'18) ; https://hal.archives-ouvertes.fr/hal-01962528 ; 6th international workshop on spoken language technologies for under-resourced languages (SLTU'18), Aug 2018, New Delhi, India (2018)
	BASE

19	Unsupervised Word Segmentation from Speech with Attention
	Godard, Pierre; Zanon Boito, Marcely; Ondel, Lucas...
	In: Interspeech 2018 ; https://hal.archives-ouvertes.fr/hal-01818092 ; Interspeech 2018, Sep 2018, Hyderabad, India (2018)
	BASE

20	Language, Cognition, and Computational Models
	Poibeau, Thierry; Villavicencio, Aline. - : HAL CCSD, 2018. : Cambridge University Press, 2018
	In: https://hal.archives-ouvertes.fr/hal-01722351 ; Cambridge University Press, 2018 ; https://www.cambridge.org/core/books/language-cognition-and-computational-models/90CC7DBA6CADB1FE361266D311CB4413 (2018)
	BASE

