1 |
Automatic Normalisation of Early Modern French
In: https://hal.inria.fr/hal-03540226 ; 2022 (2022)
BASE
2 |
From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French
In: Proceedings of the 13th Language Resources and Evaluation Conference ; https://hal.inria.fr/hal-03596653 ; Proceedings of the 13th Language Resources and Evaluation Conference, European Language Resources Association, Jun 2022, Marseille, France (2022)
3 |
Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task?
In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 ; https://hal.inria.fr/hal-03243380 ; Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Aug 2021, Bangkok, Thailand (2021)
Abstract:
Cognate prediction is the task of generating, in a given language, the likely cognates of words in a related language, where cognates are words in related languages that have evolved from a common ancestor word. It is a task for which little data exists and which can aid linguists in the discovery of previously undiscovered relations. Previous work has applied machine translation (MT) techniques to this task, based on the tasks' similarities, without, however, studying their numerous differences or optimising architectural choices and hyper-parameters. In this paper, we investigate whether cognate prediction can benefit from insights from low-resource MT. We first compare statistical MT (SMT) and neural MT (NMT) architectures in a bilingual setup. We then study the impact of employing data augmentation techniques commonly seen to give gains in low-resource MT: monolingual pretraining, backtranslation and multilinguality. Our experiments on several Romance languages show that cognate prediction behaves only to a certain extent like a standard low-resource MT task. In particular, MT architectures, both statistical and neural, can be successfully used for the task, but using supplementary monolingual data is not always as beneficial as using additional language data, contrary to what is observed for MT.
Keyword:
Computer Science [cs] / Computation and Language [cs.CL]
URL: https://hal.inria.fr/hal-03243380 https://hal.inria.fr/hal-03243380/document https://hal.inria.fr/hal-03243380/file/Is_Cognate_Prediction_a_Low_Resource_Machine_Translation_Task__ACL2021Findings-2.pdf
4 |
Few-shot learning through contextual data augmentation
In: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03121971 ; EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Apr 2021, Kiev / Virtual, Ukraine (2021)
5 |
Variation graphique dans les documents d'Ancien Régime : Nouvelles approches scriptométriques
In: Journée d’étude : « Pour une histoire de la langue ‘par en bas’: textes privés et variation des langues dans le passé » ; https://hal.inria.fr/hal-03357080 ; Journée d’étude : « Pour une histoire de la langue ‘par en bas’: textes privés et variation des langues dans le passé », Sep 2021, Paris, France (2021)
6 |
[Book Review] Understanding Dialogue: Language Use and Social Interaction
In: ISSN: 0891-2017 ; EISSN: 1530-9312 ; Computational Linguistics ; https://hal.inria.fr/hal-03324500 ; Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), In press (2021)
7 |
Expanding the content model of annotationBlock
In: Next Gen TEI, 2021 - TEI Conference and Members’ Meeting ; https://hal.archives-ouvertes.fr/hal-03380805 ; Next Gen TEI, 2021 - TEI Conference and Members’ Meeting, Oct 2021, Virtual, United States (2021)
8 |
Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task? ...
10 |
Document Sub-structure in Neural Machine Translation
In: Proceedings of the 12th Language Resources and Evaluation Conference ; 12th Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-02900568 ; 12th Language Resources and Evaluation Conference, 2020, Marseille, France. pp.3657-3667 (2020)
11 |
DiaBLa: A Corpus of Bilingual Spontaneous Written Dialogues for Machine Translation
In: ISSN: 1574-020X ; EISSN: 1574-0218 ; Language Resources and Evaluation ; https://hal.inria.fr/hal-03021633 ; Language Resources and Evaluation, Springer Verlag, 2020, ⟨10.1007/s10579-020-09514-4⟩ (2020)
12 |
Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media
In: Proceedings of the 1st International Workshop on Language Technology Platforms ; 1st International Workshop on Language Technology Platforms ; https://hal.archives-ouvertes.fr/hal-02900633 ; 1st International Workshop on Language Technology Platforms, 2020, Marseille, France. pp.16-21 (2020)
13 |
ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT'20 Metrics Shared Task
In: Proceedings of the 5th Conference on Machine Translation ; 5th Conference on Machine Translation ; https://hal.archives-ouvertes.fr/hal-02981143 ; 5th Conference on Machine Translation, Nov 2020, Online, Unknown Region (2020)
14 |
The University of Edinburgh’s English-Tamil and English-Inuktitut Submissions to the WMT20 News Translation Task
In: Proceedings of the 5th Conference on Machine Translation ; 5th Conference on Machine Translation ; https://hal.archives-ouvertes.fr/hal-02981153 ; 5th Conference on Machine Translation, Nov 2020, Online, Unknown Region (2020)
15 |
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages
In: Proceedings of the 5th Conference on Machine Translation ; 5th Conference on Machine Translation ; https://hal.inria.fr/hal-02986356 ; 5th Conference on Machine Translation, 2020, Online, Unknown Region (2020)
16 |
The University of Edinburgh-Uppsala University’s Submission to the WMT 2020 Chat Translation Task
In: Proceedings of the 5th Conference on Machine Translation ; 5th Conference on Machine Translation ; https://hal.archives-ouvertes.fr/hal-02981159 ; 5th Conference on Machine Translation, Nov 2020, Online, Unknown Region (2020)
17 |
Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media ...
18 |
Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media ...
19 |
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages
In: Fraunhofer FOKUS ; Fraunhofer IBMT (2020)
20 |
The University of Edinburgh’s Submissions to the WMT19 News Translation Task
In: 4th Conference on Machine Translation ; https://hal.inria.fr/hal-02986330 ; 4th Conference on Machine Translation, 2019, Florence, Italy (2019)