1 |
Automatic Normalisation of Early Modern French
|
|
|
|
In: https://hal.inria.fr/hal-03540226 ; 2022 (2022)
|
|
|
|
2 |
From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French
|
|
|
|
In: Proceedings of the 13th Language Resources and Evaluation Conference ; https://hal.inria.fr/hal-03596653 ; Proceedings of the 13th Language Resources and Evaluation Conference, European Language Resources Association, Jun 2022, Marseille, France (2022)
|
|
|
|
3 |
Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task?
|
|
|
|
In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 ; https://hal.inria.fr/hal-03243380 ; Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Aug 2021, Bangkok, Thailand (2021)
|
|
|
|
4 |
Few-shot learning through contextual data augmentation
|
|
|
|
In: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03121971 ; EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Apr 2021, Kiev / Virtual, Ukraine (2021)
|
|
|
|
5 |
Variation graphique dans les documents d'Ancien Régime : Nouvelles approches scriptométriques
|
|
|
|
In: Journée d’étude : « Pour une histoire de la langue ‘par en bas’: textes privés et variation des langues dans le passé » ; https://hal.inria.fr/hal-03357080 ; Journée d’étude : « Pour une histoire de la langue ‘par en bas’: textes privés et variation des langues dans le passé », Sep 2021, Paris, France (2021)
|
|
|
|
6 |
[Book Review] Understanding Dialogue: Language Use and Social Interaction
|
|
|
|
In: ISSN: 0891-2017 ; EISSN: 1530-9312 ; Computational Linguistics ; https://hal.inria.fr/hal-03324500 ; Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), In press (2021)
|
|
|
|
7 |
Expanding the content model of annotationBlock
|
|
|
|
In: Next Gen TEI, 2021 - TEI Conference and Members’ Meeting ; https://hal.archives-ouvertes.fr/hal-03380805 ; Next Gen TEI, 2021 - TEI Conference and Members’ Meeting, Oct 2021, Virtual, United States (2021)
|
|
|
|
8 |
Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task? ...
|
|
|
|
|
|
10 |
Document Sub-structure in Neural Machine Translation
|
|
|
|
In: Proceedings of the 12th Language Resources and Evaluation Conference ; 12th Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-02900568 ; 12th Language Resources and Evaluation Conference, 2020, Marseille, France. pp.3657-3667 (2020)
|
|
Abstract:
Current approaches to machine translation (MT) either translate sentences in isolation, disregarding the context they appear in, or model context at the level of the full document, without any notion of the internal structure the document may have. In this work we consider the fact that documents are rarely homogeneous blocks of text, but rather consist of parts covering different topics. Some documents, such as biographies and encyclopedia entries, have highly predictable, regular structures in which sections are characterised by different topics. We draw inspiration from Louis and Webber (2014), who use this information to improve statistical MT, and transfer their proposal into the framework of neural MT. We compare two different methods of including information about the topic of the section within which each sentence is found: one using side constraints and the other using a cache-based model. We create and release the data on which we run our experiments: parallel corpora for three language pairs (Chinese-English, French-English, Bulgarian-English) from Wikipedia biographies, which we extract automatically, preserving the boundaries of sections within the articles.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; context; corpus creation; document structure; machine translation; parallel corpus; Wikipedia
|
|
URL: https://hal.archives-ouvertes.fr/hal-02900568 https://hal.archives-ouvertes.fr/hal-02900568/document https://hal.archives-ouvertes.fr/hal-02900568/file/LREC_submission.pdf
|
|
|
|
11 |
DiaBLa: A Corpus of Bilingual Spontaneous Written Dialogues for Machine Translation
|
|
|
|
In: ISSN: 1574-020X ; EISSN: 1574-0218 ; Language Resources and Evaluation ; https://hal.inria.fr/hal-03021633 ; Language Resources and Evaluation, Springer Verlag, 2020, ⟨10.1007/s10579-020-09514-4⟩ (2020)
|
|
|
|
12 |
Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media
|
|
|
|
In: Proceedings of the 1st International Workshop on Language Technology Platforms ; 1st International Workshop on Language Technology Platforms ; https://hal.archives-ouvertes.fr/hal-02900633 ; 1st International Workshop on Language Technology Platforms, 2020, Marseille, France. pp.16-21 (2020)
|
|
|
|
13 |
ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT'20 Metrics Shared Task
|
|
|
|
In: Proceedings of the 5th Conference on Machine Translation ; 5th Conference on Machine Translation ; https://hal.archives-ouvertes.fr/hal-02981143 ; 5th Conference on Machine Translation, Nov 2020, Online (2020)
|
|
|
|
14 |
The University of Edinburgh’s English-Tamil and English-Inuktitut Submissions to the WMT20 News Translation Task
|
|
|
|
In: Proceedings of the 5th Conference on Machine Translation ; 5th Conference on Machine Translation ; https://hal.archives-ouvertes.fr/hal-02981153 ; 5th Conference on Machine Translation, Nov 2020, Online (2020)
|
|
|
|
15 |
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages
|
|
|
|
In: Proceedings of the 5th Conference on Machine Translation ; 5th Conference on Machine Translation ; https://hal.inria.fr/hal-02986356 ; 5th Conference on Machine Translation, 2020, Online (2020)
|
|
|
|
16 |
The University of Edinburgh-Uppsala University’s Submission to the WMT 2020 Chat Translation Task
|
|
|
|
In: Proceedings of the 5th Conference on Machine Translation ; 5th Conference on Machine Translation ; https://hal.archives-ouvertes.fr/hal-02981159 ; 5th Conference on Machine Translation, Nov 2020, Online (2020)
|
|
|
|
17 |
Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media ...
|
|
|
|
|
|
18 |
Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media ...
|
|
|
|
|
|
19 |
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages
|
|
|
|
In: Fraunhofer FOKUS ; Fraunhofer IBMT (2020)
|
|
|
|
20 |
The University of Edinburgh’s Submissions to the WMT19 News Translation Task
|
|
|
|
In: 4th Conference on Machine Translation ; https://hal.inria.fr/hal-02986330 ; 4th Conference on Machine Translation, 2019, Florence, Italy (2019)
|
|
|
|
|
|