
Search in the Catalogues and Directories

Hits 1 – 13 of 13

1
Automatic Normalisation of Early Modern French
In: https://hal.inria.fr/hal-03540226 ; 2022 (2022)
BASE
2
From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French
In: Proceedings of the 13th Language Resources and Evaluation Conference ; https://hal.inria.fr/hal-03596653 ; Proceedings of the 13th Language Resources and Evaluation Conference, European Language Resources Association, Jun 2022, Marseille, France (2022)
Abstract: 8 pages, 2 figures, 4 tables ; International audience ; Language models for historical states of language are becoming increasingly important to allow the optimal digitisation and analysis of old textual sources. Because these historical states are at the same time more complex to process and more scarce in the corpora available, specific efforts are necessary to train natural language processing (NLP) tools adapted to the data. In this paper, we present our efforts to develop NLP tools for Early Modern French (historical French from the 16th to the 18th centuries). We present the FreEMmax corpus of Early Modern French and D'AlemBERT, a RoBERTa-based language model trained on FreEMmax. We evaluate the usefulness of D'AlemBERT by fine-tuning it on a part-of-speech tagging task, outperforming previous work on the test set. Importantly, we find evidence for the transfer learning capacity of the language model, since its performance on lesser-resourced time periods appears to have been boosted by the more resourced ones. We release D'AlemBERT and the open-sourced subpart of the FreEMmax corpus.
Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Corpus creation; Création de corpus; Digital humanities; Early Modern French; Français classique; Humanités Numériques; Language modelling; Langues peu dotées; Less-resourced languages; Modèle de langue neuronal; Modélisation linguistique; Neural language representation models; Partie du discours; POS tagging
URL: https://hal.inria.fr/hal-03596653
BASE
3
A dataset for automatic detection of places in (early) modern French texts ; Un jeu de données pour la détection automatique de lieux dans les textes français modernes
In: NASSCFL 2021 - 50th Annual North American Society for Seventeenth-Century French Literature Conference ; https://hal.archives-ouvertes.fr/hal-03187097 ; NASSCFL 2021 - 50th Annual North American Society for Seventeenth-Century French Literature Conference, NASSCFL, May 2021, Iowa City / Virtual, United States. pp.5 (2021)
BASE
4
Lemmatiser des textes et corriger l'annotation grâce à l'apprentissage profond avec Pyrrha
In: Humanistica 2021 ; https://hal.archives-ouvertes.fr/hal-03224112 ; Humanistica 2021, May 2021, Rennes, France (2021)
BASE
5
Variation graphique dans les documents d'Ancien Régime : Nouvelles approches scriptométriques
In: Journée d’étude : « Pour une histoire de la langue ‘par en bas’: textes privés et variation des langues dans le passé » ; https://hal.inria.fr/hal-03357080 ; Journée d’étude : « Pour une histoire de la langue ‘par en bas’: textes privés et variation des langues dans le passé », Sep 2021, Paris, France (2021)
BASE
6
Beyond Idiolectometry? On Racine's Stylometric Signature ; Au-delà de l'idiolectométrie? Sur la signature stylométrique de Racine
In: Proceedings of the Conference on Computational Humanities Research 2021 ; Conference on Computational Humanities Research 2021 ; https://hal.archives-ouvertes.fr/hal-03402994 ; Conference on Computational Humanities Research 2021, Nov 2021, Amsterdam, Netherlands ; http://ceur-ws.org/Vol-2989 (2021)
BASE
7
Expanding the content model of annotationBlock
In: Next Gen TEI, 2021 - TEI Conference and Members’ Meeting ; https://hal.archives-ouvertes.fr/hal-03380805 ; Next Gen TEI, 2021 - TEI Conference and Members’ Meeting, Oct 2021, Virtual, United States (2021)
BASE
8
Guidelines for linguistic annotation of modern French (16th-18th c.) ; Manuel d'annotation linguistique pour le français moderne (XVIe-XVIIIe siècles)
In: https://hal.archives-ouvertes.fr/hal-02571190 ; 2020 (2020)
BASE
9
Standardizing linguistic data: method and tools for annotating (pre-orthographic) French ; Standardiser les données linguistiques: méthodes et outils pour l'annotation du français (pré-orthographique)
In: Proceedings of the 2nd International Digital Tools & Uses Congress (DTUC '20) ; https://hal.archives-ouvertes.fr/hal-03018381 ; Proceedings of the 2nd International Digital Tools & Uses Congress (DTUC '20), Oct 2020, Hammamet, Tunisia. ⟨10.1145/3423603.3423996⟩ (2020)
BASE
10
Machine Translation for the Normalisation of 17th c. French ; Traduction automatique pour la normalisation du français du XVIIe siècle
In: TALN 2020 ; https://hal.archives-ouvertes.fr/hal-02596669 ; TALN 2020, ATALA, Jun 2020, Nancy, France (2020)
BASE
11
A linguistic introduction for machine learning data? ; Une introduction linguistique pour les données de Machine learning?
In: Humanistica 2020 ; https://hal.archives-ouvertes.fr/hal-02619356 ; Humanistica 2020, Humanistica, May 2020, Bordeaux, France ; http://www.humanisti.ca/colloque2020/ (2020)
BASE
12
La naissance de Marie-Blanche de Grignan. Notes sur la mise en page de la polyphonie sévignéenne
In: ISSN: 2496-5731 ; Acta Litt&Arts [En ligne] ; https://hal.archives-ouvertes.fr/hal-01900042 ; Acta Litt&Arts [En ligne], Grenoble: Université Grenoble Alpes, 2020, Les discours rapportés en contexte épistolaire (XVIe-XVIIIe siècles), http://ouvroir-litt-arts.univ-grenoble-alpes.fr/revues/actalittarts/616 (2020)
BASE
13
CORPUS17: a philological corpus for 17th c. French ; CORPUS17: un corpus philologique pour le 17ème siècle français
In: Proceedings of the 2nd International Digital Tools & Uses Congress (DTUC ’20) ; https://hal.archives-ouvertes.fr/hal-03041871 ; Proceedings of the 2nd International Digital Tools & Uses Congress (DTUC ’20), Oct 2020, Hammamet, Tunisia. ⟨10.1145/3423603.3424002⟩ (2020)
BASE
