1. Assessing the impact of OCR noise on multilingual event detection over digitised documents
In: ISSN: 1432-5012 ; EISSN: 1432-1300 ; International Journal on Digital Libraries ; https://hal.archives-ouvertes.fr/hal-03635985 ; International Journal on Digital Libraries, Springer Verlag, 2022, ⟨10.1007/s00799-022-00325-2⟩ (2022)
2. Assessing the Impact of OCR Noise on Multilingual Event Detection over Digitised Documents ...
3. Data set for "Token-Level Multilingual Epidemic Dataset for Event Extraction" ...
4. Assessing the Impact of OCR Noise on Multilingual Event Detection over Digitised Documents ...
5. L3i at SemEval-2022 Task 11: Straightforward Additional Context for Multilingual Named Entity Recognition ...
6. Data set for "Token-Level Multilingual Epidemic Dataset for Event Extraction" ...
7. L3i at SemEval-2022 Task 11: Straightforward Additional Context for Multilingual Named Entity Recognition ...
8. EMBEDDIA at SemEval-2022 Task 8: Investigating Sentence, Image, and Knowledge Graph Representations for Multilingual News Article Similarity ...
9. EMBEDDIA at SemEval-2022 Task 8: Investigating Sentence, Image, and Knowledge Graph Representations for Multilingual News Article Similarity ...
10. Intérêt des modèles de caractères pour la détection d'événements
In: Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale ; Traitement Automatique des Langues Naturelles ; https://hal.archives-ouvertes.fr/hal-03265875 ; Traitement Automatique des Langues Naturelles, 2021, Lille, France. pp.179-188 (2021)
11. L3i_LBPAM at the FinSim-2 task: Learning Financial Semantic Similarities with Siamese Transformers
In: WWW '21: Companion Proceedings of the Web Conference 2021 ; WWW '21: The Web Conference 2021 ; https://hal.sorbonne-universite.fr/hal-03256324 ; WWW '21: The Web Conference 2021, Apr 2021, Ljubljana (virtual), Slovenia. pp.302-306, ⟨10.1145/3442442.3451384⟩ (2021)
12. Atténuer les erreurs de numérisation dans la reconnaissance d'entités nommées pour les documents historiques
In: Conférence en Recherche d'Informations et Applications (CORIA 2021) ; https://hal.archives-ouvertes.fr/hal-03320332 ; Conférence en Recherche d'Informations et Applications (CORIA 2021), ARIA : Association Francophone de Recherche d’Information (RI) et Applications, Apr 2021, Grenoble (virtuel), France. pp.1 - 7 ; http://coria.asso-aria.org/2021/articles/mini_24/main.pdf (2021)
13. Multilingual Epidemic Event Extraction
In: Towards Open and Trustworthy Digital Societies. 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Virtual Event, December 1–3, 2021, Proceedings ; https://hal.archives-ouvertes.fr/hal-03480551 ; Hao-Ren Ke; Chei Sian Lee; Kazunari Sugiyama. Towards Open and Trustworthy Digital Societies. 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Virtual Event, December 1–3, 2021, Proceedings, 13133, Springer, pp.139-156, 2021, Lecture Notes in Computer Science, 978-3-030-91668-8. ⟨10.1007/978-3-030-91669-5_12⟩ (2021)
14. Étude comparative de méthodes de classification multilingue appliquées à l'épidémiologie
In: COnférence en Recherche d'Informations et Applications - CORIA 2021, French Information Retrieval Conference ; https://hal.archives-ouvertes.fr/hal-03320343 ; COnférence en Recherche d'Informations et Applications - CORIA 2021, French Information Retrieval Conference, Apr 2021, Grenoble (virtuel), France (2021)
15. A Multilingual Dataset for Named Entity Recognition, Entity Linking and Stance Detection in Historical Newspapers
In: SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval ; https://hal.archives-ouvertes.fr/hal-03418387 ; SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul 2021, Virtual Event, Canada. pp.2328-2334, ⟨10.1145/3404835.3463255⟩ (2021)
Abstract:
Named entity processing over historical texts is increasingly used because of the massive volume of documents and archives stored in digital libraries. However, because annotated historical resources are limited, information extraction performance lags behind that on contemporary texts. In this paper, we introduce the NewsEye resource, a multilingual dataset for named entity recognition and linking, enriched with stances towards named entities. The dataset comprises diachronic historical newspaper material published between 1850 and 1950 in French, German, Finnish, and Swedish. Such a historical resource is essential for developing and evaluating named entity processing systems. It also allows the performance of existing approaches on historical documents to be improved, which enables adequate and efficient semantic indexing of historical documents in digital cultural heritage collections.
Keyword:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-DL]Computer Science [cs]/Digital Libraries [cs.DL]; [INFO.INFO-HC]Computer Science [cs]/Human-Computer Interaction [cs.HC]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; datasets; diachronic historical newspapers; entity linking; multilingual; named entity recognition; stance detection
URL: https://hal.archives-ouvertes.fr/hal-03418387 https://hal.archives-ouvertes.fr/hal-03418387/file/SIGIR2021_NER-resources.pdf https://doi.org/10.1145/3404835.3463255 https://hal.archives-ouvertes.fr/hal-03418387/document
16. MELHISSA: a multilingual entity linking architecture for historical press articles ...
17. Étude comparative de méthodes de classification multilingue appliquées à l'épidémiologie ...
19. MELHISSA: a multilingual entity linking architecture for historical press articles ...
20. Multilingual Dataset for Named Entity Recognition, Entity Linking and Stance Detection in Historical Newspapers ...