1 |
Assessing the impact of OCR noise on multilingual event detection over digitised documents
|
|
|
|
In: ISSN: 1432-5012 ; EISSN: 1432-1300 ; International Journal on Digital Libraries ; https://hal.archives-ouvertes.fr/hal-03635985 ; International Journal on Digital Libraries, Springer Verlag, 2022, ⟨10.1007/s00799-022-00325-2⟩ (2022)
|
|
BASE
|
|
|
|
2 |
Assessing the Impact of OCR Noise on Multilingual Event Detection over Digitised Documents ...
|
|
|
|
BASE
|
|
|
|
3 |
Assessing the Impact of OCR Noise on Multilingual Event Detection over Digitised Documents ...
|
|
|
|
Abstract:
Event detection (ED) is a crucial task for natural language processing (NLP) and it involves the identification of instances of specified types of events in text and their classification into event types. The detection of events from digitised documents could enable historians to gather and combine a large amount of information into an integrated whole, a panoramic interpretation of the past. However, the level of degradation of digitised documents and the quality of the optical character recognition (OCR) tools might hinder the performance of an event detection system. While several studies have been performed in detecting events from historical documents, the transcribed documents needed to be hand-validated which implied a great effort of human expertise and manual labor-intensive work. Thus, in this study, we explore the robustness of two different event detection language-independent models to OCR noise, over two datasets that cover different event types and multiple languages. We aim at analysing their ...
|
|
URL: https://zenodo.org/record/6369941 https://dx.doi.org/10.5281/zenodo.6369941
|
|
BASE
|
|
|
|
4 |
L3i_LBPAM at the FinSim-2 task: Learning Financial Semantic Similarities with Siamese Transformers
|
|
|
|
In: WWW '21: Companion Proceedings of the Web Conference 2021 ; WWW '21: The Web Conference 2021 ; https://hal.sorbonne-universite.fr/hal-03256324 ; WWW '21: The Web Conference 2021, Apr 2021, Ljubljana (virtual), Slovenia. pp.302-306, ⟨10.1145/3442442.3451384⟩ (2021)
|
|
BASE
|
|
|
|
5 |
Impact Analysis of Document Digitization on Event Extraction ...
|
|
|
|
BASE
|
|
|
|
6 |
Impact Analysis of Document Digitization on Event Extraction ...
|
|
|
|
BASE
|
|
|
|
7 |
Impact Analysis of Document Digitization on Event Extraction
|
|
|
|
In: CEUR Workshop Proceedings ; 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020) co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2020) ; https://hal.archives-ouvertes.fr/hal-03026148 ; 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020) co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2020), Nov 2020, Virtual, Italy. pp.17-28 ; http://sag.art.uniroma2.it/NL4AI/ (2020)
|
|
BASE
|
|
|
|
8 |
Impact Analysis of Document Digitization on Event Extraction ...
|
|
|
|
BASE
|
|
|
|
9 |
Impact Analysis of Document Digitization on Event Extraction ...
|
|
|
|
BASE
|
|
|
|