5 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-02892154 ; Language Resources and Evaluation Conference, ELDA/ELRA, May 2020, Marseille, France ; https://lrec2020.lrec-conf.org/en/ (2020)
BASE
10 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe ...
11 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe ...
12 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe ...
16 |
Curation Technologies for a Cultural Heritage Archive: "Project Tongilbu" ...
17 |
Curation Technologies for a Cultural Heritage Archive: "Project Tongilbu" ...
Abstract:
We are developing a platform for generic curation technologies, using various NLP procedures, that is specifically targeted at, but not limited to, document collections that are too large for humans to read through manually. The aim is to provide prototypical NLP tools such as NER, entity linking, clustering and summarization in order to support rapid exploration of a data set. In this particular submission, the data set in question is the result of "Project Tongilbu", a report funded by the Korean Ministry of Re-unification on the unification of East and West Germany in the 1990s. The majority of the content in this data set is in German, with small parts in Korean. Because the collection is a set of PDF files, we first apply OCR to extract machine-readable text. Focusing on German, we then apply an NER model trained on Wikipedia data, retrieve URIs of recognized entities in the GND (Gemeinsame Normdatei, a German database of entities with additional information), perform temporal analysis and ...
Keyword:
Korean; NLP
URL: https://zenodo.org/record/3404255 https://dx.doi.org/10.5281/zenodo.3404255