DE eng

Search in the Catalogues and Directories

Page: 1 2
Hits 1 – 20 of 34

1
Corpus of term-annotated texts RSDO5 1.1
Abstract: The RSDO5 corpus was compiled in order to serve as a training set for automatic term identification. It consists of 12 texts with 250,000 words and almost 38,000 manually annotated terms, each marked to be either in- or out-domain. The corpus texts were published between 2000 and 2019, are either PhD theses (3), a scientific book based on a PhD thesis (1), graduate level text books (4), or journal articles (4) and belong to the fields of biomechanics (3), linguistics (3), chemistry (3), or veterinary science (3). Apart from the manually annotated terms, the corpus was automatically annotated with Universal Dependencies annotations, i.e. tokenisation, sentence segmentation, lemmatisation, morpological features and dependency syntax. As opposed to the previous version, this one adds in- and out-domain marking on terms in the TEI and vertical files.
Keyword: manual annotation; TEI; terminology
URL: http://hdl.handle.net/11356/1470
BASE
Hide details
2
Training corpus ssj500k 2.3
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2021
BASE
Show details
3
Corpus of term-annotated texts RSDO5 1.0
BASE
Show details
4
Training corpus ssj500k 2.2
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2019
BASE
Show details
5
Croatian Twitter training corpus ReLDI-NormTagNER-hr 2.1
Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk. - : Jožef Stefan Institute, 2019
BASE
Show details
6
CMC training corpus Janes-Tag 2.1
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2019
BASE
Show details
7
Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.1
Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk. - : Jožef Stefan Institute, 2019
BASE
Show details
8
Training corpus jos1M 1.2
Erjavec, Tomaž; Krek, Simon; Dobrovoljc, Kaja. - : Jožef Stefan Institute, 2019
BASE
Show details
9
Training corpus SETimes.SR 1.0
Batanović, Vuk; Ljubešić, Nikola; Samardžić, Tanja. - : Regional Linguistic Data Initiative Centre ReLDI, 2018
BASE
Show details
10
Training corpus ssj500k 2.1
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2018
BASE
Show details
11
Bilingual terminology extraction dataset KAS-biterm 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2018
BASE
Show details
12
Terminology identification dataset KAS-term 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2018
BASE
Show details
13
Training corpus hr500k 1.0
Ljubešić, Nikola; Agić, Željko; Klubička, Filip. - : Jožef Stefan Institute, 2018
BASE
Show details
14
Tweet code-switching corpus Janes-Preklop 1.0
Reher, Špela; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
Show details
15
CMC training corpus Janes-Tag 2.0
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2017
BASE
Show details
16
Croatian Twitter training corpus ReLDI-NormTag-hr 1.1
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
Show details
17
Serbian Twitter training corpus ReLDI-NormTag-sr 1.0
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
Show details
18
Croatian Twitter training corpus ReLDI-NormTag-hr 1.0
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
Show details
19
Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.0
Ljubešić, Nikola; Erjavec, Tomaž; Miličević, Maja. - : Jožef Stefan Institute, 2017
BASE
Show details
20
Training corpus ssj500k 2.0
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2017
BASE
Show details

Page: 1 2

Catalogues
0
0
0
0
0
0
0
Bibliographies
0
0
0
0
0
0
0
0
0
Linked Open Data catalogues
0
Online resources
0
0
0
0
Open access documents
34
0
0
0
0
© 2013 - 2024 Lin|gu|is|tik | Imprint | Privacy Policy | Datenschutzeinstellungen ändern