
Search in the Catalogues and Directories

Hits 1 – 20 of 36

1
Abstracts from the KAS corpus KAS-Abs 2.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
2
Corpus of academic Slovene KAS 2.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
3
Summarization datasets from the KAS corpus KAS-Sum 1.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
4
Machine Translation datasets from the KAS corpus KAS-MT 1.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
5
The ParlaMint corpora of parliamentary proceedings
BASE
6
The ParlaMint corpora of parliamentary proceedings
In: Lang Resour Eval (2022)
BASE
7
Offensive language dataset of Croatian, English and Slovenian comments FRENK 1.0
Ljubešić, Nikola; Fišer, Darja; Erjavec, Tomaž. - : Jožef Stefan Institute, 2021
BASE
8
Offensive language dataset of Croatian, English and Slovenian comments FRENK 1.1
Ljubešić, Nikola; Fišer, Darja; Erjavec, Tomaž. - : Jožef Stefan Institute, 2021
BASE
9
Abstracts from the KAS corpus KAS-Abs 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola; Ferme, Marko; Borovič, Mladen; Boškovič, Borko; Ojsteršek, Milan; Hrovat, Goran. - : Jožef Stefan Institute, 2021. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2021
Abstract: The KAS-abs corpus contains 108,254 automatically identified Slovenian and/or English abstracts (30 million words) from 62,000 BSc/BA, MSc/MA, and PhD theses included in the KAS Corpus of Academic Slovene. This corpus is made available because the public version of KAS (http://hdl.handle.net/11356/1244) does not contain the front matter, and hence lacks the abstracts. The abstracts were identified on a per-page basis, and are either in Slovenian (*-abs-sl.txt, 47,273 files), in English (*-abs-en.txt, 49,261 files) or, where the abstracts in both languages were on the same page, in both languages (*-abs-slen.txt, 11,720 files). The files contain the plain text of the abstracts, one paragraph per line. Note that because the cleaning of the source PDF files and the identification of the abstracts were done automatically, this corpus contains various types of errors. The files are stored in the same manner as in the complete KAS corpus, i.e. in 1,000 directories with the same filename prefix as in KAS. The file with the metadata for the corpus texts is also included. The abstracts can be useful for research in, e.g., machine translation and terminology extraction and, together with the full texts from the KAS corpus, for studies in automatic summarisation.
Keyword: abstracts; academic writing; BSc/BA theses; MSc/MA theses; PhD theses
URL: http://hdl.handle.net/11356/1420
BASE
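The file layout described in the abstract above can be illustrated with a minimal Python sketch. It assumes a local copy of the corpus, UTF-8 encoded files, and filenames that end in exactly the three suffixes quoted in the abstract; the function names and the example filename are hypothetical and not part of the corpus distribution.

```python
from collections import Counter
from pathlib import Path

# Filename suffixes and file counts as reported in the KAS-Abs 1.0 abstract.
SUFFIX_TO_LANG = {
    "-abs-sl.txt": "sl",       # Slovenian abstract only
    "-abs-en.txt": "en",       # English abstract only
    "-abs-slen.txt": "sl+en",  # both languages on the same page
}
REPORTED_COUNTS = {"sl": 47273, "en": 49261, "sl+en": 11720}


def abstract_language(filename):
    """Return the language label for an abstract file, or None if unmatched."""
    for suffix, lang in SUFFIX_TO_LANG.items():
        if filename.endswith(suffix):
            return lang
    return None


def read_paragraphs(path):
    """Read one abstract file: one paragraph per line (UTF-8 is assumed)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def count_by_language(root):
    """Walk the corpus directories under root and count files per language."""
    counts = Counter()
    for path in Path(root).rglob("*.txt"):
        lang = abstract_language(path.name)
        if lang is not None:
            counts[lang] += 1
    return counts


# The three per-language counts add up to the reported total of 108,254.
assert sum(REPORTED_COUNTS.values()) == 108254
```

Calling `count_by_language` on the unpacked corpus directory and comparing the result with `REPORTED_COUNTS` would be one way to check that a download is complete.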
10
English-Slovene term candidates KAS-biterm 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2020
BASE
11
Corpus of academic Slovene KAS 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2019. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2019
BASE
12
CMC training corpus Janes-Tag 2.1
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2019
BASE
13
Corpus of Academic Slovene (PhD theses) KAS-dr 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2019. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2019
BASE
14
Corpus of Academic Slovene (MSc/MA theses) KAS-mag 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2019. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2019
BASE
15
Corpus of Academic Slovene (BSc/BA theses) KAS-dipl 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2019. : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2019
BASE
16
Dictionary of Twitterese Janes-Dict 1.0
Gantar, Polona; Škrjanec, Iza; Fišer, Darja. - : Faculty of Arts, University of Ljubljana, 2018
BASE
17
Dataset and baseline model of moderated content FRENK-MMC-RTV 1.0
Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2018
BASE
18
Bilingual terminology extraction dataset KAS-biterm 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2018
BASE
19
Terminology identification dataset KAS-term 1.0
Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2018
BASE
20
Dataset and baseline model of moderated content FRENK-STYRIA-24sata 1.0
Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2018
BASE


© 2013 - 2024 Lin|gu|is|tik