
Search in the Catalogues and Directories

Hits 1 – 20 of 52

1
Abstracts from the KAS corpus KAS-Abs 2.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
2
Corpus of academic Slovene KAS 2.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
3
Summarization datasets from the KAS corpus KAS-Sum 1.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
4
Machine Translation datasets from the KAS corpus KAS-MT 1.0
Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2022. : Faculty of Computer and Information Science, University of Ljubljana, 2022
BASE
5
Bayesian BERT for Trustful Hate Speech Detection ...
BASE
6
Bayesian BERT for Trustful Hate Speech Detection ...
BASE
7
Bayesian BERT for Trustful Hate Speech Detection ...
BASE
8
Slovene SuperGLUE Benchmark: Translation and Evaluation ...
BASE
9
Valency lexicon extracted from the Gigafida 2.1 corpus
Krek, Simon; Gantar, Polona; Krsnik, Luka. - : Centre for Language Resources and Technologies, University of Ljubljana, 2021
BASE
10
Multiword Expressions lexicon extracted from the Gigafida 2.1 corpus
Krek, Simon; Gantar, Apolonija; Laskowski, Cyprian. - : Centre for Language Resources and Technologies, University of Ljubljana, 2021
BASE
11
Corpus of Written Standard Slovene Gigafida 2.0
Krek, Simon; Erjavec, Tomaž; Repar, Andraž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2021
BASE
12
List of single-word male and female occupations in Slovenian
Supej, Anka; Ulčar, Matej; Robnik-Šikonja, Marko; Pollak, Senja. - : Jožef Stefan Institute, 2021. : Faculty of Computer and Information Science, University of Ljubljana, 2021
Abstract: The list of single-word occupations in Slovene is based on the Slovene Standard Classification of Occupations (https://www.uradni-list.si/glasilo-uradni-list-rs/vsebina?urlid=199728&stevilka=1641). The list includes 234 occupation pairs. For each occupation, it contains the masculine word form (e.g. fotograf), its possible synonym, the feminine equivalent (e.g. fotografka) and the corresponding synonym of the feminine form (e.g. fotografinja). Cases where no synonym was added for a specific occupation are marked with the label 0 (note that only synonyms with the same root are considered).
Several conditions for including an occupation in the list or excluding it were applied:
- The list contains only single-word occupation pairs, although the majority of the occupations in the aforementioned classification are multi-word expressions.
- An occupation has to exist in both feminine and masculine grammatical gender (gender-neutral words such as pismonoša [en. postman] are not included in the list).
- At least one of the variants of an occupation (masculine or feminine) occurs at least 500 times in the Corpus of Written Standard Slovene Gigafida 2.0.
- Occupations that are also proper names in Slovene, e.g. kovač [en. blacksmith], were filtered out if the proper name form exists in the Slovene Morphological Lexicon Sloleks 2.0 (Dobrovoljc et al., 2019).
- Occupations that could easily be associated with a context unrelated to occupations (e.g. čarovnik/čarovnica [en. wizard/witch]), or where the male or female variant is a homograph of a common noun (e.g. detektivka [en. detective] also denotes a detective novel), were excluded from the final set of occupations.
When a more established version of an occupation exists, a synonym with the same root was added manually (e.g. in the case of fotografka, the arguably more established fotografinja was added [en. photographer]). If the standard classification does not include the female (e.g. dramatik [en. playwright]) or the male version (e.g. prostitutka [en. prostitute]) of an occupation, the missing version was added manually if it exists and appears in the Gigafida corpus (e.g. there are no established words for the female and male versions of postrešček [en. porter] and hostesa [en. hostess]).
The list of occupations can be used for different natural language processing tasks, including the evaluation of word embedding models through analogies, which can point to bias in language use.
If you use the dataset, please cite the following paper: SUPEJ, Anka, ULČAR, Matej, ROBNIK ŠIKONJA, Marko, POLLAK, Senja (2020). Primerjava slovenskih besednih vektorskih vložitev z vidika spola na analogijah poklicev. Zbornik konference Jezikovne tehnologije in digitalna humanistika / Proc. of the Conference on Language Technologies and Digital Humanities, p. 93-100.
Keyword: gender; occupations; Slovenian language; word analogies
URL: http://hdl.handle.net/11356/1347
BASE
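The record structure and the frequency criterion described in the abstract of entry 12 can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not specify the file format, so the tab-separated four-column layout, the names `OccupationPair`, `parse_row`, `meets_frequency_criterion` and `analogy_queries`, and the frequency counts in the usage example are all hypothetical. The sketch also assumes that the 500-occurrence threshold is checked against the main masculine and feminine forms only, not their synonyms.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Dict, List, Optional, Tuple

MIN_FREQ = 500  # threshold stated in the abstract (occurrences in Gigafida 2.0)


@dataclass(frozen=True)
class OccupationPair:
    masculine: str
    masculine_synonym: Optional[str]  # None when the list has the label 0
    feminine: str
    feminine_synonym: Optional[str]   # None when the list has the label 0


def parse_row(line: str, sep: str = "\t") -> OccupationPair:
    """Parse one row; the four-column, tab-separated layout is an assumption."""
    fields = [field.strip() for field in line.split(sep)]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    masc, masc_syn, fem, fem_syn = fields
    return OccupationPair(
        masculine=masc,
        masculine_synonym=None if masc_syn == "0" else masc_syn,
        feminine=fem,
        feminine_synonym=None if fem_syn == "0" else fem_syn,
    )


def meets_frequency_criterion(pair: OccupationPair, freq: Dict[str, int]) -> bool:
    """True if at least one variant occurs at least MIN_FREQ times."""
    best = max(freq.get(pair.masculine, 0), freq.get(pair.feminine, 0))
    return best >= MIN_FREQ


def analogy_queries(pairs: List[OccupationPair]) -> List[Tuple[str, str, str, str]]:
    """Build (a_masc, a_fem, b_masc, expected_b_fem) tuples for analogy tests."""
    return [
        (a.masculine, a.feminine, b.masculine, b.feminine)
        for a, b in permutations(pairs, 2)
    ]


if __name__ == "__main__":
    # Row built from the forms quoted in the abstract; the label 0 for the
    # masculine synonym and the corpus counts below are made up for illustration.
    pair = parse_row("fotograf\t0\tfotografka\tfotografinja")
    hypothetical_counts = {"fotograf": 1200, "fotografka": 30}
    print(pair)
    print(meets_frequency_criterion(pair, hypothetical_counts))
```

Each tuple returned by `analogy_queries` can be fed to a word embedding model as "a_masc is to a_fem as b_masc is to ?", with the last element as the expected answer. How the answer is scored is left open here, since the abstract does not describe the evaluation procedure.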
13
SloBERTa: Slovene monolingual large pretrained masked language model ...
BASE
14
Slav-NER: the 3rd Cross-lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic languages ...
BASE
15
SloBERTa: Slovene monolingual large pretrained masked language model ...
BASE
16
Unsupervised Approach to Cross-Lingual User Comments Summarization ...
BASE
17
Unsupervised Approach to Cross-Lingual User Comments Summarization ...
BASE
18
Slav-NER: the 3rd Cross-lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic languages ...
BASE
19
Evaluation of contextual embeddings on less-resourced languages ...
BASE
20
Training dataset and dictionary sizes matter in BERT models: the case of Baltic languages ...
BASE
