Home Catalogue search

eng

Refine your search:

Search in the Catalogues and Directories






	Sort by
Simple Search

Page: 1 2 3

Hits 1 – 20 of 42

1	EMBEDDIA tools output example corpus of Estonian, Croatian and Latvian news articles 1.0
	Freienthal, Linda; Pelicon, Andraž; Martinc, Matej. - : Ekspress Meedia Group, 2022. : Styria Media Group, 2022
	BASE
	Show details

2	Out of Thin Air: Is Zero-Shot Cross-Lingual Keyword Detection Better Than Unsupervised? ...
	Koloski, Boshko; Pollak, Senja; Škrlj, Blaž. - : arXiv, 2022
	BASE
	Show details

3	Word-embedding based bilingual terminology alignment ...
	Repar, Andraž; Martinc, Matej; Ulčar, Matej. - : Zenodo, 2021
	BASE
	Show details

4	Word-embedding based bilingual terminology alignment ...
	Repar, Andraž; Martinc, Matej; Ulčar, Matej. - : Zenodo, 2021
	BASE
	Show details

5	Ekspress news article archive (in Estonian and Russian) 1.0
	Purver, Matthew; Pollak, Senja; Freienthal, Linda. - : Ekspress Meedia Group, 2021
	BASE
	Show details

6	Latvian user comment dataset 1.0
	Shekhar, Ravi; Purver, Matthew; Pollak, Senja. - : Ekspress Meedia Group, 2021
	BASE
	Show details

7	Ekspress user comment dataset 1.0
	Shekhar, Ravi; Pollak, Senja; Pelicon, Andraž. - : Ekspress Meedia Group, 2021
	BASE
	Show details

8	24sata news comment dataset 1.0
	Shekhar, Ravi; Pranjic, Marko; Pollak, Senja. - : Styria Media Group, 2021
	BASE
	Show details

9	Keyword extraction datasets for Croatian, Estonian, Latvian and Russian 1.0
	Koloski, Boshko; Pollak, Senja; Škrlj, Blaž. - : Ekspress Meedia Group, 2021. : Styria Media Group, 2021
	BASE
	Show details

10	24sata news article archive 1.0
	Purver, Matthew; Shekhar, Ravi; Pranjić, Marko. - : Styria Media Group, 2021
	BASE
	Show details

11	Latvian Delfi article archive (in Latvian and Russian) 1.0
	Pollak, Senja; Purver, Matthew; Shekhar, Ravi. - : Ekspress Meedia Group, 2021
	BASE
	Show details

12	List of single-word male and female occupations in Slovenian
	Supej, Anka; Ulčar, Matej; Robnik-Šikonja, Marko. - : Jožef Stefan Institute, 2021. : Faculty of Computer and Information Science, University of Ljubljana, 2021
	BASE
	Show details

13	SimLex-999 Slovenian translation SimLex-999-sl 1.0
	Pollak, Senja; Vulić, Ivan; Pelicon, Andraž. - : University of Ljubljana, 2021
	BASE
	Show details

14	Slav-NER: the 3rd Cross-lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic languages ...
	Piskorski, Jakub; Babych, Bogdan; Kancheva, Zara. - : Zenodo, 2021
	BASE
	Show details

15	Slav-NER: the 3rd Cross-lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic languages ...
	Piskorski, Jakub; Babych, Bogdan; Kancheva, Zara. - : Zenodo, 2021
	BASE
	Show details

16	Evaluation of contextual embeddings on less-resourced languages ...
	Ulčar, Matej; Žagar, Aleš; Armendariz, Carlos S.. - : arXiv, 2021
	BASE
	Show details

17	Simple Discovery of COVID IS WAR Metaphors Using Word Embeddings ...
	Brglez, Mojca; Pollak, Senja; Vintar, Špela. - : Zenodo, 2021
	BASE
	Show details

18	Simple Discovery of COVID IS WAR Metaphors Using Word Embeddings ...
	Brglez, Mojca; Pollak, Senja; Vintar, Špela. - : Zenodo, 2021
	BASE
	Show details

19	Investigating cross-lingual training for offensive language detection
	Pelicon, Andraž; Shekhar, Ravi; Škrlj, Blaž; Purver, Matthew; Pollak, Senja
	In: PeerJ Comput Sci (2021)
	Abstract: Platforms that feature user-generated content (social media, online forums, newspaper comment sections etc.) have to detect and filter offensive speech within large, fast-changing datasets. While many automatic methods have been proposed and achieve good accuracies, most of these focus on the English language, and are hard to apply directly to languages in which few labeled datasets exist. Recent work has therefore investigated the use of cross-lingual transfer learning to solve this problem, training a model in a well-resourced language and transferring to a less-resourced target language; but performance has so far been significantly less impressive. In this paper, we investigate the reasons for this performance drop, via a systematic comparison of pre-trained models and intermediate training regimes on five different languages. We show that using a better pre-trained language model results in a large gain in overall performance and in zero-shot transfer, and that intermediate training on other languages is effective when little target-language data is available. We then use multiple analyses of classifier confidence and language model vocabulary to shed light on exactly where these gains come from and gain insight into the sources of the most typical mistakes.
	Keyword: Computational Linguistics
	URL: https://doi.org/10.7717/peerj-cs.559 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8237322/
	BASE
	Hide details

20	Temporal Integration of Text Transcripts and Acoustic Features for Alzheimer's Diagnosis Based on Spontaneous Speech
	Martinc, Matej; Haider, Fasih; Pollak, Senja...
	In: Front Aging Neurosci (2021)
	BASE
	Show details

Page: 1 2 3

© 2013 - 2024 Lin|gu|is|tik | Imprint | Privacy Policy | Datenschutzeinstellungen ändern