
Search in the Catalogues and Directories

Hits 1 – 20 of 42

1
EMBEDDIA tools output example corpus of Estonian, Croatian and Latvian news articles 1.0
Freienthal, Linda; Pelicon, Andraž; Martinc, Matej. - : Ekspress Meedia Group, 2022. : Styria Media Group, 2022
BASE
2
Out of Thin Air: Is Zero-Shot Cross-Lingual Keyword Detection Better Than Unsupervised? ...
BASE
3
Word-embedding based bilingual terminology alignment ...
BASE
4
Word-embedding based bilingual terminology alignment ...
BASE
5
Ekspress news article archive (in Estonian and Russian) 1.0
Purver, Matthew; Pollak, Senja; Freienthal, Linda. - : Ekspress Meedia Group, 2021
BASE
6
Latvian user comment dataset 1.0
Shekhar, Ravi; Purver, Matthew; Pollak, Senja. - : Ekspress Meedia Group, 2021
BASE
7
Ekspress user comment dataset 1.0
Shekhar, Ravi; Pollak, Senja; Pelicon, Andraž. - : Ekspress Meedia Group, 2021
BASE
8
24sata news comment dataset 1.0
Shekhar, Ravi; Pranjić, Marko; Pollak, Senja. - : Styria Media Group, 2021
BASE
9
Keyword extraction datasets for Croatian, Estonian, Latvian and Russian 1.0
Koloski, Boshko; Pollak, Senja; Škrlj, Blaž. - : Ekspress Meedia Group, 2021. : Styria Media Group, 2021
BASE
10
24sata news article archive 1.0
Purver, Matthew; Shekhar, Ravi; Pranjić, Marko. - : Styria Media Group, 2021
BASE
11
Latvian Delfi article archive (in Latvian and Russian) 1.0
Abstract: This dataset is an archive of articles from the Delfi news site from 2015-2019, containing over 180,000 articles (c. 50% in Latvian and 50% in Russian). Keywords for articles are included.
There are 5 JSON files:
  lv_2015.json contains 42 001 articles from the year 2015
  lv_2016_.json contains 40 342 articles from the year 2016
  lv_2017_.json contains 37 256 articles from the year 2017
  lv_2018_.json contains 31 732 articles from the year 2018
  lv_2019_.json contains 29 070 articles from the year 2019
In sum: 180 401 articles
Description of the dataset
Each JSON file is a list of dictionaries, i.e. each article is represented as a dictionary. Each dictionary contains the following:
  id (integer) - the ID of the article
  title (string) - the title of the article
  lead (string) - the lead of the article
  tags [1] (list of dictionaries or None) - each dictionary represents one tag. The tag dictionary contains the following:
    domain_id (string) - the ID of the domain
    id (string) - the ID of the tag
    lang (string) - the language of the tag
    tag (string) - the tag itself, e.g. Šokolāde
    translitted_name (string) - a modified version of the tag, e.g. sokolade
  rawBody (string) - the raw text of the article (contains HTML)
  bodyText (string) - clean article text (stripped of HTML)
  publishDate (string) - publication date and time of the article
  categoryPrimary (dictionary or empty list) - the dictionary contains the following information:
    categoryId (integer) - the ID of the category
    categoryName (string) - the name of the category (e.g. Futbols)
    channelId (integer) - the ID of the channel
    groupId - None
    channelLanguage (string) - the language of the channel (nat - Latvian, rus - Russian)
    categoryLanguage (integer) - ID of the channel language
  relatedArticles (list of integers or None) - a list of related articles' IDs
  relatedTags (string or None) - related tags, comma-separated
Keyword: latvian news article; news corpus
URL: http://hdl.handle.net/11356/1409
BASE
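The record layout described in the abstract above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset's documentation: the helper names (load_articles, tag_names, channel_language, summarize) and the two sample records are hypothetical and invented for illustration. Only the field names and the file name lv_2015.json are taken from the abstract. The sketch shows how to handle the two fields with mixed types: tags (list of dictionaries or None) and categoryPrimary (dictionary or empty list).

```python
import json
from collections import Counter


def load_articles(path):
    """Load one yearly file, e.g. 'lv_2015.json' (file name taken from the abstract)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)  # a list of article dictionaries


def tag_names(article):
    """Return the tag strings of an article; 'tags' may be a list of dicts or None."""
    return [t["tag"] for t in (article.get("tags") or [])]


def channel_language(article):
    """Return 'nat' (Latvian) or 'rus' (Russian), or None if unavailable.

    'categoryPrimary' is either a dictionary or an empty list.
    """
    category = article.get("categoryPrimary")
    if isinstance(category, dict):
        return category.get("channelLanguage")
    return None


def summarize(articles):
    """Count articles per channel language and occurrences per tag."""
    languages = Counter(channel_language(a) for a in articles)
    tags = Counter(name for a in articles for name in tag_names(a))
    return languages, tags


# Hypothetical sample records, invented for illustration only.
SAMPLE = [
    {
        "id": 1,
        "title": "Example A",
        "tags": [{"tag": "Šokolāde", "lang": "lv"}],
        "categoryPrimary": {"categoryName": "Futbols", "channelLanguage": "nat"},
    },
    {"id": 2, "title": "Example B", "tags": None, "categoryPrimary": []},
]

languages, tags = summarize(SAMPLE)
```

With a downloaded file, the same summary would be obtained via summarize(load_articles("lv_2015.json")), assuming the file sits in the working directory.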
12
List of single-word male and female occupations in Slovenian
Supej, Anka; Ulčar, Matej; Robnik-Šikonja, Marko. - : Jožef Stefan Institute, 2021. : Faculty of Computer and Information Science, University of Ljubljana, 2021
BASE
13
SimLex-999 Slovenian translation SimLex-999-sl 1.0
Pollak, Senja; Vulić, Ivan; Pelicon, Andraž. - : University of Ljubljana, 2021
BASE
14
Slav-NER: the 3rd Cross-lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic languages ...
BASE
15
Slav-NER: the 3rd Cross-lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic languages ...
BASE
16
Evaluation of contextual embeddings on less-resourced languages ...
BASE
17
Simple Discovery of COVID IS WAR Metaphors Using Word Embeddings ...
BASE
18
Simple Discovery of COVID IS WAR Metaphors Using Word Embeddings ...
BASE
19
Investigating cross-lingual training for offensive language detection
In: PeerJ Comput Sci (2021)
BASE
20
Temporal Integration of Text Transcripts and Acoustic Features for Alzheimer's Diagnosis Based on Spontaneous Speech
In: Front Aging Neurosci (2021)
BASE

