Search in the Catalogues and Directories

Hits 1 – 19 of 19

1
Croatian Twitter training corpus ReLDI-NormTagNER-hr 2.1
Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk. - : Jožef Stefan Institute, 2019
BASE
2
CMC training corpus Janes-Tag 2.1
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2019
BASE
3
Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.1
Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk. - : Jožef Stefan Institute, 2019
BASE
4
CMC training corpus Janes-Tag 2.0
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2017
BASE
5
Croatian Twitter training corpus ReLDI-NormTag-hr 1.1
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
6
Serbian Twitter training corpus ReLDI-NormTag-sr 1.0
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
7
Croatian Twitter training corpus ReLDI-NormTag-hr 1.0
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
8
Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.0
Ljubešić, Nikola; Erjavec, Tomaž; Miličević, Maja. - : Jožef Stefan Institute, 2017
BASE
9
Wikipedia talk corpus Janes-Wiki 1.0
Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
10
Serbian Twitter training corpus ReLDI-NormTag-sr 1.1
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
11
News comment corpus Janes-News 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
12
Croatian Twitter training corpus ReLDI-NormTagNER-hr 2.0
Ljubešić, Nikola; Erjavec, Tomaž; Miličević, Maja. - : Jožef Stefan Institute, 2017
BASE
13
Blog post and comment corpus Janes-Blog 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
Abstract: Janes-Blog is an annotated corpus of Slovene blogs from websites rtvslo.si and publishwall.si from the period 2006-10 to 2016-01. The corpus is structured into individual texts containing the post of the blog and comments on the post, together with their metadata. The texts in the corpus are tokenised, sentence segmented, word normalised, morphosyntactically tagged, lemmatised and annotated with named entities. Due to protection of privacy, usernames are not included in the metadata and 'person' as well as 'person derivative' named entities have been removed from the texts.
Keyword: blogs; computer-mediated communication; named entities; TEI; word normalisation
URL: http://hdl.handle.net/11356/1138
BASE
14
cSMTiser: word standardisation
Ljubešić, Nikola; Perovšek, Matic; Erjavec, Tomaž. - : Jožef Stefan Institute, 2017
BASE
15
Forum corpus Janes-Forum 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
16
Twitter corpus Janes-Tweet 1.0
Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
17
Dataset of normalised Slovene text KonvNormSl 1.0
Ljubešić, Nikola; Zupan, Katja; Fišer, Darja. - : Jožef Stefan Institute, 2016
BASE
18
CMC training corpus Janes-Tag 1.2
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2016
BASE
19
CMC training corpus Janes-Norm 1.2
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2016
BASE

© 2013 - 2024 Lin|gu|is|tik