
Search in the Catalogues and Directories

Hits 1 – 19 of 19

1
Croatian Twitter training corpus ReLDI-NormTagNER-hr 2.1
Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk. - : Jožef Stefan Institute, 2019
BASE
2
CMC training corpus Janes-Tag 2.1
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2019
BASE
3
Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.1
Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk. - : Jožef Stefan Institute, 2019
BASE
4
CMC training corpus Janes-Tag 2.0
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2017
BASE
5
Croatian Twitter training corpus ReLDI-NormTag-hr 1.1
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
6
Serbian Twitter training corpus ReLDI-NormTag-sr 1.0
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
7
Croatian Twitter training corpus ReLDI-NormTag-hr 1.0
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
8
Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.0
Ljubešić, Nikola; Erjavec, Tomaž; Miličević, Maja. - : Jožef Stefan Institute, 2017
BASE
9
Wikipedia talk corpus Janes-Wiki 1.0
Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
10
Serbian Twitter training corpus ReLDI-NormTag-sr 1.1
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
11
News comment corpus Janes-News 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
12
Croatian Twitter training corpus ReLDI-NormTagNER-hr 2.0
Ljubešić, Nikola; Erjavec, Tomaž; Miličević, Maja. - : Jožef Stefan Institute, 2017
BASE
13
Blog post and comment corpus Janes-Blog 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
14
cSMTiser: word standardisation
Ljubešić, Nikola; Perovšek, Matic; Erjavec, Tomaž. - : Jožef Stefan Institute, 2017
BASE
15
Forum corpus Janes-Forum 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
16
Twitter corpus Janes-Tweet 1.0
Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
17
Dataset of normalised Slovene text KonvNormSl 1.0
Ljubešić, Nikola; Zupan, Katja; Fišer, Darja; Erjavec, Tomaž. - : Jožef Stefan Institute, 2016
Abstract: Data used in the experiments described in: Nikola Ljubešić, Katja Zupan, Darja Fišer and Tomaž Erjavec: Normalising Slovene data: historical texts vs. user-generated content. Proceedings of KONVENS 2016, September 19–21, 2016, Bochum, Germany. https://www.linguistics.rub.de/konvens16/pub/19_konvensproc.pdf (https://www.linguistics.rub.de/konvens16/) Data are split into the "token" folder (experiment on normalising individual tokens) and "segment" folder (experiment on normalising whole segments of text, i.e. sentences or tweets). Each experiment folder contains the "train", "dev" and "test" subfolders. Each subfolder contains two files for each sample, the original data (*.orig.txt) and the data with hand-normalised words (*.norm.txt). The files are aligned by lines. There are four datasets: - goo300k-bohoric: historical Slovene, hard case (
Keyword: computer-mediated communication; experimental data; historical language; manual annotation; word normalisation
URL: http://hdl.handle.net/11356/1068
BASE
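The folder layout described in the abstract above (experiment folder, then "train", "dev" or "test", then line-aligned *.orig.txt and *.norm.txt files) could be read along the following lines. This is a minimal sketch, not code shipped with the dataset: the function names `iter_sample_files` and `read_aligned_pairs` are hypothetical, UTF-8 encoding is an assumption, and the sample names inside each subfolder are discovered by globbing rather than hard-coded.

```python
from pathlib import Path
from typing import Iterator, List, Tuple

ORIG_SUFFIX = ".orig.txt"  # original data, per the abstract
NORM_SUFFIX = ".norm.txt"  # hand-normalised data, per the abstract


def iter_sample_files(split_dir: Path) -> Iterator[Tuple[Path, Path]]:
    """Yield (orig, norm) file pairs found in one split folder (hypothetical helper)."""
    for orig_path in sorted(split_dir.glob("*" + ORIG_SUFFIX)):
        sample_name = orig_path.name[: -len(ORIG_SUFFIX)]
        norm_path = orig_path.with_name(sample_name + NORM_SUFFIX)
        if norm_path.exists():
            yield orig_path, norm_path


def read_aligned_pairs(orig_path: Path, norm_path: Path) -> List[Tuple[str, str]]:
    """Return (original, normalised) pairs, relying on line-by-line alignment."""
    # Assumption: the files are UTF-8 encoded.
    with open(orig_path, encoding="utf-8") as f_orig:
        orig_lines = [line.rstrip("\n") for line in f_orig]
    with open(norm_path, encoding="utf-8") as f_norm:
        norm_lines = [line.rstrip("\n") for line in f_norm]
    # Files that are aligned by lines must have the same number of lines.
    if len(orig_lines) != len(norm_lines):
        raise ValueError(
            f"{orig_path.name} and {norm_path.name} differ in line count"
        )
    return list(zip(orig_lines, norm_lines))
```

A call such as `iter_sample_files(Path("token") / "train")` would then list the file pairs for the token-level training split, assuming the archive has been unpacked into the current directory.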
18
CMC training corpus Janes-Tag 1.2
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2016
BASE
19
CMC training corpus Janes-Norm 1.2
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2016
BASE
