
Search in the Catalogues and Directories

Hits 1 – 20 of 31

1
What's New in EuReCo? Interoperability, Comparable Corpora, Licensing
Kupietz, Marc [Author]; Margaretha, Eliza [Author]; Diewald, Nils [Author]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2019
DNB Subject Category Language
2
The Vast and the Focused: On the need for domain-focused web corpora
Barbaresi, Adrien [Author]; Bański, Piotr [Editor]; Barbaresi, Adrien [Editor]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2019
DNB Subject Category Language
3
Asynchronous pipelines for processing huge corpora on medium to low resource infrastructures
Ortiz Suárez, Pedro Javier [Author]; Sagot, Benoît [Author]; Romary, Laurent [Author]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2019
DNB Subject Category Language
4
Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019. Cardiff, 22 July 2019
Bański, Piotr [Editor]; Barbaresi, Adrien [Editor]; Biber, Hanno [Editor]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2019
DNB Subject Category Language
5
Modelling large parallel corpora. The Zurich Parallel Corpus Collection
Graën, Johannes [Author]; Kew, Tannon [Author]; Shaitarova, Anastassia [Author]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2019
DNB Subject Category Language
6
Deduplication in large web corpora
Benko, Vladimír [Author]; Bański, Piotr [Editor]; Barbaresi, Adrien [Editor]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2019
DNB Subject Category Language
7
The best of both worlds: Multi-billion word “dynamic” corpora
Lüngen, Harald [Editor]; Breiteneder, Evelyn [Editor]; Barbaresi, Adrien [Editor]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2019
DNB Subject Category Language
8
Modelling Large Parallel Corpora: The Zurich Parallel Corpus Collection
Graën, Johannes; Kew, Tannon; Shaitarova, Anastassia; Volk, Martin (2019). Modelling Large Parallel Corpora: The Zurich Parallel Corpus Collection. In: Challenges in the Management of Large Corpora (CMLC-7), Cardiff, Wales, 22 July 2019.
BASE
9
Proceedings of the LREC 2018 Workshop “Challenges in the Management of Large Corpora (CMLC-6)” 07 May 2018 – Miyazaki, Japan
Bański, Piotr [Editor]; Kupietz, Marc [Editor]; Barbaresi, Adrien [Editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2018
DNB Subject Category Language
10
How to get the computation near the data: improving data accessibility to, and reusability of analysis functions in corpus query platforms
Kupietz, Marc [Author]; Diewald, Nils [Author]; Fankhauser, Peter [Author]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2018
DNB Subject Category Language
11
Increasing Interoperability for Embedding Corpus Annotation Pipelines in Wmatrix and other corpus retrieval tools
Abstract: Computational tools and methods employed in corpus linguistics are split into three main types: compilation, annotation and retrieval. These mirror and support the usual corpus linguistics methodology of corpus collection, manual and/or automatic tagging, followed by query and analysis. Typically, corpus software to support retrieval implements some or all of the five major methods in corpus linguistics only at the word level: frequency list, concordance, keyword, collocation and n-gram, and such software may or may not provide support for text which has already been tagged, for example at the part-of-speech (POS) level. Wmatrix is currently one of the few retrieval tools which have annotation tools built in. However, annotation in Wmatrix is currently limited to the UCREL English POS and semantic tagging pipeline. In this paper, we describe an approach to extend support for embedding other tagging pipelines and tools in Wmatrix via the use of APIs, and describe how such an approach is also applicable to other retrieval tools, potentially enabling support for tagged data.
URL: https://eprints.lancs.ac.uk/id/eprint/123923/1/wmatrix_interoperability_cmlc.pdf
https://eprints.lancs.ac.uk/id/eprint/123923/
BASE
12
Challenges in the Management of Large Corpora (CMLC-6)
In: Challenges in the Management of Large Corpora (CMLC-6). Edited by: Banski, Piotr; Kupietz, Marc; Barbaresi, Adrien; Biber, Hanno; Breiteneder, Evelyn; Clematide, Simon; Witt, Andreas (2018). Paris: European Language Resources Association (ELRA). (2018)
BASE
13
Accelerating corpus search using multiple cores
Rábara, Radoslav [Author]; Rychlý, Pavel [Author]; Herman, Ondřej [Author]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2017
DNB Subject Category Language
14
Are web corpora inferior? The Case of Czech and Slovak
Benko, Vladimír [Author]; Bański, Piotr [Editor]; Kupietz, Marc [Editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2017
DNB Subject Category Language
15
Creating CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - The National Corpus of Contemporary Welsh)
Knight, Dawn [Author]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2017
DNB Subject Category Language
16
CMC Corpora in DeReKo
Lüngen, Harald [Author] [Editor]; Kupietz, Marc [Author] [Editor]; Bański, Piotr [Editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2017
DNB Subject Category Language
17
From ICE to ICC: The new International Comparable Corpus
Kirk, John [Author]; Čermáková, Anna [Author]; Bański, Piotr [Editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2017
DNB Subject Category Language
18
Intra-connecting an exemplary literary corpus with semantic web technologies for exploratory literary studies
Dittrich, Andreas [Author]; Bański, Piotr [Editor]; Kupietz, Marc [Editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2017
DNB Subject Category Language
19
Keeping Properties with the Data CL-MetaHeaders - An Open Specification
Vidler, John [Author]; Wattam, Stephen [Author]; Bański, Piotr [Editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2017
DNB Subject Category Language
20
Removing spam from web corpora through supervised learning using FastText
Suchomel, Vít [Author]; Bański, Piotr [Editor]; Kupietz, Marc [Editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2017
DNB Subject Category Language
