8 |
Prague Dependency Treebank -- Consolidated 1.0 ...
|
|
|
|
Abstract:
We present a richly annotated and genre-diversified language resource, the Prague Dependency Treebank-Consolidated 1.0 (PDT-C 1.0), the purpose of which is - as it always been the case for the family of the Prague Dependency Treebanks - to serve both as a training data for various types of NLP tasks as well as for linguistically-oriented research. PDT-C 1.0 contains four different datasets of Czech, uniformly annotated using the standard PDT scheme (albeit not everything is annotated manually, as we describe in detail here). The texts come from different sources: daily newspaper articles, Czech translation of the Wall Street Journal, transcribed dialogs and a small amount of user-generated, short, often non-standard language segments typed into a web translator. Altogether, the treebank contains around 180,000 sentences with their morphological, surface and deep syntactic annotation. The diversity of the texts and annotations should serve well the NLP applications as well as it is an invaluable resource for ... : Accepted at LREC 2020 (Proceedings of Language Resources and Evaluation, Marseille, France) ...
|
|
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
|
|
URL: https://arxiv.org/abs/2006.03679 https://dx.doi.org/10.48550/arxiv.2006.03679
|
|
BASE
|
|
Hide details
|
|
11 |
Universal Dependencies 2.2
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-01930733 ; 2018 (2018)
|
|
BASE
|
|
Show details
|
|
14 |
Universal Dependencies 2.1
|
|
|
|
In: https://hal.inria.fr/hal-01682188 ; 2017 (2017)
|
|
BASE
|
|
Show details
|
|
15 |
Universal Dependencies 2.0 – CoNLL 2017 Shared Task Development and Test Data
|
|
|
|
BASE
|
|
Show details
|
|
19 |
CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
|
|
|
|
BASE
|
|
Show details
|
|
|
|