3. Quality and Efficiency of Manual Annotation: Data from the Pre-annotation Bias Experiment (part of the PDT-C 2.0 project)
Source: BASE

12. RobeCzech Base

Abstract:
RobeCzech is a monolingual RoBERTa language representation model trained on Czech data. RoBERTa is a robustly optimized Transformer-based pretraining approach. We show that RobeCzech considerably outperforms equally-sized multilingual and Czech-trained contextualized language representation models, surpasses current state of the art in all five evaluated NLP tasks and reaches state-of-the-art results in four of them. The RobeCzech model is released publicly at https://hdl.handle.net/11234/1-3691 and https://huggingface.co/ufal/robeczech-base, both for PyTorch and TensorFlow.

Keywords:
BERT; Czech; Czech language; RoBERTa
URL: http://hdl.handle.net/11234/1-3691
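Since the abstract notes that the model is published on the Hugging Face Hub for both PyTorch and TensorFlow, a minimal sketch of loading it with the `transformers` library might look as follows (assuming `transformers` and PyTorch are installed; the first call downloads the model weights):

```python
# Minimal sketch: load RobeCzech from the Hugging Face Hub
# and obtain contextual embeddings for a Czech sentence.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ufal/robeczech-base")
model = AutoModel.from_pretrained("ufal/robeczech-base")

# Tokenize one sentence and run it through the encoder.
inputs = tokenizer("Ahoj, světe!", return_tensors="pt")
outputs = model(**inputs)

# One embedding vector per subword token.
print(outputs.last_hidden_state.shape)
```

The same checkpoint can be loaded into task-specific heads (e.g. `AutoModelForTokenClassification`) for fine-tuning on the downstream tasks the abstract evaluates.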
16. Czech HS Contracts Dataset (CHSC) 1.0
Szabó, Adam; Straka, Milan. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (ÚFAL), 2021

17. RobeCzech: Czech RoBERTa, a monolingual contextualized language representation model ...

18. ÚFAL at MultiLexNorm 2021: Improving Multilingual Lexical Normalization by Fine-tuning ByT5 ...

19. Morpho-syntactically annotated corpora provided for the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2)

20. Slovak MorphoDiTa Models 170914
Straka, Milan. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (ÚFAL), 2020