1 |
Enhancing Sequence-to-Sequence Neural Lemmatization with External Resources ...
|
|
|
|
BASE
|
|
|
|
3 |
EstBERT: A Pretrained Language-Specific BERT for Estonian ...
|
|
|
|
Abstract:
This paper presents EstBERT, a large pretrained transformer-based language-specific BERT model for Estonian. Recent work has evaluated multilingual BERT models on Estonian tasks and found them to outperform the baselines. Still, based on existing studies on other languages, a language-specific BERT model is expected to improve over the multilingual ones. We first describe the EstBERT pretraining process and then present the results of models based on finetuned EstBERT for multiple NLP tasks, including POS and morphological tagging, named entity recognition and text classification. The evaluation results show that the models based on EstBERT outperform multilingual BERT models on five tasks out of six, providing further evidence for the view that training language-specific BERT models is still useful, even when multilingual models are available. ... : NoDaLiDa 2021 ...
|
|
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
|
|
URL: https://dx.doi.org/10.48550/arxiv.2011.04784 https://arxiv.org/abs/2011.04784
|
|
BASE
|
|
|
|
4 |
STransE: a novel embedding model of entities and relationships in knowledge bases ...
|
|
|
|
BASE
|
|
|
|
5 |
STransE : a novel embedding model of entities and relationships in knowledge bases
|
|
|
|
BASE
|
|
|
|
6 |
Query-based single document summarization using an Ensemble Noisy Auto-Encoder
|
|
|
|
BASE
|
|
|
|
7 |
POS induction with distributional and morphological information using a distance-dependent Chinese Restaurant Process
|
|
|
|
BASE
|
|
|
|
8 |
Minimally-supervised morphological segmentation using adaptor grammars
|
|
|
|
BASE
|
|
|
|
9 |
Noisy-channel spelling correction models for Estonian learner language corpus lemmatisation
|
|
|
|
BASE
|
|
|
|
10 |
A Hierarchical dirichlet process model for joint part-of-speech and morphology induction
|
|
|
|
BASE
|
|
|
|
11 |
Korpuste tükeldamine : rakendusi silpide ning allkeeltega ; Cutting the text corpora : applications with syllables and sub-languages
|
|
|
|
BASE
|
|
|
|
12 |
Eesti silbisüsteemi struktuurist ; A preliminary structural view of the Estonian syllable system
|
|
|
|
BASE
|
|
|
|
|
|