
Search in the Catalogues and Directories

Hits 1 – 20 of 52

1
Neural Token Segmentation for High Token-Internal Complexity ...
Brusilovsky, Idan; Tsarfaty, Reut. - : arXiv, 2022
Abstract: Tokenizing raw texts into word units is an essential pre-processing step for critical tasks in the NLP pipeline such as tagging, parsing, named entity recognition, and more. For most languages, this tokenization step is straightforward. However, for languages with high token-internal complexity, further token-to-word segmentation is required. Previous canonical segmentation studies were based on character-level frameworks, with no contextualised representation involved. Contextualised vectors à la BERT show remarkable results in many applications, but were not shown to improve performance on linguistic segmentation per se. Here we propose a novel neural segmentation model which combines the best of both worlds, contextualised token representation and char-level decoding, which is particularly effective for languages with high token-internal complexity and extreme morphological ambiguity. Our model shows substantial improvements in segmentation accuracy on Hebrew and Arabic compared to the state-of-the-art, and ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2203.10845
https://dx.doi.org/10.48550/arxiv.2203.10845
BASE
2
Morphological Reinflection with Multiple Arguments: An Extended Annotation schema and a Georgian Case Study ...
BASE
3
Exploiting emojis for abusive language detection
Wiegand, Michael [author]; Ruppenhofer, Josef [author]; Merlo, Paola [editor]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2021
DNB Subject Category Language
4
Implicitly abusive comparisons – a new dataset and linguistic analysis
Wiegand, Michael [author]; Geulig, Maja [author]; Ruppenhofer, Josef [author]. - Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2021
DNB Subject Category Language
5
Universal Dependencies 2.9
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2021
BASE
6
Universal Dependencies 2.8.1
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2021
BASE
7
Universal Dependencies 2.8
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2021
BASE
8
Minimal Supervision for Morphological Inflection ...
Goldman, Omer; Tsarfaty, Reut. - : arXiv, 2021
BASE
9
Well-Defined Morphology is Sentence-Level Morphology ...
BASE
10
Applying the Transformer to Character-level Transduction
In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (2021)
BASE
11
Telling BERT's Full Story: from Local Attention to Global Aggregation
In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (2021)
BASE
12
Disambiguatory Signals are Stronger in Word-initial Positions
In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (2021)
BASE
13
Minimal Supervision for Morphological Inflection ...
BASE
14
Asking It All: Generating Contextualized Questions for any Semantic Role ...
BASE
15
The Possible, the Plausible, and the Desirable: Event-Based Modality Detection for Language Processing ...
BASE
16
The Possible, the Plausible, and the Desirable: Event-Based Modality Detection for Language Processing ...
BASE
17
Formae reformandae: for a reorganisation of verb form annotation in Universal Dependencies illustrated by the specific case of Latin
Cecchini, Flavio Massimiliano (orcid:0000-0001-9029-1822). - Sofia, Bulgaria : Association for Computational Linguistics, 2021
BASE
18
RelWalk - A Latent Variable Model Approach to Knowledge Graph Embedding.
Bollegala, Danushka; Kawarabayashi, Ken-ichi; Yoshida, Yuichi. - : Association for Computational Linguistics, 2021
BASE
19
Dictionary-based Debiasing of Pre-trained Word Embeddings.
Bollegala, Danushka; Kaneko, Masahiro. - : Association for Computational Linguistics, 2021
BASE
20
Debiasing Pre-trained Contextualised Embeddings.
Kaneko, Masahiro; Bollegala, Danushka. - : Association for Computational Linguistics, 2021
BASE


Hits by source type:
Catalogues: 2
Bibliographies: 1
Linked Open Data catalogues: 0
Online resources: 1
Open access documents: 48