
Search in the Catalogues and Directories

Hits 81–100 of 13,783

81
Adapting BigScience Multilingual Model to Unseen Languages ...
BASE
82
On Efficiently Acquiring Annotations for Multilingual Models ...
BASE
83
Team ÚFAL at CMCL 2022 Shared Task: Figuring out the correct recipe for predicting Eye-Tracking features using Pretrained Language Models ...
BASE
84
Does Corpus Quality Really Matter for Low-Resource Languages? ...
BASE
85
IIITDWD-ShankarB@ Dravidian-CodeMixi-HASOC2021: mBERT based model for identification of offensive content in south Indian languages ...
Biradar, Shankar; Saumya, Sunil. - : arXiv, 2022
BASE
86
mSLAM: Massively multilingual joint pre-training for speech and text ...
Bapna, Ankur; Cherry, Colin; Zhang, Yu. - : arXiv, 2022
BASE
87
On the Representation Collapse of Sparse Mixture of Experts ...
Abstract: Sparse mixture of experts provides larger model capacity while requiring a constant computational overhead. It employs the routing mechanism to distribute input tokens to the best-matched experts according to their hidden representations. However, learning such a routing mechanism encourages token clustering around expert centroids, implying a trend toward representation collapse. In this work, we propose to estimate the routing scores between tokens and experts on a low-dimensional hypersphere. We conduct extensive experiments on cross-lingual language model pre-training and fine-tuning on downstream tasks. Experimental results across seven multilingual benchmarks show that our method achieves consistent gains. We also present a comprehensive analysis on the representation and routing behaviors of our models. Our method alleviates the representation collapse issue and achieves more consistent routing than the baseline mixture-of-experts methods. ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences; Machine Learning cs.LG
URL: https://dx.doi.org/10.48550/arxiv.2204.09179
https://arxiv.org/abs/2204.09179
BASE
88
Politics and Virality in the Time of Twitter: A Large-Scale Cross-Party Sentiment Analysis in Greece, Spain and United Kingdom ...
BASE
89
L3Cube-MahaHate: A Tweet-based Marathi Hate Speech Detection Dataset and BERT models ...
BASE
90
Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts ...
BASE
91
A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model ...
Sun, Xin; Ge, Tao; Ma, Shuming. - : arXiv, 2022
BASE
92
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers ...
Lees, Alyssa; Tran, Vinh Q.; Tay, Yi. - : arXiv, 2022
BASE
93
Factual Consistency of Multilingual Pretrained Language Models ...
BASE
94
Examining Scaling and Transfer of Language Model Architectures for Machine Translation ...
BASE
95
MuMiN: A Large-Scale Multilingual Multimodal Fact-Checked Misinformation Social Network Dataset ...
BASE
96
Mono vs Multilingual BERT for Hate Speech Detection and Text Classification: A Case Study in Marathi ...
BASE
97
Agreement ...
Tal, Shira. - : Open Science Framework, 2022
BASE
98
Agreement ...
Tal, Shira. - : Open Science Framework, 2022
BASE
99
Natural Language Descriptions of Deep Visual Features ...
BASE
100
From Examples to Rules: Neural Guided Rule Synthesis for Information Extraction ...
BASE

© 2013 - 2024 Lin|gu|is|tik