1. Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset ...
2. ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation ...
Lovenia, Holy; Cahyawijaya, Samuel; Winata, Genta Indra; Xu, Peng; Yan, Xu; Liu, Zihan; Frieske, Rita; Yu, Tiezheng; Dai, Wenliang; Barezi, Elham J.; Chen, Qifeng; Ma, Xiaojuan; Shi, Bertram E.; Fung, Pascale. arXiv, 2021
Abstract: Code-switching is a speech phenomenon occurring when a speaker switches language during a conversation. Despite the spontaneous nature of code-switching in conversational spoken language, most existing works collect code-switching data from read speech instead of spontaneous speech. ASCEND (A Spontaneous Chinese-English Dataset) is a high-quality Mandarin Chinese-English code-switching corpus built on spontaneous multi-turn conversational dialogue sources collected in Hong Kong. We report ASCEND's design and procedure for collecting the speech data, including annotations. ASCEND consists of 10.62 hours of clean speech, collected from 23 bilingual speakers of Chinese and English. Furthermore, we conduct baseline experiments using pre-trained wav2vec 2.0 models, achieving a best performance of 22.69% character error rate and 27.05% mixed error rate. ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2112.06223 https://dx.doi.org/10.48550/arxiv.2112.06223
3. Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation ...
5. BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling ...
7. IndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation ...
8. Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization ...
9. Zero-Shot Dialogue State Tracking via Cross-Task Transfer ...
11. Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data ...
12. Learning Fast Adaptation on Cross-Accented Speech Recognition ...
13. Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning ...
15. Meta-Transfer Learning for Code-Switched Speech Recognition ...
16. On the Importance of Word Order Information in Cross-lingual Sequence Labeling ...
17. Multilingual and Interlingual Semantic Representations for Natural Language Processing: A Brief Introduction
In: Computational Linguistics, Vol 46, Iss 2, Pp 249-255 (2020)
18. Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems ...
19. Zero-shot Cross-lingual Dialogue Systems with Transferable Latent Variables ...
20. Towards Universal End-to-End Affect Recognition from Multilingual Speech by ConvNets ...