Hits 1 – 20 of 77

1	Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset ...
	Yu, Tiezheng; Frieske, Rita; Xu, Peng; Cahyawijaya, Samuel; Yiu, Cheuk Tung Shadow; Lovenia, Holy; Dai, Wenliang; Barezi, Elham J.; Chen, Qifeng; Ma, Xiaojuan; Shi, Bertram E.; Fung, Pascale. - : arXiv, 2022
	Abstract: Automatic speech recognition (ASR) on low resource languages improves the access of linguistic minorities to technological advantages provided by artificial intelligence (AI). In this paper, we address the problem of data scarcity for the Hong Kong Cantonese language by creating a new Cantonese dataset. Our dataset, Multi-Domain Cantonese Corpus (MDCC), consists of 73.6 hours of clean read speech paired with transcripts, collected from Cantonese audiobooks from Hong Kong. It comprises philosophy, politics, education, culture, lifestyle and family domains, covering a wide range of topics. We also review all existing Cantonese datasets and analyze them according to their speech type, data source, total size and availability. We further conduct experiments with Fairseq S2T Transformer, a state-of-the-art ASR model, on the biggest existing dataset, Common Voice zh-HK, and our proposed MDCC, and the results show the effectiveness of our dataset. In addition, we create a powerful and robust Cantonese ASR model by ...
	Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
	URL: https://dx.doi.org/10.48550/arxiv.2201.02419 https://arxiv.org/abs/2201.02419
	BASE

2	ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation ...
	Lovenia, Holy; Cahyawijaya, Samuel; Winata, Genta Indra. - : arXiv, 2021
	BASE

3	Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation ...
	Liu, Zihan; Winata, Genta Indra; Fung, Pascale. - : arXiv, 2021
	BASE

4	Language Models are Few-shot Multilingual Learners ...
	Winata, Genta Indra; Madotto, Andrea; Lin, Zhaojiang. - : arXiv, 2021
	BASE

5	BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling ...
	Lin, Zhaojiang; Madotto, Andrea; Winata, Genta Indra. - : arXiv, 2021
	BASE

6	Are Multilingual Models Effective in Code-Switching? ...
	Winata, Genta Indra; Cahyawijaya, Samuel; Liu, Zihan. - : arXiv, 2021
	BASE

7	IndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; ., Sebastian; Bahar, Syafri. - : Underline Science Inc., 2021
	BASE

8	Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Dai, Wenliang; Fung, Pascale. - : Underline Science Inc., 2021
	BASE

9	Zero-Shot Dialogue State Tracking via Cross-Task Transfer ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Cho, Eunjoon; Crook, Paul. - : Underline Science Inc., 2021
	BASE

10	XPersona: Evaluating Multilingual Personalized Chatbot ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; ., Zihan; Bang, Yejin. - : Underline Science Inc., 2021
	BASE

11	Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data ...
	Ko, Wei-Jen; El-Kishky, Ahmed; Renduchintala, Adithya. - : arXiv, 2021
	BASE

12	Learning Fast Adaptation on Cross-Accented Speech Recognition ...
	Winata, Genta Indra; Cahyawijaya, Samuel; Liu, Zihan. - : arXiv, 2020
	BASE

13	Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning ...
	Liu, Zihan; Winata, Genta Indra; Madotto, Andrea. - : arXiv, 2020
	BASE

14	XPersona: Evaluating Multilingual Personalized Chatbot ...
	Lin, Zhaojiang; Liu, Zihan; Winata, Genta Indra. - : arXiv, 2020
	BASE

15	Meta-Transfer Learning for Code-Switched Speech Recognition ...
	Winata, Genta Indra; Cahyawijaya, Samuel; Lin, Zhaojiang. - : arXiv, 2020
	BASE

16	On the Importance of Word Order Information in Cross-lingual Sequence Labeling ...
	Liu, Zihan; Winata, Genta Indra; Cahyawijaya, Samuel. - : arXiv, 2020
	BASE

17	Multilingual and Interlingual Semantic Representations for Natural Language Processing: A Brief Introduction
	Costa-jussà, Marta R.; España-Bonet, Cristina; Fung, Pascale...
	In: Computational Linguistics, Vol 46, Iss 2, Pp 249-255 (2020)
	BASE

18	Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems ...
	Liu, Zihan; Winata, Genta Indra; Lin, Zhaojiang. - : arXiv, 2019
	BASE

19	Zero-shot Cross-lingual Dialogue Systems with Transferable Latent Variables ...
	Liu, Zihan; Shin, Jamin; Xu, Yan. - : arXiv, 2019
	BASE

20	Towards Universal End-to-End Affect Recognition from Multilingual Speech by ConvNets ...
	Bertero, Dario; Kampman, Onno; Fung, Pascale. - : arXiv, 2019
	BASE
