2 | Language model integration based on memory control for sequence to sequence speech recognition ...
3 | Transfer learning of language-independent end-to-end ASR with language model fusion ...
4 | Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling ...

Cho, Jaejin; Baskar, Murali Karthick; Li, Ruizhi; Wiesner, Matthew; Mallidi, Sri Harish; Yalta, Nelson; Karafiat, Martin; Watanabe, Shinji; Hori, Takaaki. arXiv, 2018
Abstract:
The sequence-to-sequence (seq2seq) approach to low-resource ASR is a relatively new direction in speech research. The approach benefits from performing model training without a lexicon or alignments. However, this poses a new problem of requiring more data than conventional DNN-HMM systems. In this work, we use data from 10 BABEL languages to build a multilingual seq2seq model as a prior model, and then port it to 4 other BABEL languages using a transfer learning approach. We also explore different architectures for improving the prior multilingual seq2seq model. The paper also discusses the effect of integrating a recurrent neural network language model (RNNLM) with the seq2seq model during decoding. Experimental results show that transfer learning from the multilingual model yields substantial gains over monolingual models across all 4 BABEL languages. Incorporating an RNNLM also brings significant improvements in %WER, and achieves recognition performance ...
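The RNNLM integration during decoding that the abstract describes is commonly realized as shallow fusion: the seq2seq decoder's and the language model's token log-probabilities are combined log-linearly while expanding beam-search hypotheses. The sketch below illustrates that scoring step only, under stated assumptions; the interfaces seq2seq_logp and rnnlm_logp, and the weight value, are hypothetical stand-ins, not taken from the paper.

import math

LM_WEIGHT = 0.3  # hypothetical LM interpolation weight; in practice tuned on a dev set

def fused_scores(seq2seq_logp, rnnlm_logp, prefix):
    # Score each candidate next token as log p_s2s + LM_WEIGHT * log p_lm.
    s2s = seq2seq_logp(prefix)  # assumed: token -> log-prob from the seq2seq decoder
    lm = rnnlm_logp(prefix)     # assumed: token -> log-prob from the RNNLM
    return {tok: lp + LM_WEIGHT * lm.get(tok, -math.inf) for tok, lp in s2s.items()}

# Toy distributions standing in for real models:
s2s = lambda prefix: {"a": math.log(0.6), "b": math.log(0.4)}
lm = lambda prefix: {"a": math.log(0.3), "b": math.log(0.7)}
print(max(fused_scores(s2s, lm, ["<s>"]).items(), key=lambda kv: kv[1]))

In a full beam search this scoring would be applied at every expansion step, so the LM can rescue hypotheses the acoustic model alone would prune.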
Keywords:
Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); FOS: Computer and information sciences; FOS: Electrical engineering, electronic engineering, information engineering; Machine Learning (cs.LG); Sound (cs.SD)
URL: https://arxiv.org/abs/1810.03459 https://dx.doi.org/10.48550/arxiv.1810.03459