2 | Language model integration based on memory control for sequence to sequence speech recognition ...
3 | Transfer learning of language-independent end-to-end ASR with language model fusion ...
4 | Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling ...

Cho, Jaejin; Baskar, Murali Karthick; Li, Ruizhi; Wiesner, Matthew; Mallidi, Sri Harish; Yalta, Nelson; Karafiat, Martin; Watanabe, Shinji; Hori, Takaaki. arXiv, 2018
Abstract:
The sequence-to-sequence (seq2seq) approach to low-resource ASR is a relatively new direction in speech research. The approach benefits from performing model training without a lexicon or alignments. However, this poses a new problem of requiring more data than conventional DNN-HMM systems. In this work, we use data from 10 BABEL languages to build a multilingual seq2seq model as a prior model, and then port it to 4 other BABEL languages using a transfer learning approach. We also explore different architectures for improving the prior multilingual seq2seq model. The paper also discusses the effect of integrating a recurrent neural network language model (RNNLM) with the seq2seq model during decoding. Experimental results show that transfer learning from the multilingual model yields substantial gains over monolingual models across all 4 BABEL languages. Incorporating an RNNLM also brings significant improvements in %WER, and achieves recognition performance ...
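The RNNLM integration during decoding that the abstract describes is commonly realized as shallow fusion: the seq2seq decoder's and the language model's token log-probabilities are combined log-linearly while expanding beam-search hypotheses. The sketch below illustrates that scoring step only, under stated assumptions; the interfaces seq2seq_logp and rnnlm_logp, and the weight value, are hypothetical stand-ins, not taken from the paper.

import math

LM_WEIGHT = 0.3  # hypothetical LM interpolation weight; in practice tuned on a dev set

def fused_scores(seq2seq_logp, rnnlm_logp, prefix):
    # Score each candidate next token as log p_s2s + LM_WEIGHT * log p_lm.
    s2s = seq2seq_logp(prefix)  # assumed: token -> log-prob from the seq2seq decoder
    lm = rnnlm_logp(prefix)     # assumed: token -> log-prob from the RNNLM
    return {tok: lp + LM_WEIGHT * lm.get(tok, -math.inf) for tok, lp in s2s.items()}

# Toy distributions standing in for real models:
s2s = lambda prefix: {"a": math.log(0.6), "b": math.log(0.4)}
lm = lambda prefix: {"a": math.log(0.3), "b": math.log(0.7)}
print(max(fused_scores(s2s, lm, ["<s>"]).items(), key=lambda kv: kv[1]))

In a full beam search this scoring would be applied at every expansion step, so the LM can rescue hypotheses the acoustic model alone would prune.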
Keywords:
Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); FOS: Computer and information sciences; FOS: Electrical engineering, electronic engineering, information engineering; Machine Learning (cs.LG); Sound (cs.SD)
URL: https://arxiv.org/abs/1810.03459 https://dx.doi.org/10.48550/arxiv.1810.03459