Hits 1 – 13 of 13

1	Improving the fusion of acoustic and text representations in RNN-T ...
	Zhang, Chao; Li, Bo; Lu, Zhiyun. - : arXiv, 2022
	BASE

2	Joint Unsupervised and Supervised Training for Multilingual ASR ...
	Bai, Junwen; Li, Bo; Zhang, Yu. - : arXiv, 2021
	BASE

3	Scaling End-to-End Models for Large-Scale Multilingual ASR ...
	Li, Bo; Pang, Ruoming; Sainath, Tara N.. - : arXiv, 2021
	BASE

4	Improving Proper Noun Recognition in End-to-End ASR By Customization of the MWER Loss Criterion ...
	Peyser, Cal; Sainath, Tara N.; Pundak, Golan. - : arXiv, 2020
	Abstract: Proper nouns present a challenge for end-to-end (E2E) automatic speech recognition (ASR) systems in that a particular name may appear only rarely during training, and may have a pronunciation similar to that of a more common word. Unlike conventional ASR models, E2E systems lack an explicit pronunciation model that can be specifically trained with proper noun pronunciations and a language model that can be trained on a large text-only corpus. Past work has addressed this issue by incorporating additional training data or additional models. In this paper, we instead build on recent advances in minimum word error rate (MWER) training to develop two new loss criteria that specifically emphasize proper noun recognition. Unlike past work on this problem, this method requires no new data during training or external models during inference. We see improvements ranging from 2% to 7% relative on several relevant benchmarks. ...
	Keyword: Audio and Speech Processing eess.AS; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Machine Learning cs.LG; Sound cs.SD
	URL: https://dx.doi.org/10.48550/arxiv.2005.09756 https://arxiv.org/abs/2005.09756
	BASE

5	Deliberation Model Based Two-Pass End-to-End Speech Recognition ...
	Hu, Ke; Sainath, Tara N.; Pang, Ruoming. - : arXiv, 2020
	BASE

6	Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model ...
	Kannan, Anjuli; Datta, Arindrima; Sainath, Tara N.. - : arXiv, 2019
	BASE

7	Contextual Speech Recognition with Difficult Negative Training Examples ...
	Alon, Uri; Pundak, Golan; Sainath, Tara N.. - : arXiv, 2018
	BASE

8	Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model ...
	Li, Bo; Sainath, Tara N.; Sim, Khe Chai. - : arXiv, 2017
	BASE

9	No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models ...
	Sainath, Tara N.; Prabhavalkar, Rohit; Kumar, Shankar. - : arXiv, 2017
	BASE

10	Multilingual Speech Recognition With A Single End-To-End Model ...
	Toshniwal, Shubham; Sainath, Tara N.; Weiss, Ron J.. - : arXiv, 2017
	BASE

11	Exemplar-based sparse representation features: from TIMIT to LVCSR
	Sainath, Tara N.; Kanevsky, Dimitri; Ramabhadran, Bhuvana...
	In: Institute of Electrical and Electronics Engineers. IEEE transactions on audio, speech and language processing. - New York, NY : Inst. 19 (2011) 8, 2598-2613
	BLLDB
	OLC Linguistik

12	Applications of broad class knowledge for noise robust speech recognition
	Sainath, Tara N. - : Massachusetts Institute of Technology, 2009
	BASE

13	Acoustic landmark detection and segmentation using the McAulay-Quatieri Sinusoidal Model
	Sainath, Tara N. - : Massachusetts Institute of Technology, 2005
	BASE

© 2013 - 2024 Lin|gu|is|tik