1. A Comparison of Hybrid and End-to-End ASR Systems for the IberSpeech-RTVE 2020 Speech-to-Text Transcription Challenge
In: Applied Sciences; Volume 12; Issue 2; Pages: 903 (2022)
Abstract:
This paper describes a comparison between hybrid and end-to-end Automatic Speech Recognition (ASR) systems, which were evaluated on the IberSpeech-RTVE 2020 Speech-to-Text Transcription Challenge. Deep Neural Networks (DNNs) are becoming the most promising technology for ASR at present. In the last few years, traditional hybrid models have been evaluated and compared to other end-to-end ASR systems in terms of accuracy and efficiency. We contribute two different approaches: a hybrid ASR system based on a DNN-HMM and two state-of-the-art end-to-end ASR systems, based on Lattice-Free Maximum Mutual Information (LF-MMI). To address the high difficulty in the speech-to-text transcription of recordings with different speaking styles and acoustic conditions from TV studios to live recordings, data augmentation and Domain Adversarial Training (DAT) techniques were studied. Multi-condition data augmentation applied to our hybrid DNN-HMM demonstrated WER improvements in noisy scenarios (about 10% relative). In contrast, the results obtained using an end-to-end PyChain-based ASR system were far from our expectations. Nevertheless, we found that when including DAT techniques, a relative WER improvement of 2.87% was obtained as compared to the PyChain-based system.
Keywords:
ASR systems; domain adversarial training; end-to-end deep learning; hybrid DNN-HMM; TV show speech-to-text transcription
URL: https://doi.org/10.3390/app12020903
BASE

2. What Do We Know about Hybrid Regimes after Two Decades of Scholarship?
In: Politics and Governance ; 6 ; 2 ; 112-119 ; Authoritarianism in the 21st Century (2022)
BASE

3. Automatic Speech Recognition : from hybrid to end-to-end approach ; Reconnaissance automatique de la parole à large vocabulaire : des approches hybrides aux approches End-to-End
In: https://tel.archives-ouvertes.fr/tel-03616588 ; Intelligence artificielle [cs.AI]. Université Paul Sabatier - Toulouse III, 2021. Français. ⟨NNT : 2021TOU30116⟩ (2021)
BASE

4. Large vocabulary automatic speech recognition: from hybrid to end-to-end approaches ; Reconnaissance automatique de la parole à large vocabulaire : des approches hybrides aux approches End-to-End
In: https://hal.archives-ouvertes.fr/tel-03269807 ; Son [cs.SD]. Université Toulouse 3 Paul Sabatier, 2021. Français (2021)
BASE

5. Implementing an Intelligent Collaborative Agent as Teammate in Collaborative Writing: toward a Synergy of Humans and AI
BASE

6. Script optimization for TTS voice corpus design in audio-book generation ; Optimisation de script pour la conception de corpus vocaux de TTS dans la génération de livres audio
In: https://tel.archives-ouvertes.fr/tel-03270968 ; Computation and Language [cs.CL]. Université Rennes 1, 2020. English. ⟨NNT : 2020REN1S107⟩ (2020)
BASE

7. 'n Model vir 'n aanlyn GIS-vakwoordeboek
In: Lexikos, Vol 30, Pp 499-518 (2020)
BASE

8. When humans and machines collaborate: Cross-lingual Label Editing in Wikidata ...
BASE

9. When humans and machines collaborate: Cross-lingual Label Editing in Wikidata ...
BASE

10. An Ontological Driven Approach of HAD Specific Language Designing
In: https://hal.archives-ouvertes.fr/hal-01309151 ; 2016 (2016)
BASE

11. Evaluating the impact of using a domain-specific bilingual lexicon on the performance of a hybrid machine translation approach
In: 10th International Conference on Recent Advances in Natural Language Processing, RANLP 201 ; https://hal-cea.archives-ouvertes.fr/cea-01844051 ; 10th International Conference on Recent Advances in Natural Language Processing, RANLP 201, Sep 2015, Hissar, Bulgaria. pp.579-587 (2015)
BASE

12. Projetos sobre tradução automática do português no laboratório de sistemas de língua falada do INESC-ID
BASE

13. Interaction between Linguists and Machine Learning ; L'interaction entre linguistes et apprentissage automatique
In: https://hal.archives-ouvertes.fr/hal-01118870 ; 2014 (2014)
BASE

14. Modelo de controlo difuso de um sistema de produção de energia com base em recursos renováveis
BASE

15. Enhancing a Rule-Based MT System with Cross-Lingual WSD
In: Rudnick, Alex; Rios, Annette; Gasser, Michael (2014). Enhancing a Rule-Based MT System with Cross-Lingual WSD. In: SaLTMiL Workshop on free/open-source language resources for the machine translation of less-resourced languages (LREC'14), Reykjavik, Iceland, 22 May 2014. SALTMIL, 31-36. (2014)
BASE

16. Hibridación de sistemas borrosos para el modelado y control ; Hybridization of fuzzy systems for modeling and control
BASE

17. A HYBRID FUZZY/GENETIC ALGORITHM FOR INTRUSION DETECTION IN RFID SYSTEMS
BASE

18. A HYBRID FUZZY/GENETIC ALGORITHM FOR INTRUSION DETECTION IN RFID SYSTEMS
BASE

19. A random forest system combination approach for error detection in digital dictionaries
BASE

20. A Hybrid Approach for QA Track Definitional Questions
In: DTIC (2006)
BASE