
Search in the Catalogues and Directories

Hits 1 – 6 of 6

1
Automatic Speech Recognition : from hybrid to end-to-end approach ; Reconnaissance automatique de la parole à large vocabulaire : des approches hybrides aux approches End-to-End
Heba, Abdelwahab. - : HAL CCSD, 2021
In: https://tel.archives-ouvertes.fr/tel-03616588 ; Artificial Intelligence [cs.AI]. Université Paul Sabatier - Toulouse III, 2021. In French. ⟨NNT : 2021TOU30116⟩ (2021)
BASE
2
Large vocabulary automatic speech recognition: from hybrid to end-to-end approaches ; Reconnaissance automatique de la parole à large vocabulaire : des approches hybrides aux approches End-to-End
Heba, Abdelwahab. - : HAL CCSD, 2021
In: https://hal.archives-ouvertes.fr/tel-03269807 ; Sound [cs.SD]. Université Toulouse 3 Paul Sabatier, 2021. In French (2021)
BASE
3
Char+CV-CTC: Combining Graphemes and Consonant/Vowel Units for CTC-Based ASR Using Multitask Learning
In: Proceedings of INTERSPEECH 2019 ; 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019) ; https://hal.archives-ouvertes.fr/hal-02419431 ; 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019), Sep 2019, Graz, Austria. pp.1611-1615 (2019)
Abstract: Previous work has shown that end-to-end neural-based speech recognition systems can be improved by adding auxiliary tasks at intermediate layers. In this paper, we report multitask learning (MTL) experiments in the context of connectionist temporal classification (CTC) based speech recognition at character level. We compare several MTL architectures that jointly learn to predict characters (sometimes called graphemes) and consonant/vowel (CV) binary labels. The best approach, which we call Char+CV-CTC, adds up the character and CV logits to obtain the final character predictions. The idea is to put more weight on the vowel (consonant) characters when the vowel (consonant) symbol ‘V’ (‘C’) is predicted in the auxiliary-task branch of the network. Experiments were carried out on the Wall Street Journal (WSJ) corpus. Char+CV-CTC achieved the best ASR results with a 2.2% Character Error Rate and a 6.1% Word Error Rate (WER) on the Eval92 evaluation subset. This model outperformed its monotask model counterpart by 0.7% absolute in WER and also achieved almost the same performance of 6.0% as a strong baseline phone-based Time Delay Neural Network (“TDNN-Phone+TR2”) model.
Keyword: [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Automatic speech recognition; Connectionist temporal classification; Multi-task learning
URL: https://hal.archives-ouvertes.fr/hal-02419431/file/heba_25028.pdf
https://hal.archives-ouvertes.fr/hal-02419431/document
https://hal.archives-ouvertes.fr/hal-02419431
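Note on the abstract above: the logit summation it describes can be illustrated with a small sketch. This is not the authors' implementation. The toy alphabet, the three-class layout of the auxiliary branch (blank, consonant, vowel), the treatment of the CTC blank, and the names `CHARS`, `CV_INDEX`, `combine_logits` and `greedy_ctc_decode` are all assumptions made for illustration. The abstract only states that character and CV logits are added to obtain the final character predictions.

```python
# Hypothetical toy alphabet; index 0 is assumed to be the CTC blank.
CHARS = ["<blank>", "a", "b", "e", "t"]
VOWELS = set("aeiouy")

def cv_class(ch):
    # Assumed auxiliary classes: 0 = blank, 1 = consonant 'C', 2 = vowel 'V'.
    if ch == "<blank>":
        return 0
    return 2 if ch in VOWELS else 1

# For every character, the index of its class in the auxiliary branch.
CV_INDEX = [cv_class(c) for c in CHARS]

def combine_logits(char_logits, cv_logits):
    """Add to each character logit the logit of that character's C/V class."""
    return [
        [c + cv_frame[CV_INDEX[i]] for i, c in enumerate(char_frame)]
        for char_frame, cv_frame in zip(char_logits, cv_logits)
    ]

def greedy_ctc_decode(logits):
    """Best-path decoding: argmax per frame, collapse repeats, drop blanks."""
    out, prev = [], None
    for frame in logits:
        idx = max(range(len(frame)), key=frame.__getitem__)
        if idx != prev and idx != 0:
            out.append(CHARS[idx])
        prev = idx
    return "".join(out)

# Three frames of made-up logits. In frame 0 the character branch slightly
# prefers 'b' over 'a', while the auxiliary branch predicts a vowel.
char_logits = [
    [0.0, 0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 2.0],
    [2.0, 0.0, 0.0, 0.0, 0.0],
]
cv_logits = [
    [0.0, 0.0, 1.0],  # vowel
    [0.0, 1.0, 0.0],  # consonant
    [1.0, 0.0, 0.0],  # blank
]

char_only = greedy_ctc_decode(char_logits)
combined = greedy_ctc_decode(combine_logits(char_logits, cv_logits))
```

In this toy input the character branch alone decodes `bt`, while the summed logits decode `at`, because the vowel logit in frame 0 raises every vowel character above `b`. The sketch covers only the combination step at prediction time, not the multitask training reported in the paper.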
BASE
4
Lexical Emphasis Detection in Spoken French using F-BANKs and neural networks
In: SLSP 2017: Statistical Language and Speech Processing ; International Conference on Statistical Language and Speech Processing (SLSP 2017) ; https://hal.archives-ouvertes.fr/hal-02559768 ; International Conference on Statistical Language and Speech Processing (SLSP 2017), Oct 2017, Le Mans, France. pp.241-249 (2017)
BASE
5
Twist your logic with TouIST
In: Proceedings of the 4th International Conference on Tools for Teaching Logic ; 4th International Congress on Tools for Teaching Logic (TTL 2015) ; https://hal.archives-ouvertes.fr/hal-01671317 ; 4th International Congress on Tools for Teaching Logic (TTL 2015), IRISA: Institut de Recherche en Informatique et Systèmes Aléatoires; INRIA, Jun 2015, Rennes, France. pp.1-8 ; http://ttl2015.irisa.fr/ (2015)
BASE
6
Twist your logic with TouIST ...
BASE

Hits by source type: Catalogues 0; Bibliographies 0; Linked Open Data catalogues 0; Online resources 0; Open access documents 6