
Search in the Catalogues and Directories

Hits 1 – 11 of 11

1
An auditory saliency pooling-based LSTM model for speech intelligibility classification
Abstract: This article belongs to the Section Computer and Engineering Science and Symmetry/Asymmetry. ; Speech intelligibility is a crucial element in oral communication that can be influenced by multiple factors, such as noise, channel characteristics, or speech disorders. In this paper, we address the task of speech intelligibility classification (SIC) in the last of these circumstances. Taking our previous work, a SIC system based on an attentional long short-term memory (LSTM) network, as a starting point, we deal with the problem of inadequate learning of the attention weights due to training data scarcity. To overcome this issue, the main contribution of this paper is a novel type of weighted pooling (WP) mechanism, called saliency pooling, in which the WP weights are not learned automatically during network training but are obtained from an external source of information, Kalinli’s auditory saliency model. The aim is to take advantage of the apparent symmetry between the human auditory attention mechanism and the attentional models integrated into deep learning networks. The developed systems are assessed on the UA-speech dataset, which comprises speech uttered by subjects with several dysarthria levels. Results show that all the systems with saliency pooling significantly outperform a reference support vector machine (SVM)-based system and LSTM-based systems with mean pooling and attention pooling, suggesting that Kalinli’s saliency can be successfully incorporated into the LSTM architecture as an external cue for estimating the speech intelligibility level. ; The work leading to these results has been supported by the Spanish Ministry of Economy, Industry and Competitiveness through the TEC2017-84395-P (MINECO) and TEC2017-84593-C2-1-R (MINECO) projects (AEI/FEDER, UE), and by the Universidad Carlos III de Madrid under Strategic Action 2018/00071/001.
Keyword: Attention; Auditory saliency model; LSTM; Saliency; Speech intelligibility; Telecomunicaciones; Weighted pooling
URL: https://doi.org/10.3390/sym13091728
http://hdl.handle.net/10016/33706
BASE
2
A Comparison of Open-Source Segmentation Architectures for Dealing with Imperfect Data from the Media in Speech Synthesis
Gallardo Antolín, Ascensión; Montero, Juan Manuel; King, Simon. - : International Speech Communication Association, 2014
BASE
3
A satisfaction-based model for affect recognition from conversational features in spoken dialog systems
In: Speech communication. - Amsterdam [u.a.] : Elsevier 55 (2013) 7, 825-840
OLC Linguistik
4
I Feel You: The Design and Evaluation of a Domotic Affect-Sensitive Spoken Conversational Agent
Lutfi, Syaheerah Lebai; Fernández-Martínez, Fernando; Lorenzo-Trueba, Jaime. - : Molecular Diversity Preservation International (MDPI), 2013
BASE
5
Automatic categorization for improving Spanish into Spanish Sign Language machine translation
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 26 (2012) 3, 149-167
BLLDB
OLC Linguistik
6
Speaker diarization based on intensity channel contribution
In: Institute of Electrical and Electronics Engineers. IEEE transactions on audio, speech and language processing. - New York, NY : Inst. 19 (2011) 4, 754-761
BLLDB
OLC Linguistik
7
Analysis of Statistical Parametric and Unit Selection Speech Synthesis Systems Applied to Emotional Speech
In: ISSN: 0167-6393 ; EISSN: 1872-7182 ; Speech Communication ; https://hal.archives-ouvertes.fr/hal-00627926 ; Speech Communication, Elsevier : North-Holland, 2010, 52 (5), pp.394. ⟨10.1016/j.specom.2009.12.007⟩ (2010)
BASE
8
Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech
In: Speech communication. - Amsterdam [u.a.] : Elsevier 52 (2010) 5, 394-404
BLLDB
OLC Linguistik
9
Speech to sign language translation system for Spanish
In: Speech communication. - Amsterdam [u.a.] : Elsevier 50 (2008) 11-12, 1009-1020
BLLDB
OLC Linguistik
10
Knowledge-combining methodology for dialogue design in spoken language systems
In: International journal of speech technology. - Boston, Mass. [u.a.] : Kluwer Acad. Publ. 8 (2005) 1, 45-66
BLLDB
11
Selection of the most significant parameters for duration modelling in a Spanish text-to-speech system using neural networks
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 16 (2002) 2, 183-203
BLLDB

Hits by source type
Catalogues: 5
Bibliographies: 6
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 4
© 2013 - 2024 Lin|gu|is|tik