Hits 1 – 11 of 11

1	Unsupervised quantification of entity consistency between photos and text in real-world news ...
	Müller-Budack, Eric. - : Hannover : Institutionelles Repositorium der Leibniz Universität Hannover, 2022
	BASE

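2	Traitement neuronal des voix et familiarité : entre reconnaissance et identification du locuteur [Neural processing of voices and familiarity: between speaker recognition and speaker identification]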
	Plante-Hébert, Julien. - 2020
	BASE

3	Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-Term Memory, Deep Neural Networks
	Qiu, Liang. - : eScholarship, University of California, 2018
	In: Qiu, Liang. (2018). Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-Term Memory, Deep Neural Networks. UCLA: Electrical Engineering 0303. Retrieved from: http://www.escholarship.org/uc/item/1pz29229 (2018)
	Abstract: Non-linguistic vocalization recognition refers to the detection and classification of non-speech vocal sounds such as laughter, sneezing, coughing, crying, and screaming. It can be seen as a subtask of Acoustic Event Detection (AED). Previous research has made great progress in increasing the accuracy of AED. On the front end, many kinds of features have been explored, including Mel-Frequency Cepstral Coefficients (MFCCs), Gammatone Cepstral Coefficients (GTCCs), and other hand-crafted features. On the back end, models and methods such as Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), Bags-of-Audio-Words (BoAW), Support Vector Machines (SVMs), and various types of neural networks have been tried. Recent research on Automatic Speech Recognition (ASR) and Acoustic Scene Classification (ASC) shows the advantage of using Convolutional, Long Short-Term Memory, Deep Neural Networks (CLDNNs) for audio processing tasks. In this thesis, I build a non-linguistic vocalization recognition system using CLDNNs. Log Mel-filterbank coefficients are adopted as input features, and data augmentation methods such as random shifting and noise mixing are discussed. The system is evaluated on a custom dataset collected from several sources and tested for real-time application. The performance of CLDNNs on non-linguistic vocalization recognition is also compared with hybrid GMM-SVMs, Convolutional Neural Networks, Long Short-Term Memory networks, and a fully connected Deep Neural Network trained on VGGish embeddings. The results indicate that CLDNNs outperform the other models in classification precision and recall. Visualizations of the CLDNNs are presented to help explain the framework. The model proves accurate and fast enough for real-time applications.
	Keyword: acoustic event detection; Artificial intelligence; CLDNNs; Computer science; Electrical engineering; non-linguistic vocalization recognition
	URL: http://www.escholarship.org/uc/item/1pz29229
	BASE

4	Electrophysiological evidence for the integral nature of tone in Mandarin spoken word recognition
	Ho, Amanda. - 2015
	BASE

5	Neural responses demonstrate the dynamicity of speech perception
	Kramer, Samantha. - 2014
	BASE

6	Semantic richness effects in visual word processing
	Rabovsky, Milena. - : Humboldt-Universität zu Berlin, Lebenswissenschaftliche Fakultät, 2014
	BASE

7	Visual word recognition in dyslexia : implication of ventral and dorsal pathways ; La reconnaissance visuelle des mots chez le dyslexique : implication des voies ventrale et dorsale
	Mahé, Gwendoline. - : HAL CCSD, 2013
	In: https://tel.archives-ouvertes.fr/tel-00919475 ; Human health and pathology. Université de Strasbourg, 2013. In French. ⟨NNT : 2013STRAJ014⟩ (2013)
	BASE

8	The Dynamic Role of Subphonemic Cues in Speech Perception: Investigating Coarticulatory Processing Across Sound Classes
	Arbour, Jessica. - 2012
	BASE

9	Applications in pharmacokinetic modeling
	Arnold, Esther. - : uga, 2003
	BASE

10	Neural correlates of consciousness : empirical and conceptual questions
	Franks, Nicholas P. (contrib.); Nijhawan, Romi (contrib.); Metzinger, Thomas (ed.). - Cambridge, Mass. [etc.] : MIT Press, 2002
	BLLDB
	UB Frankfurt Linguistik

11	Time map phonology : finite state models and event logics in speech recognition
	Carson-Berndsen, Julie. - Dordrecht [etc.] : Kluwer, 1998
	BLLDB
	Institut für Empirische Sprachwissenschaft
	UB Frankfurt Linguistik

© 2013 - 2024 Lin|gu|is|tik