
Search in the Catalogues and Directories

Hits 1 – 20 of 87

1
Об особенностях развития речи у детей с нарушением слухового восприятия ... : On peculiar features of speech development in children with auditory perceptual disorders ...
Vizel, Tatyana Grigoryevna; Klevtsova, Svetlana Vyacheslavovna; Zaitseva, Svetlana Aleksandrovna. - : Ural State Pedagogical University, 2019
BASE
2
Sparsity Motivated Auditory Wavelet Representation and Blind Deconvolution
Abstract: In many scenarios, events such as singularities and transients that carry important information about a signal undergo spreading during acquisition or transmission and it is important to localize the events. For example, edges in an image, point sources in a microscopy or astronomical image are blurred by the point-spread function (PSF) of the acquisition system, while in a speech signal, the epochs corresponding to glottal closure instants are shaped by the vocal tract response. Such events can be extracted with the help of techniques that promote sparsity, which enables separation of the smooth components from the transient ones. In this thesis, we consider development of such sparsity promoting techniques. The contributions of the thesis are three-fold: (i) an auditory-motivated continuous wavelet design and representation, which helps identify singularities; (ii) a sparsity-driven deconvolution technique; and (iii) a sparsity-driven deconvolution technique for reconstruction of finite-rate-of-innovation (FRI) signals. We use the speech signal for illustrating the performance of the techniques in the first two parts and super-resolution microscopy (2-D) for the third part. In the first part, we develop a continuous wavelet transform (CWT) starting from an auditory motivation. Wavelet analysis provides good time and frequency localization, which has made it a popular tool for time-frequency analysis of signals. The CWT is a multiresolution analysis tool that involves decomposition of a signal using a constant-Q wavelet filterbank, akin to the time-frequency analysis performed by the basilar membrane in the peripheral human auditory system. This connection motivated us to develop wavelets that possess auditory localization capabilities. Gammatone functions are extensively used in the modeling of the basilar membrane, but the non-zero average of the functions poses a hurdle.
We construct bona fide wavelets from the Gammatone function, called Gammatone wavelets, and analyze their properties such as admissibility, time-bandwidth product, vanishing moments, etc. Of particular interest is the vanishing moments property, which enables the wavelet to suppress smooth regions in a signal, leading to sparsification. We show how this property of the Gammatone wavelets coupled with multiresolution analysis could be employed for singularity and transient detection. Using these wavelets, we also construct equivalent filterbank models and obtain cepstral feature vectors out of such a representation. We show that the Gammatone wavelet cepstral coefficients (GWCC) are effective for robust speech recognition compared with mel-frequency cepstral coefficients (MFCC). In the second part, we consider the problem of sparse blind deconvolution (SBD) starting from a signal obtained as the convolution of an unknown PSF and a sparse excitation. The BD problem is ill-posed and the goal is to employ sparsity to come up with an accurate solution. We formulate the SBD problem within a Bayesian framework. The estimation of filter and excitation involves optimization of a cost function that consists of an ℓ2 data-fidelity term and an ℓp-norm (p ∈ [0, 1]) regularizer, as the sparsity promoting prior. Since the ℓp-norm is not differentiable at the origin, we consider a smoothed version of the ℓp-norm as a proxy in the optimization. Apart from the regularizer being non-convex, the data term is also non-convex in the filter and excitation as they are both unknown. We optimize the non-convex cost using an alternating minimization strategy, and develop an alternating ℓp-ℓ2 projections algorithm (ALPA).
We demonstrate convergence of the iterative algorithm and analyze in detail the role of the pseudo-inverse solution as an initialization for the ALPA and provide probabilistic bounds on its accuracy considering the presence of noise and the condition number of the linear system of equations. We also consider the case of bounded noise and derive tight tail bounds using the Hoeffding inequality. As an application, we consider the problem of blind deconvolution of speech signals. In the linear model for speech production, voiced speech is assumed to be the result of a quasi-periodic impulse train exciting a vocal-tract filter. The locations of the impulses or epochs indicate the glottal closure instants and the spacing between them the pitch. Hence, the excitation in the case of voiced speech is sparse and its deconvolution from the vocal-tract filter is posed as an SBD problem. We employ ALPA for SBD and show that the excitation obtained is sparser than the excitations obtained using sparse linear prediction, the smoothed ℓ1/ℓ2 sparse blind deconvolution algorithm, and majorization-minimization-based sparse deconvolution techniques. We also consider the problem of epoch estimation and show that epochs estimated by ALPA in both clean and noisy conditions are closer to the instants indicated by the electroglottograph when compared with the estimates provided by the zero-frequency filtering technique, which is the state-of-the-art epoch estimation technique. In the third part, we consider the problem of deconvolution of a specific class of continuous-time signals called finite-rate-of-innovation (FRI) signals, which are not bandlimited, but specified by a finite number of parameters over an observation interval. The signal is assumed to be a linear combination of delayed versions of a prototypical pulse. The reconstruction problem is posed as a 2-D SBD problem. The kernel is assumed to have a known form but with unknown parameters.
Given the sampled version of the FRI signal, the delays quantized to the nearest point on the sampling grid are first estimated using a proximal-operator-based alternating ℓp-ℓ2 algorithm (ALPAprox), and then super-resolved to obtain off-grid (OG) estimates using gradient-descent optimization. The overall technique is termed OG-ALPAprox. We show application of OG-ALPAprox to a particular modality of super-resolution microscopy (SRM), called stochastic optical reconstruction microscopy (STORM). The resolution of the traditional optical microscope is limited by diffraction and is termed Abbe's limit. The goal of SRM is to engineer the optical imaging system to resolve structures in specimens, such as proteins, whose dimensions are smaller than the diffraction limit. The specimen to be imaged is tagged or labeled with light-emitting or fluorescent chemical compounds called fluorophores. These compounds specifically bind to proteins and exhibit fluorescence upon excitation. The fluorophores are assumed to be point sources and the light emitted by them undergoes spreading due to diffraction. STORM employs a sequential approach, wherein in each step only a few fluorophores are randomly excited and the image is captured by a sensor array. The obtained image is diffraction-limited; however, the separation between the fluorophores allows for localizing the point sources with high precision. The localization is performed using Gaussian peak-fitting. This process of random excitation coupled with localization is performed sequentially and subsequently consolidated to obtain a high-resolution image. We pose the localization as an SBD problem and employ OG-ALPAprox to estimate the locations. We also report comparisons with the de facto standard Gaussian peak-fitting algorithm and show that the statistical performance is superior. Experimental results on real data show that the reconstruction quality is on par with the Gaussian peak-fitting.
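Illustrative note (not part of the catalogue record): the abstract describes a cost made of an ℓ2 data-fidelity term plus a smoothed ℓp regularizer on the excitation. The sketch below shows one way such a cost could be evaluated in 1-D. The smoothing form (x² + eps)^(p/2), the weight `lam`, and all function names are assumptions chosen for illustration; they are not taken from the thesis, and the ALPA iterations themselves are not implemented here.

```python
def convolve(h, x):
    """Full linear convolution of two sequences (plain Python, no dependencies)."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            out[i + j] += hi * xj
    return out


def smoothed_lp(x, p=0.5, eps=1e-6):
    """Assumed smoothed lp penalty: sum over i of (x_i^2 + eps)^(p/2).

    For eps > 0 the penalty is differentiable at the origin.
    """
    return sum((xi * xi + eps) ** (p / 2.0) for xi in x)


def sbd_cost(y, h, x, lam=0.1, p=0.5, eps=1e-6):
    """Squared l2 data-fidelity term plus lam times the smoothed lp penalty.

    y is expected to have length len(h) + len(x) - 1 (full convolution).
    """
    residual = [yi - ri for yi, ri in zip(y, convolve(h, x))]
    return sum(r * r for r in residual) + lam * smoothed_lp(x, p, eps)


if __name__ == "__main__":
    h = [1.0, 0.5]                  # hypothetical filter
    x_sparse = [1.0, 0.0, 0.0, 0.0]  # one impulse, unit energy
    x_dense = [0.5, 0.5, 0.5, 0.5]   # same energy, spread over all samples
    y = convolve(h, x_sparse)
    print(sbd_cost(y, h, x_sparse))
    print(smoothed_lp(x_sparse), smoothed_lp(x_dense))
```

With p < 1, the penalty of the single-impulse excitation is smaller than that of the dense excitation of equal energy, which is the sense in which the regularizer favours sparse solutions.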
Keyword: Auditory System Modeling; Auditory Wavelet Representation; Blind Deconvolution; Electrical Engineering; Gammatone Wavelet Cepstral Coefficients (GWCC); Gammatone Wavelet Transform; Sparse Blind Deconvolution (SBD); Sparse Coding; Time Frequency Representation; Voiced Speech Signals
URL: http://etd.iisc.ac.in/handle/2005/4009
http://etd.iisc.ernet.in/abstracts/4875/G28512-Abs.pdf
BASE
3
Spatio-Temporal Progression of Cortical Activity Related to Continuous Overt and Covert Speech Production in a Reading Task
Gunduz, Aysegul; Schalk, Gerwin; Ritaccio, Anthony L.. - : Public Library of Science, 2016
BASE
4
Silent Spatialized Communication Among Dispersed Forces
In: DTIC (2015)
BASE
5
Neural Encoding of Complex Signals in the Healthy and Impaired Auditory Systems
In: Open Access Dissertations (2013)
BASE
6
Familiar Speaker Recognition
In: DTIC (2012)
BASE
7
Machine Recognition vs Human Recognition of Voices
In: DTIC (2012)
BASE
8
Compressed Domain Automatic Level Control Based on ITU-T G.722.2
In: DTIC (2012)
BASE
9
Duration of auditory sensory memory in parents of children with SLI: A mismatch negativity study
In: BRAIN LANG, 104 (1) 75-88. (2008)
BASE
10
A Computational Auditory Scene Analysis System for Speech Segregation and Robust Speech Recognition
BASE
11
Informational and Energetic Masking Effects in Multitalker Speech Perception
In: DTIC (2006)
BASE
12
Across-ear Interference from Parametrically Degraded Synthetic Speech Signals in a Dichotic Cocktail-party Listening Task
In: DTIC AND NTIS (2005)
BASE
13
Study of Acoustic Features of Newborn Cries that Correlate with the Context
In: DTIC AND NTIS (2001)
BASE
14
Auditory Features Underlying Cross-Language Human Capabilities in Stop Consonant Discrimination
In: DTIC (2000)
BASE
15
Auditory Modeling for Noisy Speech Recognition.
In: DTIC AND NTIS (2000)
BASE
16
Communication and Localization with Hearing Protectors
In: DTIC AND NTIS (2000)
BASE
17
The effects of symmetrical and asymmetrical sensorineural hearing loss on speech perception in noise
BASE
18
COMINT Audio Interface
In: DTIC AND NTIS (1999)
BASE
19
Spatial Audio Displays for Speech Communications: A Comparison of Free Field and Virtual Acoustic Environments
In: DTIC AND NTIS (1999)
BASE
20
QoS Based Evaluation of the Berkeley Continuous Media Toolkit
In: DTIC AND NTIS (1999)
BASE


Hits by source type:
Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 87
© 2013 - 2024 Lin|gu|is|tik