
Search in the Catalogues and Directories

Hits 1 – 20 of 79

1
Evaluation of Tacotron Based Synthesizers for Spanish and Basque
In: Applied Sciences; Volume 12; Issue 3; Pages: 1686 (2022)
BASE
2
CCG Supertagging as Top-down Tree Generation
In: Proceedings of the Society for Computation in Linguistics (2021)
BASE
3
Vowel Harmony Viewed as Error-Correcting Code
In: Proceedings of the Society for Computation in Linguistics (2021)
BASE
4
Generating Adversarial Examples for Topic-dependent Argument Classification
In: COMMA 2020 - 8th International Conference on Computational Models of Argument ; https://hal.archives-ouvertes.fr/hal-02933266 ; COMMA 2020 - 8th International Conference on Computational Models of Argument, Sep 2020, Perugia, Italy (2020)
BASE
5
Complexity of Stability ...
Frei, Fabian; Hemaspaandra, Edith; Rothe, Jörg. - : Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020
BASE
6
Complexity of Stability ...
BASE
7
Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech
Soderstrom, M; Karadayi, J; Casillas, M. - : Elsevier BV, 2020
BASE
8
Complexity of Stability
In: Leibniz International Proceedings in Informatics, 181 ; 31st International Symposium on Algorithms and Computation (ISAAC 2020) (2020)
BASE
9
NAT: Noise-Aware Training for Robust Neural Sequence Labeling
In: Fraunhofer IAIS (2020)
BASE
10
VoiceHome-2, an extended corpus for multichannel speech processing in real homes
In: ISSN: 0167-6393 ; EISSN: 1872-7182 ; Speech Communication ; https://hal.inria.fr/hal-01923108 ; Speech Communication, Elsevier : North-Holland, 2019, 106, pp.68-78. ⟨10.1016/j.specom.2018.11.002⟩ (2019)
BASE
11
Towards Interpretability and Robustness of Machine Learning Models
Chen, Jianbo. - : eScholarship, University of California, 2019
BASE
12
Assessing the Robustness of Conversational Agents using Paraphrases
BASE
13
Robust speech recognition for German and dialectal broadcast programmes
In: Fraunhofer IAIS (2018)
Abstract: Audio mining systems automatically analyse large amounts of heterogeneous media files such as television and radio programmes so that the analysed audio content can be efficiently searched for spoken words. Typically, audio mining systems such as the Fraunhofer IAIS audio mining system consist of several modules to structure and analyse the data. The most important module is the large vocabulary continuous speech recognition (LVCSR) module, which is responsible for transforming the audio signal into written text. Because of the tremendous developments in the field of speech recognition, and to provide customers with a high-performance audio mining system, the LVCSR module has to be trained and updated regularly using the latest state-of-the-art algorithms provided by the research community and large amounts of training data. Today, speech recognition systems usually perform very well in clean conditions; however, when noise, reverberation or dialectal speakers are present, the performance of these systems degrades considerably. In broadcast media, a large number of different speakers with high variability are typically present, such as anchormen, interviewers and interviewees, speaking colloquial or planned speech, with or without dialect, or even with voice-overs. Especially in regional programmes of public broadcast, a considerable fraction of the speakers speak with an accent or a dialect. In addition, a large number of different background noises appear in the data, such as background speech or background music. Post-processing algorithms like compression, expansion and stereo effect processing, which are generously used in broadcast media, further manipulate the audio data. All these issues make speech recognition in the broadcast domain a challenging task.
This thesis focuses on the development and optimisation of the German broadcast LVCSR system, which is part of the Fraunhofer IAIS audio mining system, over the course of several years. It deals with robustness-related problems that arise for German broadcast media and with the requirements for employing the ASR system in a production audio mining system for industrial use, including stability, decoding time and memory consumption. We approach the following three problems: the continuous development and optimisation of the German broadcast LVCSR system over a long period, rapidly and automatically finding the optimal ASR decoder parameters, and dealing with German dialects in the German broadcast LVCSR system. To guarantee superb performance over long periods of time, we regularly re-train the system using the latest algorithms and system architectures made available by the research community, and evaluate the performance of the algorithms on German broadcast speech. We also drastically increase the training data by annotating a large and novel German broadcast speech corpus, which is unique in Germany. After an automatic speech recognition (ASR) system has been trained, a speech recognition decoder is responsible for decoding the most likely text hypothesis for a given audio signal using the ASR model. Typically the ASR decoder comes with a large number of hyperparameters, which are usually set to default values or manually optimised. These parameters are often far from the optimum in terms of accuracy and decoding speed. State-of-the-art decoder parameter optimisation algorithms take a long time to converge.
Hence, in this thesis we approach automatic decoder parameter optimisation in the context of German broadcast speech recognition for both unconstrained and constrained (in terms of decoding speed) decoding, by introducing an optimisation algorithm that has not previously been used in speech recognition and extending it to ASR decoder parameter optimisation. Germany has a large variety of dialects that are also often present in broadcast media, especially in regional programmes. Dialectal speakers cause severely degraded performance of the speech recognition system due to the mismatch in phonetics and grammar. In this thesis, we approach the large variety of German dialects by introducing a dialect identification system that infers the dialect of the speaker so that adapted dialectal speech recognition models can be used to retrieve the spoken text. To train the dialect identification system, a novel database was collected and annotated. By approaching these three issues, we arrive at an audio mining system that includes a high-performance speech recognition system, which is able to cope with dialectal speakers and whose optimal decoder parameters can be inferred quickly.
Keyword: Deep neural networks; dialectal robustness; gradient-free decoder parameter optimisation; robust speech recognition
URL: http://publica.fraunhofer.de/documents/N-520009.html
BASE
14
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
Xie, Z; Sun, Z; Jin, L. - 2017
BASE
15
Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming
In: ISBN: 978-3-319-68455-0 ; Statistical Language and Speech Processing (SLSP) pp. 143-154 (2017)
BASE
16
A French corpus for distant-microphone speech processing in real homes
In: Interspeech 2016 ; https://hal.inria.fr/hal-01343060 ; Interspeech 2016, Sep 2016, San Francisco, United States (2016)
BASE
17
Reconnaissance automatique de gestes manuels en langue des signes
In: RFIA 2016 ; RFIA'16: Le vingtième congrès national sur la Reconnaissance des Formes et l'Intelligence Artificielle ; https://hal.archives-ouvertes.fr/hal-01332141 ; RFIA'16: Le vingtième congrès national sur la Reconnaissance des Formes et l'Intelligence Artificielle , Jun 2016, Clermont-Ferrand, France (2016)
BASE
18
Investigation of Back-off Based Interpolation Between Recurrent Neural Network and N-gram Language Models (Author's Manuscript)
BASE
19
Lexicographic α-robustness: an application to the 1-median problem
BASE
20
Robust 1-median location problem on a tree
BASE
