Hits 1 – 20 of 79

1	Evaluation of Tacotron Based Synthesizers for Spanish and Basque
	Víctor García; Inma Hernáez; Eva Navas
	In: Applied Sciences; Volume 12; Issue 3; Pages: 1686 (2022)
	BASE

2	CCG Supertagging as Top-down Tree Generation
	Prange, Jakob; Schneider, Nathan; Srikumar, Vivek
	In: Proceedings of the Society for Computation in Linguistics (2021)
	BASE

3	Vowel Harmony Viewed as Error-Correcting Code
	Meeres, Yvo; Pirinen, Tommi A
	In: Proceedings of the Society for Computation in Linguistics (2021)
	BASE

4	Generating Adversarial Examples for Topic-dependent Argument Classification
	Mayer, Tobias; Marro, Santiago; Cabrio, Elena...
	In: COMMA 2020 - 8th International Conference on Computational Models of Argument, Sep 2020, Perugia, Italy ; https://hal.archives-ouvertes.fr/hal-02933266 (2020)
	BASE

5	Complexity of Stability ...
	Frei, Fabian; Hemaspaandra, Edith; Rothe, Jörg. - : Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020
	BASE

6	Complexity of Stability ...
	Frei, Fabian; Hemaspaandra, Edith; Rothe, Jörg. - : ETH Zurich, 2020
	BASE

7	Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech
	Soderstrom, M; Karadayi, J; Casillas, M. - : Elsevier BV, 2020
	BASE

8	Complexity of Stability
	Frei, Fabian; Hemaspaandra, Edith; Rothe, Jörg
	In: Leibniz International Proceedings in Informatics, 181 ; 31st International Symposium on Algorithms and Computation (ISAAC 2020) (2020)
	BASE

9	NAT: Noise-Aware Training for Robust Neural Sequence Labeling
	Behnke, Sven; Namysl, Marcin; Köhler, Joachim
	In: Fraunhofer IAIS (2020)
	BASE

10	VoiceHome-2, an extended corpus for multichannel speech processing in real homes
	Bertin, Nancy; Camberlein, Ewen; Lebarbenchon, Romain...
	In: ISSN: 0167-6393 ; EISSN: 1872-7182 ; Speech Communication ; https://hal.inria.fr/hal-01923108 ; Speech Communication, Elsevier : North-Holland, 2019, 106, pp.68-78. ⟨10.1016/j.specom.2018.11.002⟩ (2019)
	BASE

11	Towards Interpretability and Robustness of Machine Learning Models
	Chen, Jianbo. - : eScholarship, University of California, 2019
	BASE

12	Assessing the Robustness of Conversational Agents using Paraphrases
	Guichard, Jonathan; Ruane, Elayne; Smith, Ross. - : IEEE, 2019
	BASE

13	Robust speech recognition for German and dialectal broadcast programmes
	Stadtschnitzer, Michael. - 2018
	In: Fraunhofer IAIS (2018)
	Abstract: Audio mining systems automatically analyse large amounts of heterogeneous media files such as television and radio programmes so that the analysed audio content can be efficiently searched for spoken words. Typically, audio mining systems such as the Fraunhofer IAIS audio mining system consist of several modules to structure and analyse the data. The most important module is the large vocabulary continuous speech recognition (LVCSR) module, which is responsible for transforming the audio signal into written text. Because of the tremendous developments in the field of speech recognition, and to provide customers with a high-performance audio mining system, the LVCSR module has to be trained and updated regularly using the latest state-of-the-art algorithms provided by the research community and large amounts of training data. Today, speech recognition systems usually perform very well in clean conditions; however, when noise, reverberation or dialectal speakers are present, the performance of these systems degrades considerably. Broadcast media typically feature a large number of different speakers with high variability, such as anchormen, interviewers and interviewees, speaking colloquial or planned speech, with or without dialect, or even with voice-overs. Especially in regional programmes of public broadcast, a considerable fraction of the speakers speak with an accent or a dialect. The data also contain a wide range of background noises, such as background speech or background music. Post-processing algorithms like compression, expansion, and stereo effect processing, which are used generously in broadcast media, further manipulate the audio data. All these issues make speech recognition in the broadcast domain a challenging task.
This thesis focuses on the development and optimisation of the German broadcast LVCSR system, which is part of the Fraunhofer IAIS audio mining system, over the course of several years. It deals with robustness-related problems that arise for German broadcast media and with the requirements for employing the ASR system in a productive audio mining system for industrial use, including stability, decoding time and memory consumption. We approach the following three problems: the continuous development and optimisation of the German broadcast LVCSR system over a long period, rapidly and automatically finding the optimal ASR decoder parameters, and dealing with German dialects in the German broadcast LVCSR system. To guarantee superb performance over long periods of time, we regularly re-train the system using the latest algorithms and system architectures made available by the research community, and evaluate the performance of the algorithms on German broadcast speech. We also drastically increase the training data by annotating a large and novel German broadcast speech corpus, which is unique in Germany. After an automatic speech recognition (ASR) system has been trained, a speech recognition decoder is responsible for decoding the most likely text hypothesis for a given audio signal under the ASR model. Typically, the ASR decoder comes with a large number of hyperparameters, which are usually set to default values or manually optimised. These parameters are often far from the optimum in terms of accuracy and decoding speed. State-of-the-art decoder parameter optimisation algorithms take a long time to converge.
Hence, in this thesis we approach automatic decoder parameter optimisation in the context of German broadcast speech recognition for both unconstrained and constrained (in terms of decoding speed) decoding, by introducing and extending an optimisation algorithm that has not previously been used for speech recognition and applying it to ASR decoder parameter optimisation. Germany has a large variety of dialects that are also often present in broadcast media, especially in regional programmes. Dialectal speakers severely degrade the performance of the speech recognition system due to the mismatch in phonetics and grammar. In this thesis, we approach the large variety of German dialects by introducing a dialect identification system that infers the dialect of the speaker, so that adapted dialectal speech recognition models can be used to retrieve the spoken text. To train the dialect identification system, a novel database was collected and annotated. By approaching these three issues, we arrive at an audio mining system that includes a high-performance speech recognition system, which is able to cope with dialectal speakers and whose optimal decoder parameters can be inferred quickly.
	Keyword: Deep neural networks; dialectal robustness; gradient-free decoder parameter optimisation; robust speech recognition
	URL: http://publica.fraunhofer.de/documents/N-520009.html
	BASE

14	Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
	Xie, Z; Sun, Z; Jin, L. - 2017
	BASE

15	Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming
	Rayner, Emmanuel; Tsourakis, Nikolaos; Gerlach, Johanna
	In: ISBN: 978-3-319-68455-0 ; Statistical Language and Speech Processing (SLSP) pp. 143-154 (2017)
	BASE

16	A French corpus for distant-microphone speech processing in real homes
	Bertin, Nancy; Camberlein, Ewen; Vincent, Emmanuel...
	In: Interspeech 2016, Sep 2016, San Francisco, United States ; https://hal.inria.fr/hal-01343060 (2016)
	BASE

17	Reconnaissance automatique de gestes manuels en langue des signes
	Nasreddine, Kamal; Benzinou, Abdesslam
	In: RFIA'16: Le vingtième congrès national sur la Reconnaissance des Formes et l'Intelligence Artificielle, Jun 2016, Clermont-Ferrand, France ; https://hal.archives-ouvertes.fr/hal-01332141 (2016)
	BASE

18	Investigation of Back-off Based Interpolation Between Recurrent Neural Network and N-gram Language Models (Author's Manuscript)
	Chen, X; Liu, X; Gales, M J F. - 2016
	BASE

19	Lexicographic α-robustness: an application to the 1-median problem
	Kalaï, Rim; Aloulou, Mohamed Ali; Vallin, Philippe. - 2016
	BASE

20	Robust 1-median location problem on a tree
	Aloulou, Mohamed Ali; Kalaï, Rim; Vallin, Philippe. - 2016
	BASE