Hits 1 – 20 of 431

1	Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding
	Sankar, Sanjana; Beautemps, Denis; Hueber, Thomas
	In: ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.archives-ouvertes.fr/hal-03578503 ; ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, May 2022, Singapore, Singapore (2022)
	BASE

2	An Overview of Indian Spoken Language Recognition from Machine Learning Perspective
	Dey, Spandan; Sahidullah, Md; Saha, Goutam
	In: ISSN: 2375-4699 ; EISSN: 2375-4702 ; ACM Transactions on Asian and Low-Resource Language Information Processing ; https://hal.inria.fr/hal-03616853 ; ACM Transactions on Asian and Low-Resource Language Information Processing, ACM, In press, ⟨10.1145/3523179⟩ (2022)
	BASE

3	Differentially private speaker anonymization
	Shamsabadi, Ali Shahin; Srivastava, Brij Mohan Lal; Bellet, Aurélien...
	In: https://hal.inria.fr/hal-03588932 ; 2022 (2022)
	BASE

4	A Study of F0 Modification for X-Vector Based Speech Pseudonymization Across Gender
	Champion, Pierre; Jouvet, Denis; Larcher, Anthony
	In: PPAI 2021 - The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence ; https://hal.archives-ouvertes.fr/hal-02995862 ; PPAI 2021 - The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence, Feb 2021, Virtual, China (2021)
	BASE

5	Assessment of adult speech disorders: current situation and needs in French-speaking clinical practice
	Pommée, Timothy; Balaguer, Mathieu; Mauclair, Julie...
	In: ISSN: 1401-5439 ; Logopedics Phoniatrics Vocology ; https://hal.archives-ouvertes.fr/hal-03120115 ; Logopedics Phoniatrics Vocology, Taylor & Francis, 2021, pp.1-15. ⟨10.1080/14015439.2020.1870245⟩ (2021)
	BASE

6	Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework
	Sen, Nirmalya; Sahidullah, Md; Patil, Hemant...
	In: ISSN: 1381-2416 ; EISSN: 1572-8110 ; International Journal of Speech Technology ; https://hal.archives-ouvertes.fr/hal-03232723 ; International Journal of Speech Technology, Springer Verlag, In press, ⟨10.1007/s10772-021-09862-8⟩ (2021)
	BASE

7	Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input
	Stephenson, Brooke; Hueber, Thomas; Girin, Laurent...
	In: Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-03372802 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic. pp.3865-3869, ⟨10.21437/Interspeech.2021-275⟩ (2021)
	BASE

8	Speaker Attentive Speech Emotion Recognition
	Le Moine, Clément; Obin, Nicolas; Roebel, Axel
	In: Proceedings of Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03554368 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.2866-2870, ⟨10.21437/interspeech.2021-573⟩ (2021)
	BASE

9	EVOLEX : la reconnaissance vocale au service du diagnostic des dysfonctionnements langagiers
	Petiot, Jim; Gravellier, Lila; Jucla, Mélanie...
	In: Séminaire AFCP 2021 – Phonétique Clinique ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03269242 ; Séminaire AFCP 2021 – Phonétique Clinique, May 2021, Toulouse (virtual), France ; http://www.afcp-parole.org/seminaire-afcp-phonetique-clinique-27-mai-2021/ (2021)
	BASE

10	Recognizing lexical units in low-resource language contexts with supervised and unsupervised neural networks
	Macaire, Cécile
	In: https://hal.archives-ouvertes.fr/hal-03429051 ; [Research Report] LACITO (UMR 7107). 2021 (2021)
	Abstract: Automatic Speech Recognition (ASR) has made significant progress thanks to the advent of deep neural networks (DNNs). In the context of under-resourced languages, for which few resources are available, spectacular achievements have been reported. ASR systems are a step forward for language documentation, as the annotation cost is considerably reduced for field linguists (manually annotating an audio file can take a tremendous amount of time), and the language is preserved and perpetuated through documentation. Previous 'standard' deep neural networks reached very good performance for phonemic transcription (such as with Kaldi and ESPnet approaches). However, these methods rely only on the phoneme level. In this thesis, we explore recently published ASR approaches which have been shown to be effective on low-resource languages to produce word-level audio-aligned transcriptions. The first approach, based on self-supervised learning, is a speech model that uses Connectionist Temporal Classification (CTC). The second, entitled wav2vec-U, proposes a framework intended to build an ASR system in a fully unsupervised fashion. With few resources at our disposal, we try to assess the use that can be made of dictionaries. We conducted experiments on two low-resource corpora, the Yongning Na and the Japhug from the Pangloss Collection. The experimental results from the first approach demonstrate powerful word-level transcriptions with competitive error rates. Preliminary results are reported on the second approach. By a coverage measure of dictionaries on the available transcriptions, we show that these resources are not yet usable in the conducted approaches.
	Keyword: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; Automatic Speech Recognition ASR; deep learning; Machine learning; Neural networks
	URL: https://hal.archives-ouvertes.fr/hal-03429051/file/Macaire2021_RecognizingLexicalUnits.pdf https://hal.archives-ouvertes.fr/hal-03429051/document https://hal.archives-ouvertes.fr/hal-03429051
	BASE

11	Automatic extraction of speech rhythm descriptors for speech intelligibility assessment in the context of Head and Neck Cancers
	Vaysse, Robin; Farinas, Jérôme; Astésano, Corine...
	In: to appear ; INTERSPEECH 2021 ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03269227 ; INTERSPEECH 2021, ISCA : International Speech and Communication Association, Aug 2021, Brno, Czech Republic ; https://www.interspeech2021.org (2021)
	BASE

12	Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
	Trang, Nguyen Thi Thu; Ky, Nguyen; Rilliard, Albert...
	In: Proc. Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329116 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.3885-3889, ⟨10.21437/interspeech.2021-125⟩ (2021)
	BASE

13	“Motherese” Prosody in Fetal-Directed Speech: An Exploratory Study Using Automatic Social Signal Processing
	Parlato-Oliveira, Erika; Saint-Georges, Catherine; Cohen, David...
	In: ISSN: 1664-1078 ; Frontiers in Psychology ; https://hal.sorbonne-universite.fr/hal-03184038 ; Frontiers in Psychology, Frontiers, 2021, 12, ⟨10.3389/fpsyg.2021.646170⟩ (2021)
	BASE

14	Learning spectro-temporal representations of complex sounds with parameterized neural networks
	Riad, Rachid; Karadayi, Julien; Bachoud-Lévi, Anne-Catherine...
	In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.inria.fr/hal-03329261 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (1), pp.353-366. ⟨10.1121/10.0005482⟩ (2021)
	BASE

15	Modeling the effect of military oxygen masks on speech characteristics
	Elie, Benjamin; Gauvain, Jodie; Gauvain, Jean-Luc...
	In: Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03325087 ; Interspeech 2021, Aug 2021, Brno, Czech Republic (2021)
	BASE

16	MRI Vocal Tract Sagittal Slices Estimation during Speech Production of CV
	Douros, Ioannis; Kulkarni, Ajinkya; Xie, Yu...
	In: EUSIPCO 2020 - 28th European Signal Processing Conference ; https://hal.inria.fr/hal-03090824 ; EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands ; https://eusipco2020.org/ (2021)
	BASE

17	Construction of an automatic score for the evaluation of speech disorders among patients treated for a cancer of the oral cavity or the oropharynx: The Carcinologic Speech Severity Index
	Woisard, Virginie; Balaguer, Mathieu; Fredouille, Corinne...
	In: ISSN: 1043-3074 ; EISSN: 1097-0347 ; Head and Neck ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03413678 ; Head and Neck, Wiley, In press, ⟨10.1002/hed.26903⟩ (2021)
	BASE

18	Automated Assessment of Glottal Dysfunction Through Unified Acoustic Voice Analysis
	McLoughlin, Ian Vince; Perrotin, Olivier; Sharifzadeh, Hamid...
	In: ISSN: 0892-1997 ; Journal of Voice ; https://hal.archives-ouvertes.fr/hal-02987882 ; Journal of Voice, Elsevier, In press, ⟨10.1016/j.jvoice.2020.08.032⟩ (2021)
	BASE

19	Leveraging lyrics from audio for MIR ; Exploiter les paroles de chansons à partir de l'audio pour le MIR
	Vaglio, Andrea. - : HAL CCSD, 2021
	In: https://tel.archives-ouvertes.fr/tel-03558515 ; Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT027⟩ (2021)
	BASE

20	On the effect of normalization layers on Differentially Private training of deep Neural networks
	Davody, Ali; Adelani, David Ifeoluwa; Kleinbauer, Thomas...
	In: https://hal.inria.fr/hal-03475600 ; 2021 (2021)
	BASE
