DE eng

Search in the Catalogues and Directories

Page: 1 2 3 4 5...22
Hits 1 – 20 of 431

1
Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding
In: ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.archives-ouvertes.fr/hal-03578503 ; ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, May 2022, Singapour, Singapore (2022)
BASE
Show details
2
An Overview of Indian Spoken Language Recognition from Machine Learning Perspective
In: ISSN: 2375-4699 ; EISSN: 2375-4702 ; ACM Transactions on Asian and Low-Resource Language Information Processing ; https://hal.inria.fr/hal-03616853 ; ACM Transactions on Asian and Low-Resource Language Information Processing, ACM, In press, ⟨10.1145/3523179⟩ (2022)
BASE
Show details
3
Differentially private speaker anonymization
In: https://hal.inria.fr/hal-03588932 ; 2022 (2022)
BASE
Show details
4
A Study of F0 Modification for X-Vector Based Speech Pseudonymization Across Gender
In: PPAI 2021 - The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence ; https://hal.archives-ouvertes.fr/hal-02995862 ; PPAI 2021 - The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence, Feb 2021, Virtual, China (2021)
BASE
Show details
5
Assessment of adult speech disorders: current situation and needs in French-speaking clinical practice
In: ISSN: 1401-5439 ; Logopedics Phoniatrics Vocology ; https://hal.archives-ouvertes.fr/hal-03120115 ; Logopedics Phoniatrics Vocology, Taylor & Francis, 2021, pp.1-15. ⟨10.1080/14015439.2020.1870245⟩ (2021)
BASE
Show details
6
Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework
In: ISSN: 1381-2416 ; EISSN: 1572-8110 ; International Journal of Speech Technology ; https://hal.archives-ouvertes.fr/hal-03232723 ; International Journal of Speech Technology, Springer Verlag, In press, ⟨10.1007/s10772-021-09862-8⟩ (2021)
BASE
Show details
7
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input
In: Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-03372802 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic. pp.3865-3869, ⟨10.21437/Interspeech.2021-275⟩ (2021)
BASE
Show details
8
Speaker Attentive Speech Emotion Recognition
In: Proccedings of interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03554368 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.2866-2870, ⟨10.21437/interspeech.2021-573⟩ (2021)
BASE
Show details
9
EVOLEX : la reconnaissance vocale au service du diagnostic des dysfonctionnements langagiers
In: Séminaire AFCP 2021 – Phonétique Clinique ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03269242 ; Séminaire AFCP 2021 – Phonétique Clinique, May 2021, Toulouse (virtuel), France ; http://www.afcp-parole.org/seminaire-afcp-phonetique-clinique-27-mai-2021/ (2021)
BASE
Show details
10
Recognizing lexical units in low-resource language contexts with supervised and unsupervised neural networks
In: https://hal.archives-ouvertes.fr/hal-03429051 ; [Research Report] LACITO (UMR 7107). 2021 (2021)
Abstract: Automatic Speech Recognition (ASR) has made significant progress thanks to the advent of deep neural networks (DNNs). In the context of under-resourced languages, for which few resources are available, spectacular achievements has been reported. ASR systems are a step forward for language documentation, as the annotation cost is considerably reduced for field linguists (manually annotated an audio file can take a tremendous amount of time), and the language is preserved and perpetuated through documentation. Previous `standard' deep neural networks reached very good performances for phonemic transcription (such as with Kaldi and ESPnet approaches).However, these methods only rely on the phoneme-level. In this thesis, we explore recently published ASR approaches which have shown to be effective on low-resource languages to produce word-level audio-aligned transcriptions. The first approach, based on self-supervised learning, is a speech model that uses a Connectionist Temporal Classification (CTC). The second, entitled wav2vec-U, proposes a framework intended to build an ASR system in a fully unsupervised fashion. With few resources at our disposal, we try to assess the usability that can be made from dictionaries. We conducted experiments on two low-resource corpora, the Yongning Na and the Japhug from the Pangloss Collection. The experimental results from the first approach demonstrate powerful word-level transcriptions with competitive error rates. Preliminary results are reported on the second approach. By a coverage measure of dictionaries on the available transcriptions, we show that these resources are not yet usable in the conducted approaches.
Keyword: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; Automatic Speech Recognition ASR; deep learning; Machine learning; Neural networks
URL: https://hal.archives-ouvertes.fr/hal-03429051/file/Macaire2021_RecognizingLexicalUnits.pdf
https://hal.archives-ouvertes.fr/hal-03429051/document
https://hal.archives-ouvertes.fr/hal-03429051
BASE
Hide details
11
Automatic extraction of speech rhythm descriptors for speech intelligibility assessment in the context of Head and Neck Cancers
In: à paraître ; INTERSPEECH 2021 ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03269227 ; INTERSPEECH 2021, ISCA : International Speech and Communication Association, Aug 2021, Brno, Czech Republic ; https://www.interspeech2021.org (2021)
BASE
Show details
12
Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
In: Proc. Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329116 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.3885-3889, ⟨10.21437/interspeech.2021-125⟩ (2021)
BASE
Show details
13
“Motherese” Prosody in Fetal-Directed Speech: An Exploratory Study Using Automatic Social Signal Processing
In: ISSN: 1664-1078 ; Frontiers in Psychology ; https://hal.sorbonne-universite.fr/hal-03184038 ; Frontiers in Psychology, Frontiers, 2021, 12, ⟨10.3389/fpsyg.2021.646170⟩ (2021)
BASE
Show details
14
Learning spectro-temporal representations of complex sounds with parameterized neural networks
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.inria.fr/hal-03329261 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (1), pp.353-366. ⟨10.1121/10.0005482⟩ (2021)
BASE
Show details
15
Modeling the effect of military oxygen masks on speech characteristics
In: Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03325087 ; Interspeech 2021, Aug 2021, Brno, Czech Republic (2021)
BASE
Show details
16
MRI Vocal Tract Sagittal Slices Estimation during Speech Production of CV
In: EUSIPCO 2020 - 28th European Signal Processing Conference ; https://hal.inria.fr/hal-03090824 ; EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands ; https://eusipco2020.org/ (2021)
BASE
Show details
17
Construction of an automatic score for the evaluation of speech disorders among patients treated for a cancer of the oral cavity or the oropharynx: The Carcinologic Speech Severity Index
In: ISSN: 1043-3074 ; EISSN: 1097-0347 ; Head and Neck ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03413678 ; Head and Neck, Wiley, In press, ⟨10.1002/hed.26903⟩ (2021)
BASE
Show details
18
Automated Assessment of Glottal Dysfunction Through Unified Acoustic Voice Analysis
In: ISSN: 0892-1997 ; Journal of Voice ; https://hal.archives-ouvertes.fr/hal-02987882 ; Journal of Voice, Elsevier, In press, ⟨10.1016/j.jvoice.2020.08.032⟩ (2021)
BASE
Show details
19
Leveraging lyrics from audio for MIR ; Exploiter les paroles de chansons à partir de l'audio pour le MIR
Vaglio, Andrea. - : HAL CCSD, 2021
In: https://tel.archives-ouvertes.fr/tel-03558515 ; Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT027⟩ (2021)
BASE
Show details
20
On the effect of normalization layers on Differentially Private training of deep Neural networks
In: https://hal.inria.fr/hal-03475600 ; 2021 (2021)
BASE
Show details

Page: 1 2 3 4 5...22

Catalogues
0
0
0
0
0
0
0
Bibliographies
0
0
0
0
0
0
0
0
0
Linked Open Data catalogues
0
Online resources
0
0
0
0
Open access documents
431
0
0
0
0
© 2013 - 2024 Lin|gu|is|tik | Imprint | Privacy Policy | Datenschutzeinstellungen ändern