1 |
Automatic Speech Recognition and Query By Example for Creole Languages Documentation
|
|
|
|
In: Findings of the Association for Computational Linguistics: ACL 2022 ; https://hal.archives-ouvertes.fr/hal-03625303 ; Findings of the Association for Computational Linguistics: ACL 2022, May 2022, Dublin, Ireland (2022)
|
|
BASE
|
|
|
|
2 |
Cross-Situational Learning Towards Robot Grounding
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03628290 ; 2022 (2022)
|
|
BASE
|
|
|
|
3 |
Cross-Situational Learning Towards Robot Grounding
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03628290 ; 2022 (2022)
|
|
BASE
|
|
|
|
4 |
End-to-end speaker segmentation for overlap-aware resegmentation
|
|
|
|
In: Interspeech 2021 ; https://hal-univ-lemans.archives-ouvertes.fr/hal-03257524 ; Interspeech 2021, Aug 2021, Brno, Czech Republic ; https://www.interspeech2021.org/ (2021)
|
|
BASE
|
|
|
|
5 |
High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features
|
|
|
|
In: EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO) ; https://hal.archives-ouvertes.fr/hal-03537323 ; EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO), Jan 2021, Amsterdam, Netherlands. pp.71-75, ⟨10.23919/Eusipco47968.2020.9287637⟩ (2021)
|
|
BASE
|
|
|
|
6 |
Tackling Morphological Analogies Using Deep Learning -- Extended Version
|
|
|
|
In: https://hal.inria.fr/hal-03425776 ; 2021 (2021)
|
|
BASE
|
|
|
|
7 |
Recognizing lexical units in low-resource language contexts with supervised and unsupervised neural networks
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03429051 ; [Research Report] LACITO (UMR 7107). 2021 (2021)
|
|
BASE
|
|
|
|
8 |
What does the Canary Say? Low-Dimensional GAN Applied to Birdsong
|
|
|
|
In: https://hal.inria.fr/hal-03244723 ; 2021 (2021)
|
|
BASE
|
|
|
|
9 |
What does the Canary Say? Low-Dimensional GAN Applied to Birdsong
|
|
|
|
In: https://hal.inria.fr/hal-03244723 ; 2021 (2021)
|
|
BASE
|
|
|
|
10 |
Artificial Text Detection via Examining the Topology of Attention Maps
|
|
|
|
In: ACL Anthology ; Empirical Methods in Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-03456191 ; Empirical Methods in Natural Language Processing, ACL (Association for Computational Linguistics), Nov 2021, Punta Cana, Dominican Republic (2021)
|
|
BASE
|
|
|
|
11 |
Modeling the neural network responsible for song learning ; Modélisation du réseau neuronal responsable de l'apprentissage du chant chez l'oiseau chanteur
|
|
|
|
In: https://tel.archives-ouvertes.fr/tel-03217834 ; Modeling and Simulation. Université de Bordeaux, 2021. English. ⟨NNT : 2021BORD0107⟩ (2021)
|
|
BASE
|
|
|
|
12 |
Multimodal Coarticulation Modeling : Towards the animation of an intelligible talking head ; Modélisation de la coarticulation multimodale : vers l'animation d'une tête parlante intelligible
|
|
|
|
In: https://hal.univ-lorraine.fr/tel-03203815 ; Intelligence artificielle [cs.AI]. Université de Lorraine, 2021. Français. ⟨NNT : 2021LORR0019⟩ (2021)
|
|
BASE
|
|
|
|
13 |
Impact of Segmentation and Annotation in French end-to-end Synthesis
|
|
|
|
In: Proc. 11th ISCA Speech Synthesis Workshop (SSW 11) ; SSW 11th ISCA Speech Synthesis Workshop ; https://hal.archives-ouvertes.fr/hal-03362000 ; SSW 11th ISCA Speech Synthesis Workshop, Aug 2021, Budapest, Hungary. pp.13-18, ⟨10.21437/SSW.2021-3⟩ ; https://ssw11.hte.hu/ (2021)
|
|
Abstract:
Audio books are commonly used to train text-to-speech (TTS) models, as they offer large phonetic content with rather expressive pronunciation, but the number and size of publicly available audio book corpora differ between languages. Moreover, the quality and accuracy of the available utterance segmentations are debatable, yet the impact of segmentation on the synthesized output is not well established. Additionally, utterances are generally used individually, without taking advantage of text-level structuring information, even though it influences how the speaker reads. In this paper, we conduct a multidimensional evaluation of Tacotron2 trained on different segmentations and text-level annotations of the same French corpus. We show that both spectrum accuracy and expressiveness depend on the segmentation used. In particular, a shorter segmentation, combined with paragraph annotation, benefits spectrum reconstruction to the detriment of phrasing. Multidimensional analysis of mean opinion scores obtained with a MUSHRA experiment revealed that phrasing was relatively more important than spectrum accuracy in perceptual judgement. This work serves as evidence that particular attention must be given to model evaluation, as well as to how the training corpus is used to maximize the synthesis characteristics of interest.
|
|
Keyword:
[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [INFO]Computer Science [cs]; French dataset; French TTS; mixed-inputs TTS; Speech Synthesis
|
|
URL: https://doi.org/10.21437/SSW.2021-3 https://hal.archives-ouvertes.fr/hal-03362000/file/lenglet21_ssw.pdf https://hal.archives-ouvertes.fr/hal-03362000 https://hal.archives-ouvertes.fr/hal-03362000/document
|
|
BASE
|
|
|
|
14 |
Which Hype for my New Task? Hints and Random Search for Reservoir Computing Hyperparameters
|
|
|
|
In: ICANN 2021 - 30th International Conference on Artificial Neural Networks ; https://hal.inria.fr/hal-03203318 ; ICANN 2021 - 30th International Conference on Artificial Neural Networks, Sep 2021, Bratislava, Slovakia (2021)
|
|
BASE
|
|
|
|
15 |
Canary Song Decoder: Transduction and Implicit Segmentation with ESNs and LTSMs
|
|
|
|
In: https://hal.inria.fr/hal-03203374 ; 2021 (2021)
|
|
BASE
|
|
|
|
16 |
Which Hype for my New Task? Hints and Random Search for Reservoir Computing Hyperparameters
|
|
|
|
In: https://hal.inria.fr/hal-03203318 ; 2021 (2021)
|
|
BASE
|
|
|
|
17 |
Canary Song Decoder: Transduction and Implicit Segmentation with ESNs and LTSMs
|
|
|
|
In: ICANN 2021 - 30th International Conference on Artificial Neural Networks ; https://hal.inria.fr/hal-03203374 ; ICANN 2021 - 30th International Conference on Artificial Neural Networks, Sep 2021, Bratislava, Slovakia. pp.71--82, ⟨10.1007/978-3-030-86383-8_6⟩ ; https://link.springer.com/chapter/10.1007/978-3-030-86383-8_6 (2021)
|
|
BASE
|
|
|
|
18 |
On the use of Self-supervised Pre-trained Acoustic and Linguistic Features for Continuous Speech Emotion Recognition
|
|
|
|
In: IEEE Spoken Language Technology Workshop ; https://hal.archives-ouvertes.fr/hal-03003469 ; IEEE Spoken Language Technology Workshop, Jan 2021, Virtual, China (2021)
|
|
BASE
|
|
|
|
19 |
Hierarchical-Task Reservoir for Online Semantic Analysis from Continuous Speech
|
|
|
|
In: ISSN: 2162-237X ; IEEE Transactions on Neural Networks and Learning Systems ; https://hal.inria.fr/hal-03031413 ; IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2021, ⟨10.1109/TNNLS.2021.3095140⟩ ; https://ieeexplore.ieee.org/abstract/document/9548713/metrics#metrics (2021)
|
|
BASE
|
|
|
|
20 |
Recognizing lexical units in low-resource language contexts with supervised and unsupervised neural networks
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03429051 ; [Research Report] LACITO (UMR 7107). 2021 (2021)
|
|
BASE
|
|
|
|