
Search in the Catalogues and Directories

Hits 1 – 20 of 16,528

1
A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
Abstract: In this publication, we present a deep learning-based method to transform the f0 in speech and singing voice recordings. f0 transformation is performed by training an auto-encoder on the voice signal’s mel-spectrogram and conditioning the auto-encoder on the f0. Inspired by AutoVC/F0, we apply an information bottleneck to it to disentangle the f0 from its latent code. The resulting model successfully applies the desired f0 to the input mel-spectrograms and adapts the speaker identity when necessary, e.g., if the requested f0 falls out of the range of the source speaker/singer. Using the mean f0 error in the transformed mel-spectrograms, we define a disentanglement measure and perform a study over the required bottleneck size. The study reveals that to remove the f0 from the auto-encoder’s latent code, the bottleneck size should be smaller than four for singing and smaller than nine for speech. Through a perceptive test, we compare the audio quality of the proposed auto-encoder to f0 transformations obtained with a classical vocoder. The perceptive test confirms that the audio quality is better for the auto-encoder than for the classical vocoder. Finally, a visual analysis of the latent code for the two-dimensional case is carried out. We observe that the auto-encoder encodes phonemes as repeated discontinuous temporal gestures within the latent code.
Keyword: [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
URL: https://doi.org/10.3390/info13030102
https://hal.archives-ouvertes.fr/hal-03599085
BASE
2
Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
BASE
3
From Biological Synapses to “Intelligent” Robots
In: ISSN: 2079-9292 ; Electronics ; https://hal.archives-ouvertes.fr/hal-03590998 ; Electronics, MDPI, 2022, 11 (5), pp.707. ⟨10.3390/electronics11050707⟩ (2022)
BASE
4
Meta-Analysis of the Functional Neuroimaging Literature with Probabilistic Logic Programming
In: https://hal.archives-ouvertes.fr/hal-03590714 ; 2022 (2022)
BASE
5
Linguistic resources for paraphrase generation in Portuguese: a Lexicon-Grammar approach
In: ISSN: 1574-020X ; EISSN: 1574-0218 ; Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-03548861 ; Language Resources and Evaluation, Springer Verlag, 2022, ⟨10.1007/s10579-021-09561-5⟩ ; https://link.springer.com/article/10.1007/s10579-021-09561-5 (2022)
BASE
6
Caveats of Measuring Semantic Change of Cognates and Borrowings using Multilingual Word Embeddings
In: LChange'22 - 3rd International Workshop on Computational Approaches to Historical Language Change 2022 ; https://hal.inria.fr/hal-03635005 ; LChange'22 - 3rd International Workshop on Computational Approaches to Historical Language Change 2022, May 2022, Dublin, Ireland (2022)
BASE
7
DeepL et Google Translate face à l'ambiguïté phraséologique
In: https://hal.archives-ouvertes.fr/hal-03583995 ; 2022 (2022)
BASE
8
Preprint Citation Praxis in PLOS
In: ISSN: 0138-9130 ; EISSN: 1588-2861 ; Scientometrics ; https://hal.archives-ouvertes.fr/hal-03506094 ; In press (2022)
BASE
9
Emotion on a textual level: the structuring function of emotions observed from annotations ; L'émotion à un niveau textuel : la fonction structurante des émotions observée à partir d'annotations
In: ISSN: 1963-1723 ; Discours - Revue de linguistique, psycholinguistique et informatique ; https://hal.archives-ouvertes.fr/hal-03607564 ; Discours - Revue de linguistique, psycholinguistique et informatique, Laboratoire LATTICE, In press (2022)
BASE
10
An Overview of Indian Spoken Language Recognition from Machine Learning Perspective
In: ISSN: 2375-4699 ; EISSN: 2375-4702 ; ACM Transactions on Asian and Low-Resource Language Information Processing ; https://hal.inria.fr/hal-03616853 ; ACM Transactions on Asian and Low-Resource Language Information Processing, ACM, In press, ⟨10.1145/3523179⟩ (2022)
BASE
11
The contextual logic
In: https://hal.archives-ouvertes.fr/hal-03195162 ; 2022 (2022)
BASE
12
Morphology in the Corsican Language Database (BDLC) : assessment and perspectives ; La morphologie dans la Banque de Données Langue Corse : bilan et perspectives
In: ISSN: 1638-9808 ; EISSN: 1765-3126 ; Corpus ; https://hal.archives-ouvertes.fr/hal-03591866 ; Corpus, Bases, Corpus, Langage - UMR 7320, 2022, Corpus et données en morphologie, ⟨10.4000/corpus.7115⟩ ; https://journals.openedition.org/corpus/7115 (2022)
BASE
13
Automatic generation of the complete vocal tract shape from the sequence of phonemes to be articulated
In: ISSN: 0167-6393 ; EISSN: 1872-7182 ; Speech Communication ; https://hal.univ-lorraine.fr/hal-03650212 ; Speech Communication, Elsevier : North-Holland, 2022, ⟨10.1016/j.specom.2022.04.004⟩ (2022)
BASE
14
Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0
In: Proceedings of the International Workshop on Challenges & Perspectives in Creating Large Language Models 2022 (BigScience 2022) ; https://hal.inria.fr/hal-03639144 ; Proceedings of the International Workshop on Challenges & Perspectives in Creating Large Language Models 2022 (BigScience 2022), May 2022, Dublin, Ireland (2022)
BASE
15
Probing Multilingual Cognate Prediction Models
In: Findings of the Association for Computational Linguistics: ACL 2022 ; https://hal.inria.fr/hal-03614691 ; Findings of the Association for Computational Linguistics: ACL 2022, May 2022, Dublin, Ireland (2022)
BASE
16
Automatic Speech Recognition and Query By Example for Creole Languages Documentation
In: Findings of the Association for Computational Linguistics: ACL 2022 ; https://hal.archives-ouvertes.fr/hal-03625303 ; Findings of the Association for Computational Linguistics: ACL 2022, May 2022, Dublin, Ireland (2022)
BASE
17
Annotation of Morphological Errors in L2 Russian Corpus Analysis
In: 21st Annual Second Language Acquisition and Teaching Interdisciplinary Roundtable ; https://hal.archives-ouvertes.fr/hal-03620469 ; 21st Annual Second Language Acquisition and Teaching Interdisciplinary Roundtable, University of Arizona, Feb 2022, Tucson, United States (2022)
BASE
18
Usages du Dictionnaire Électronique des Synonymes (DES) du CRISCO : focus sur les mots inexistants
In: ISSN: 2607-0987 ; Le carnet de la MRSH ; https://halshs.archives-ouvertes.fr/halshs-03606075 ; 2022 (2022)
BASE
19
Identifier l’ironie ?
Grezka, Aude; Niziołek, Małgorzata. HAL CCSD, 2022; GERFLINT, 2022
In: ISSN: 1774-7988 ; EISSN: 2261-3455 ; Synergies Pologne ; https://halshs.archives-ouvertes.fr/halshs-03552205 ; Synergies Pologne, 2022 (2022)
BASE
20
Cross-Situational Learning Towards Robot Grounding
In: https://hal.archives-ouvertes.fr/hal-03628290 ; 2022 (2022)
BASE


© 2013 - 2024 Lin|gu|is|tik