1 |
The contextual logic
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03195162 (2022)
|
|
BASE
|
|
2 |
Learning and controlling the source-filter representation of speech with a variational autoencoder
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03650569 (2022)
|
|
Abstract:
17 pages, 4 figures, companion website: https://samsad35.github.io/site-sfvae/ ; Understanding and controlling latent representations in deep generative models is a challenging yet important problem for analyzing, transforming and generating various types of data. In speech processing, inspired by the anatomical mechanisms of phonation, the source-filter model considers that speech signals are produced from a few independent and physically meaningful continuous latent factors, among which the fundamental frequency f0 and the formants are of primary importance. In this work, we show that the source-filter model of speech production naturally arises in the latent space of a variational autoencoder (VAE) trained in an unsupervised manner on a dataset of natural speech signals. Using only a few seconds of labeled speech signals generated with an artificial speech synthesizer, we experimentally illustrate that f0 and the formant frequencies are encoded in orthogonal subspaces of the VAE latent space, and we develop a weakly-supervised method to accurately and independently control these speech factors of variation within the learned latent subspaces. Without requiring additional information such as text or human-labeled data, this results in a deep generative model of speech spectrograms that is conditioned on f0 and the formant frequencies, and which is applied to the transformation of speech signals.
|
|
Keyword:
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; Deep generative models; Representation learning; Source-filter model; Variational autoencoder
|
|
URL: https://hal.archives-ouvertes.fr/hal-03650569 https://hal.archives-ouvertes.fr/hal-03650569/document https://hal.archives-ouvertes.fr/hal-03650569/file/sadok2022learning.pdf
|
|
3 |
Assessing the impact of OCR noise on multilingual event detection over digitised documents
|
|
|
|
In: International Journal on Digital Libraries, Springer Verlag, 2022, ⟨10.1007/s00799-022-00325-2⟩ ; ISSN: 1432-5012 ; EISSN: 1432-1300 ; https://hal.archives-ouvertes.fr/hal-03635985 (2022)
|
|
4 |
Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents
|
|
|
|
In: Matthias Hagen; Suzan Verberne; Craig Macdonald; Christin Seifert; Krisztian Balog; Kjetil Nørvåg; Vinay Setty. Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II, 13186, Springer International Publishing, pp.347-354, 2022, Lecture Notes in Computer Science, 978-3-030-99738-0. ⟨10.1007/978-3-030-99739-7_44⟩ ; https://hal.archives-ouvertes.fr/hal-03635971 (2022)
|
|
5 |
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
|
|
|
|
In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021), Jan 2022, Punta Cana, Dominican Republic ; https://hal.inria.fr/hal-03527328 ; https://aclanthology.org/2021.wnut-1.47/ (2022)
|
|
6 |
Annotation of Morphological Errors in L2 Russian Corpus Analysis
|
|
|
|
In: 21st Annual Second Language Acquisition and Teaching Interdisciplinary Roundtable, University of Arizona, Feb 2022, Tucson, United States ; https://hal.archives-ouvertes.fr/hal-03620469 (2022)
|
|
7 |
Cross-Situational Learning Towards Robot Grounding
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03628290 (2022)
|
|
9 |
A Methodology for the Comparison of Human Judgments With Metrics for Coreference Resolution
|
|
|
|
In: HumEval at ACL, May 2022, Dublin, Ireland ; https://hal.archives-ouvertes.fr/hal-03650294 ; https://humeval.github.io/ (2022)
|
|
10 |
Can machines learn to see without visual databases?
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03526569 (2022)
|
|
11 |
Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition
|
|
|
|
In: EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287541⟩ ; https://hal.inria.fr/hal-02355669 ; https://eusipco2020.org/ (2021)
|
|
12 |
A Neural Approach for Detecting Morphological Analogies
|
|
|
|
In: The 8th IEEE International Conference on Data Science and Advanced Analytics (DSAA), Oct 2021, Porto/Online, Portugal ; https://hal.inria.fr/hal-03313556 (2021)
|
|
13 |
Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework
|
|
|
|
In: International Journal of Speech Technology, Springer Verlag, In press, ⟨10.1007/s10772-021-09862-8⟩ ; ISSN: 1381-2416 ; EISSN: 1572-8110 ; https://hal.archives-ouvertes.fr/hal-03232723 (2021)
|
|
14 |
Preference Aggregation in the Generalised Unavailable Candidate Model
|
|
|
|
In: 7th International Conference on Algorithmic Decision Theory, University of Toulouse, Nov 2021, Toulouse, France. pp.35-50, ⟨10.1007/978-3-030-87756-9_3⟩ ; https://hal.sorbonne-universite.fr/hal-03384439 ; https://www.irit.fr/ADT2021/ (2021)
|
|
15 |
SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03372670 (2021)
|
|
16 |
Optimizing Word Alignments with Better Subword Tokenization
|
|
|
|
In: The 18th biennial conference of the International Association of Machine Translation, Aug 2021, Miami (virtual), United States ; https://hal.archives-ouvertes.fr/hal-03322842 (2021)
|
|
17 |
Representation of Explanations of Possibilistic Inference Decisions
|
|
|
|
In: Symbolic and Quantitative Approaches to Reasoning with Uncertainty ; ECSQARU 2021: European Conference on Symbolic and Quantitative Approaches with Uncertainty, Sep 2021, Prague, Czech Republic. pp.513-527, ⟨10.1007/978-3-030-86772-0_37⟩ ; https://hal-cea.archives-ouvertes.fr/cea-03406884 (2021)
|
|
18 |
Grounding Language to Autonomously-Acquired Skills via Goal Generation
|
|
|
|
In: ICLR 2021 - Ninth International Conference on Learning Representations, May 2021, Vienna / Virtual, Austria ; https://hal.inria.fr/hal-03121146 (2021)
|
|
19 |
État de l'art du changement sémantique à partir de plongements contextualisés [State of the art in semantic change from contextualized embeddings]
|
|
|
|
In: COnférence en Recherche d'Informations et Applications - CORIA 2021, French Information Retrieval Conference, Apr 2021, Grenoble (virtual), France ; https://hal.archives-ouvertes.fr/hal-03320337 (2021)
|
|
20 |
EVOLEX : la reconnaissance vocale au service du diagnostic des dysfonctionnements langagiers [EVOLEX: speech recognition in the service of diagnosing language dysfunctions]
|
|
|
|
In: Séminaire AFCP 2021 – Phonétique Clinique, May 2021, Toulouse (virtual), France ; https://hal-univ-tlse3.archives-ouvertes.fr/hal-03269242 ; http://www.afcp-parole.org/seminaire-afcp-phonetique-clinique-27-mai-2021/ (2021)
|
|