6 |
Multilingual and Cross-Lingual Intent Detection from Spoken Data ...
|
|
Gerz, Daniela; Su, Pei-Hao; Kusztos, Razvan; Mondal, Avishek; Lis, Michał; Singhal, Eshan; Mrkšić, Nikola; Wen, Tsung-Hsien; Vulić, Ivan. arXiv, 2021
|
|
Abstract:
We present a systematic study on multilingual and cross-lingual intent detection from spoken data. The study leverages a new resource put forth in this work, termed MInDS-14, a first training and evaluation resource for the intent detection task with spoken data. It covers 14 intents extracted from a commercial system in the e-banking domain, associated with spoken examples in 14 diverse language varieties. Our key results indicate that combining machine translation models with state-of-the-art multilingual sentence encoders (e.g., LaBSE) can yield strong intent detectors in the majority of target languages covered in MInDS-14, and offer comparative analyses across different axes: e.g., zero-shot versus few-shot learning, translation direction, and impact of speech recognition. We see this work as an important step towards more inclusive development and evaluation of multilingual intent detectors from spoken data, in a much wider spectrum of languages compared to prior work. ...
|
|
Keywords:
Computation and Language cs.CL; FOS Computer and information sciences
|
|
URL: https://arxiv.org/abs/2104.08524 https://dx.doi.org/10.48550/arxiv.2104.08524
|
|
BASE
|
|
7 |
Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems ...
|
|
BASE
|
|
8 |
Modelling Latent Translations for Cross-Lingual Transfer ...
|
|
BASE
|
|
9 |
Prix-LM: Pretraining for Multilingual Knowledge Base Construction ...
|
|
BASE
|
|
10 |
Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking ...
|
|
BASE
|
|
12 |
On Cross-Lingual Retrieval with Multilingual Text Encoders ...
|
|
BASE
|
|
13 |
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models ...
|
|
BASE
|
|
14 |
Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval ...
|
|
BASE
|
|
15 |
RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models ...
|
|
BASE
|
|
16 |
Parameter space factorization for zero-shot learning across tasks and languages ...
|
|
BASE
|
|
18 |
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models ...
|
|
BASE
|
|
19 |
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts ...
|
|
BASE
|
|
20 |
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models ...
|
|
BASE
|
|
|
|