1 |
Delving Deeper into Cross-lingual Visual Question Answering ...
|
|
|
|
Abstract:
Visual question answering (VQA) is one of the crucial vision-and-language tasks. Yet, the bulk of research until recently has focused only on the English language due to the lack of appropriate evaluation resources. Previous work on cross-lingual VQA has reported poor zero-shot transfer performance of current multilingual multimodal Transformers and large gaps to monolingual performance, attributed mostly to misalignment of text embeddings between the source and target languages, without providing any additional deeper analyses. In this work, we delve deeper and address different aspects of cross-lingual VQA holistically, aiming to understand the impact of input data, fine-tuning and evaluation regimes, and interactions between the two modalities in cross-lingual setups. 1) We tackle low transfer performance via novel methods that substantially reduce the gap to monolingual English performance, yielding +10 accuracy points over existing transfer methods. 2) We study and dissect cross-lingual VQA across ...
|
|
Keywords:
Computation and Language cs.CL; FOS Computer and information sciences
|
|
URL: https://arxiv.org/abs/2202.07630 https://dx.doi.org/10.48550/arxiv.2202.07630
|
|
BASE
|
|
|
|
2 |
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages ...
|
|
|
|
BASE
|
|
|
|
3 |
Smelting Gold and Silver for Improved Multilingual AMR-to-Text Generation ...
|
|
|
|
BASE
|
|
|
|
5 |
Smelting Gold and Silver for Improved Multilingual AMR-to-Text Generation ...
|
|
|
|
BASE
|
|
|
|
6 |
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts ...
|
|
|
|
BASE
|
|
|
|
7 |
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models ...
|
|
|
|
BASE
|
|
|
|
8 |
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer ...
|
|
|
|
BASE
|
|
|
|
9 |
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models ...
|
|
|
|
BASE
|
|
|
|
10 |
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts ...
|
|
|
|
BASE
|
|
|
|
11 |
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer ...
|
|
|
|
BASE
|
|
|
|
12 |
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer
|
|
|
|
BASE
|
|
|
|
13 |
AdapterHub: A Framework for Adapting Transformers
|
|
Pfeiffer, Jonas; Rücklé, Andreas; Poth, Clifton. Association for Computational Linguistics, 2020. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations (EMNLP 2020)
|
|
BASE
|
|
|
|
14 |
Specialising Distributional Vectors of All Words for Lexical Entailment ...
|
|
|
|
BASE
|
|
|
|
15 |
Specializing distributional vectors of all words for lexical entailment
|
|
|
|
BASE
|
|
|
|
16 |
A neural autoencoder approach for document ranking and query refinement in pharmacogenomic information retrieval
|
|
|
|
BASE
|
|
|
|
|
|