1. Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers ...
2. Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs ...
3. Visually Grounded Reasoning across Languages and Cultures ...
5. Visually Grounded Reasoning across Languages and Cultures ...