1 |
Dense Contrastive Visual-Linguistic Pretraining ...
|
|
|
|
Abstract:
Inspired by the success of BERT, several multimodal representation learning approaches have been proposed that jointly represent image and text. These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining. In particular, LXMERT and UNITER adopt visual region feature regression and label classification as pretext tasks. However, they tend to suffer from the problems of noisy labels and sparse semantic annotations, based on the visual features having been pretrained on a crowdsourced dataset with limited and inconsistent semantic labeling. To overcome these issues, we propose unbiased Dense Contrastive Visual-Linguistic Pretraining (DCVLP), which replaces the region regression and classification with cross-modality region contrastive learning that requires no annotations. Two data augmentation strategies (Mask Perturbation and Intra-/Inter-Adversarial Perturbation) are developed to improve the quality of negative samples used in contrastive ...
Accepted by ACM Multimedia 2021. arXiv admin note: text overlap with arXiv:2007.13135 ...
|
|
Keywords:
Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Computer and information sciences (FOS)
|
|
URL: https://arxiv.org/abs/2109.11778 https://dx.doi.org/10.48550/arxiv.2109.11778
|
|
BASE
|
|
3 |
Guilt by Association: Emotion Intensities in Lexical Representations ...
|
|
|
|
BASE
|
|
4 |
Guilt by Association: Emotion Intensities in Lexical Representations ...
|
|
|
|
BASE
|
|
6 |
Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification ...
|
|
|
|
BASE
|
|
8 |
Incorporating Pragmatic Reasoning Communication into Emergent Language ...
|
|
|
|
BASE
|
|
9 |
Sentence Analogies: Exploring Linguistic Relationships and Regularities in Sentence Embeddings ...
|
|
|
|
BASE
|
|
10 |
Sentence Analogies: Linguistic Regularities in Sentence Embeddings ...
|
|
|
|
BASE
|
|
11 |
1055 - Domain-Specific Sentiment Lexicons Induced from Labeled Documents ...
|
|
|
|
BASE
|
|
12 |
Cross-Lingual Emotion Lexicon Induction using Representation Alignment in Low-Resource Settings ...
|
|
|
|
BASE
|
|
13 |
The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud ...
|
|
|
|
BASE
|
|
14 |
The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud
|
|
|
|
McCrae, John Philip; Chiarcos, Christian; Bond, Francis; Cimiano, Philipp; Declerck, Thierry; de Melo, Gerard; Gracia, Jorge; Hellmann, Sebastian; Klimek, Bettina; Moran, Steven; Osenova, Petya; Pareja-Lora, Antonio; Pool, Jonathan (2016). The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, 23-28 May 2016. European Language Resources Association (ELRA), 2435-2441.
|
|
BASE
|
|
16 |
Graph-based Methods for Large-Scale Multilingual Knowledge Integration ...
|
|
|
|
BASE
|
|
17 |
Graph-based Methods for Large-Scale Multilingual Knowledge Integration
|
|
de Melo, Gerard. Universität des Saarlandes, Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I, Fachrichtung 6.2 - Informatik, 2012
|
|
BASE
|
|
18 |
Graph-based Methods for Large-Scale Multilingual Knowledge Integration
|
|
de Melo, Gerard. Saarländische Universitäts- und Landesbibliothek, 2012
|
|
BASE