1 | FLAVA: A Foundational Language And Vision Alignment Model ... | BASE
3 | Visual Coreference Resolution in Visual Dialog using Neural Module Networks ... | BASE
5 | Learning to Reason: End-to-End Module Networks for Visual Question Answering ... | BASE
8 | Modeling Relationships in Referential Expressions with Compositional Modular Networks ... | BASE
12 | Combining visual recognition and computational linguistics : linguistic knowledge for visual recognition and natural language descriptions of visual content ... | BASE
13 | Combining visual recognition and computational linguistics : linguistic knowledge for visual recognition and natural language descriptions of visual content | BASE