Distributional semantics of objects in visual scenes in comparison to text
Abstract:
The distributional hypothesis states that the meaning of a concept is defined through the contexts it occurs in. In practice, word co-occurrence and proximity in text corpora are often analyzed for a given word to obtain a real-valued semantic word vector, which is taken to encode (at least partially) the meaning of that word. Here we transfer this idea from text to images, where pre-assigned labels of other objects or activations of convolutional neural networks serve as context. We propose a simple algorithm that extracts and processes object contexts from an image database and yields semantic vectors for objects. We show empirically that these representations perform on par with state-of-the-art distributional models over a set of conventional objects. For this evaluation we employ well-known word benchmarks in addition to a newly proposed object-centric benchmark. ; peerReviewed
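
The label-based variant of the idea in the abstract can be illustrated with a minimal sketch. This is not the algorithm from the paper: it assumes that the context of an object is simply the set of other labels in the same image, and it uses positive pointwise mutual information (PPMI) weighting, a common choice in text-based distributional models that is swapped in here for illustration. The function names and the toy image labels are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def cooccurrence_vectors(images):
    """Build PPMI-weighted context vectors from per-image label lists.

    Assumption (not from the paper): two objects are in each other's
    context whenever they are labeled in the same image.
    """
    pair_counts = Counter()  # (object, context) -> number of images with both
    obj_counts = Counter()   # object -> total number of (object, context) pairs
    for labels in images:
        for a, b in permutations(set(labels), 2):
            pair_counts[(a, b)] += 1
            obj_counts[a] += 1
    total = sum(pair_counts.values())
    vocab = sorted(obj_counts)
    vectors = {}
    for a in vocab:
        row = []
        for b in vocab:
            n_ab = pair_counts[(a, b)]
            if n_ab == 0:
                row.append(0.0)
                continue
            # PMI = log( P(a, b) / (P(a) * P(b)) ), clipped at zero (PPMI).
            pmi = math.log(n_ab * total / (obj_counts[a] * obj_counts[b]))
            row.append(max(pmi, 0.0))
        vectors[a] = row
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Hypothetical toy "image database": each entry lists the labels in one image.
images = [
    ["cup", "table", "spoon"],
    ["mug", "table", "spoon"],
    ["cup", "table", "plate"],
    ["mug", "table", "plate"],
    ["car", "road", "tree"],
    ["bus", "road", "tree"],
]
vocab, vectors = cooccurrence_vectors(images)
sim_cup_mug = cosine(vectors["cup"], vectors["mug"])
sim_cup_car = cosine(vectors["cup"], vectors["car"])
```

In this toy data, "cup" and "mug" never appear in the same image but share the same contexts, so their vectors end up closer to each other than to the vector for "car". That is the distributional intuition carried over from words to objects.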
Keywords:
530; Computer vision; Distributional hypothesis; Object semantics; Semantics; Vision and language
URL: http://resolver.sub.uni-goettingen.de/purl?gs-1/16274 https://doi.org/10.1016/j.artint.2018.12.009