Monday, January 21, 2008

Visual Objects in Context

I read this amazing review by Moshe Bar on contextual processing in object recognition. Here are my notes on the non-neuro part of the review.

---------------------------------------------------------------------------------
Bar, M. Visual Objects in Context. Nature Reviews Neuroscience, 2004, 5, 617-629.

An excellent survey of cognitive psychology, neuroscience, and a few computational studies on the use of context for object recognition.

Terminology:
1) Priming: An experience-based facilitation in perceiving a physical stimulus. In a typical object priming experiment, subjects are presented with stimuli (the primes) and their performance in object naming is recorded. Subsequently, subjects are presented with either the same stimuli or stimuli that have some defined relationship to the primes. Any stimulus-specific difference in performance is taken as a measure of priming.
2) Context frames: According to a popular proposal for the contextual representation of associated objects (for cortical processing), prototypical contexts are represented as structures that integrate info about the identity of the objects that are most likely to appear in a specific scene with info about their relationships. These structures are referred to as context frames. They are also called schemata, scripts, or frames.
3) Boundary extension: A type of memory distortion in which observers report having seen not only information that was physically present in a picture, but also information that they have extrapolated outside the scene's boundaries.

Five types of relationships characterizing a scene (Biederman) : (1) Support (physically supported vs. floating); (2) Interposition (e.g., occlusion); (3) Probability (some objects are more likely than others); (4) Position (typical location of objects); (5) Size (familiar relative size of objects).

- Visual objects are contextually related if they tend to co-occur in our environment.
- Representation in brain (hypothesis) : Grouping by objects in the occipital visual cortex, by basic-level categories in the anterior temporal cortex, by contextual relations in the parahippocampal cortex, and by semantic relations in the prefrontal cortex. In addition there can be other stored relations. There is one centralized, detailed object representation that serves all these relations 'on demand'.
- An isolated object may still be easier to recognize than the same object embedded in a contextually coherent scene because of possible difficulties with (a) segmentation, and (b) attentional distraction.
- When does context get processed? Competing hypotheses : (1) So rapidly that it facilitates perceptual analysis of objects; (2) when a context frame is activated it might sensitize the representation of objects associated with that context; (3) object recognition and contextual scene analysis are functionally separate and interact only at a later stage.

From a computational standpoint, contextual representations can provide efficient generalization to new situations and shortcuts in perceptual analysis.
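
To make the "shortcut" idea concrete for myself, here is a minimal sketch of how a context frame could bias recognition. This is my own toy illustration, not a model from Bar's review: the `ContextFrame` class, the `rerank` function, and all the numbers are hypothetical. It treats a context frame as a table of how likely each object is in a scene and multiplies those priors with ambiguous appearance scores, in the spirit of hypothesis (2) above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ContextFrame:
    """Hypothetical stand-in for a context frame: a scene label plus the
    assumed probability of each object appearing in that scene."""
    scene: str
    object_prior: Dict[str, float]
    default_prior: float = 0.01  # assumed floor for objects the frame does not list

    def prior(self, obj: str) -> float:
        return self.object_prior.get(obj, self.default_prior)


def rerank(appearance_scores: Dict[str, float], frame: ContextFrame) -> Dict[str, float]:
    """Weight each appearance score by the frame's prior, then normalize to sum to 1."""
    weighted = {obj: score * frame.prior(obj) for obj, score in appearance_scores.items()}
    total = sum(weighted.values())
    return {obj: w / total for obj, w in weighted.items()}


# Made-up numbers: an ambiguous box-shaped object seen in a kitchen.
kitchen = ContextFrame("kitchen", {"kettle": 0.30, "toaster": 0.25, "mailbox": 0.01})
appearance = {"mailbox": 0.40, "toaster": 0.35, "kettle": 0.25}

posterior = rerank(appearance, kitchen)
best_without_context = max(appearance, key=appearance.get)
best_with_context = max(posterior, key=posterior.get)
print(best_without_context, "->", best_with_context)
```

With these invented scores, appearance alone slightly favors "mailbox", while the kitchen frame pushes the decision to "toaster". A real system would have to learn the priors from co-occurrence statistics and would also need the spatial relations (support, position, size) that this sketch ignores.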

2 comments:

tombone said...

Moshe Bar's work is awesome!

My favorite paper is "The Proactive Brain: Using analogies and associations to generate predictions."

http://barlab.mgh.harvard.edu/papers/TICS2007.pdf

VJ said...

I am surprised to see that I actually missed Tom's comment. I have not read this paper, but it is on my TO-READ list ;)

 