Transformation of shape information in the ventral pathway
Introduction
The world is familiar and comprehensible to us because we recognize and understand the objects it contains. We identify, distinguish and evaluate objects based on their shapes, which range from simple (letters and numbers) to extremely complex (faces). This is one of the most computationally daunting tasks the brain performs, owing to the complexity and variability of the input data (retinal images of objects) and the high dimensionality of object shape. It seems trivial or even transparent to us only because one of the two major pathways in visual cortex [1] is dedicated to continuously processing object information with extraordinary accuracy and speed [2••]. This object-processing pathway (also known as the ventral, temporal or ‘what’ pathway) transforms retinal signals into object representations that are explicit enough to support our vivid appreciation of object structure, compact enough to be stored in memory, and stable enough to generalize across different viewing conditions (Figure 1). The neural algorithms that make this possible are not yet understood, and it has so far proven difficult to duplicate human visual performance in computer vision systems.
The ventral pathway runs from primary visual cortex (V1) and secondary visual cortex (V2) through area V4 and then into a series of further processing stages in ventral occipitotemporal cortex. In humans, these further stages include area V8 [3] (alternatively labeled V4 by some authors), the lateral occipital complex [4], and parts of the fusiform and parahippocampal gyri [5, 6]. The general functionality of these areas can be studied using functional magnetic resonance imaging (fMRI), which has revealed, for example, regions specialized for face, body and scene processing [6, 7, 8]. Algorithmic-level shape processing can be studied using electrode recordings in monkeys, which have similar visual capacities and a highly analogous organization of ventral visual cortex (V1, V2, V4 and multiple stages in inferotemporal cortex [IT], including face and body patches) [9]. Here, we review human and monkey experiments from the past two years that shed light on how the ventral pathway transforms retinal signals into useful object representations.
Higher-order contour derivatives
It is well-established that the first stage in the ventral pathway transformation involves the extraction of local orientation and spatial frequency information [10]. Orientation is a first-order derivative that efficiently encodes elongated contrast regions that correspond to object contours. As a result, the transformation that occurs in V1 maximizes sparseness (minimizes the number of active neurons) in the representation of natural images [11, 12]. This constitutes a major step towards more
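As a rough illustration of how a first-order derivative can signal contour orientation, the following minimal sketch estimates local orientation from finite-difference gradients on a toy image containing one vertical edge. It is not a model of V1: the function name `local_orientation`, the toy image and the use of simple central differences are all assumptions made for illustration.

```python
import math

def local_orientation(image):
    """Return {(row, col): (magnitude, orientation_deg)} for interior pixels
    whose central-difference gradient is nonzero (hypothetical helper)."""
    responses = {}
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0  # horizontal derivative
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0  # vertical derivative
            mag = math.hypot(gx, gy)
            if mag > 0:
                # The contour runs perpendicular to the gradient direction.
                theta = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
                responses[(r, c)] = (mag, theta)
    return responses

# Toy 6x6 image (assumed): dark left half, bright right half.
image = [[0.0] * 3 + [1.0] * 3 for _ in range(6)]
responses = local_orientation(image)

interior_pixels = (len(image) - 2) * (len(image[0]) - 2)
active_fraction = len(responses) / interior_pixels

for position, (mag, theta) in sorted(responses.items()):
    print(position, round(mag, 2), theta)
print("active fraction:", active_fraction)
```

In this toy case only the pixels flanking the edge respond, and all of them report the same orientation, which is the sense in which an orientation code concentrates activity on contours.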
Structural representation in object-based coordinates
Many theories of shape processing [27, 28] are based on the idea of structural representation, that is, shape description in terms of object parts and their positional and connectional relationships. Structural codes are compact, because even complex shapes comprise a manageable number of parts. They are also highly generative, because even a limited basis set of elements can be combined in a very large number of ways. Thus, a finite number of neurons that encode object parts can represent a
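The compactness and generativity of a parts-based code can be made concrete with a small counting sketch. The part types, the object-centered positions and the binary population code below are hypothetical choices made for illustration, not a description of any measured neural population.

```python
from itertools import product

# Hypothetical basis set: three part types at four object-centered positions.
PART_TYPES = ("convex", "concave", "straight")
POSITIONS = ("top", "right", "bottom", "left")

# One model unit per (part type, position) conjunction.
units = [(part, pos) for pos in POSITIONS for part in PART_TYPES]

def encode(shape):
    """Map a shape (dict: position -> part type) to a binary population vector."""
    return [1 if shape.get(pos) == part else 0 for (part, pos) in units]

# Enumerate every shape that assigns one part type to each position.
all_shapes = [
    dict(zip(POSITIONS, combo))
    for combo in product(PART_TYPES, repeat=len(POSITIONS))
]
codes = {tuple(encode(shape)) for shape in all_shapes}

print("units:", len(units))
print("distinct shapes:", len(all_shapes))
print("distinct codes:", len(codes))
```

Under these assumptions, 12 units suffice to give each of the 81 shapes its own pattern, with only four units active per shape. The number of representable shapes grows exponentially with the number of positions, whereas the number of units grows only linearly.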
Composite tuning dimensions for face representation
Further integration can produce composite tuning dimensions that summarize large amounts of geometric detail. For example, our exquisite expertise in discriminating faces must depend on high-level neurons that are sensitive to complex combinations of simpler structural parameters [34]. Face perception is so specialized that face-selective regions of cortex can be identified at the gross anatomical level. These regions seem to be truly specialized for faces, and not just for general
Long-term storage of object knowledge
Face perception is the extreme example of lifelong shape learning, but other aspects of object perception likewise depend on long-term memory and continual calibration. Object perception is highly inferential: the visual system learns inductive principles from the environment, taking advantage of its peculiar properties to optimize coding. For example, experience teaches that the most common lighting direction is from above, and we take advantage of that prior knowledge to derive
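One way to read the lighting example is as Bayesian inference with a learned prior. The sketch below assumes an ambiguous shading pattern that is bright on top, which could arise either from a convex surface lit from above or from a concave surface lit from below. The function name `posterior_convex` and the prior values are assumptions chosen for illustration, not measured quantities.

```python
def posterior_convex(p_light_above, p_convex=0.5):
    """Posterior probability that a top-bright shading pattern is convex.

    Assumes only two explanations of the image are possible:
    (convex, light from above) or (concave, light from below).
    """
    joint_convex = p_convex * p_light_above
    joint_concave = (1.0 - p_convex) * (1.0 - p_light_above)
    return joint_convex / (joint_convex + joint_concave)

# With no lighting prior the image stays ambiguous; an assumed
# light-from-above prior of 0.8 shifts the interpretation toward convex.
print(posterior_convex(0.5))
print(posterior_convex(0.8))
```

With a flat prior on shape, the posterior simply mirrors the lighting prior, so the strength of the learned lighting assumption directly determines how confidently the ambiguous image is resolved.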
Concluding remarks
Transformation of object shape information in the ventral pathway is one of the most computationally complex tasks the brain performs. Correspondingly, it is one of the most difficult processes to understand. At present, we have only superficial knowledge of how object representations become compact, explicit and stable enough to support our remarkable perceptual abilities. At intermediate stages of processing, we know that transformations are geometric in nature, and include extraction of
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
This work was supported by the National Eye Institute and by the Pew Scholars Program in the Biomedical Sciences.
References (57)
- et al. Feature analysis in early vision: evidence from search asymmetries. Psychol Rev (1988)
- et al. Curvature is a basic feature for visual search tasks. Perception (1992)
- et al. Population coding of shape in area V4. Nat Neurosci (2002)
- et al. Conjunction and linear non-separability effects in visual shape encoding. Vision Res (2000)
- et al. Evaluation of a shape-based model of human face discrimination using FMRI and behavioral techniques. Neuron (2006)
- et al. Can generic expertise explain special processing for faces? Trends Cogn Sci (2007)
- et al. Face perception: domain specific, not process specific. Neuron (2004)
- et al. A cortical region consisting entirely of face-selective cells. Science (2006)
- et al. Microstimulation of inferotemporal cortex influences face categorization. Nature (2006)
- et al. Visual expertise with nonface objects leads to competition with the early perceptual processing of faces in the human occipitotemporal cortex. Proc Natl Acad Sci USA (2004)