Transformation of shape information in the ventral pathway

https://doi.org/10.1016/j.conb.2007.03.002

Object perception seems effortless to us, but it depends on intensive neural processing across multiple stages in ventral pathway visual cortex. Shape information at the retinal level is hopelessly complex, variable and implicit. The ventral pathway must somehow transform retinal signals into much more compact, stable and explicit representations of object shape. Recent findings highlight key aspects of this transformation: higher-order contour derivatives, structural representation in object-based coordinates, composite shape tuning dimensions, and long-term storage of object knowledge. These coding principles could help to explain our remarkable ability to perceive, distinguish, remember and understand a virtual infinity of objects.

Introduction

The world is familiar and comprehensible to us because we recognize and understand the objects it contains. We identify, distinguish and evaluate objects based on their shapes, which range from simple (letters and numbers) to extremely complex (faces). This is one of the most computationally daunting tasks the brain performs, owing to the complexity and variability of the input data (retinal images of objects) and the high dimensionality of object shape. It seems trivial or even transparent to us only because one of the two major pathways in visual cortex [1] is dedicated to continuously processing object information with extraordinary accuracy and rapidity [2••]. This object-processing pathway (also known as the ventral, temporal or ‘what’ pathway) transforms retinal signals into object representations that are explicit enough to support our vivid appreciation of object structure, compact enough to be stored in memory, and stable enough to generalize across different viewing conditions (Figure 1). The neural algorithms that make this possible are not yet understood, and it has so far proven difficult to duplicate human visual performance with computer vision systems.

The ventral pathway runs from primary visual cortex (V1) and secondary visual cortex (V2) through area V4 and then into a series of further processing stages in ventral occipitotemporal cortex. In humans, these further stages include area V8 [3] (alternatively labeled V4 by some authors), the lateral occipital complex [4], and parts of the fusiform and parahippocampal gyri [5, 6]. The general functionality of these areas can be studied using functional magnetic resonance imaging (fMRI), which has revealed, for example, regions specialized for face, body and scene processing [6, 7, 8]. Algorithmic-level shape processing can be studied using electrode recordings in monkeys, which have similar visual capacities and a highly analogous organization of ventral visual cortex (V1, V2, V4 and multiple stages in inferotemporal cortex [IT], including face and body patches) [9]. Here, we review human and monkey experiments from the past two years that shed light on how the ventral pathway transforms retinal signals into useful object representations.

Higher-order contour derivatives

It is well established that the first stage in the ventral pathway transformation involves the extraction of local orientation and spatial frequency information [10]. Orientation is a first-order derivative that efficiently encodes elongated contrast regions that correspond to object contours. As a result, the transformation that occurs in V1 maximizes sparseness (minimizes the number of active neurons) in the representation of natural images [11, 12]. This constitutes a major step towards more
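The statement that orientation is a first-order derivative can be illustrated with a small numerical sketch. The code below is not a model of V1 and does not come from the review; it assumes a tiny synthetic luminance image, uses central differences as the derivative estimate, and the function name `local_orientation` is hypothetical. It shows only that the first derivatives of luminance at a point are enough to recover the orientation of a contour passing through it.

```python
import math

def local_orientation(image, row, col):
    """Estimate contour orientation at (row, col) from first-order derivatives.

    `image` is a list of rows of luminance values (an assumed toy format).
    Returns (gradient magnitude, contour angle in degrees within [0, 180)).
    """
    # Central differences approximate the horizontal and vertical derivatives.
    gx = (image[row][col + 1] - image[row][col - 1]) / 2.0
    gy = (image[row + 1][col] - image[row - 1][col]) / 2.0
    magnitude = math.hypot(gx, gy)
    # The gradient points across the contour, so the contour itself
    # runs perpendicular to the gradient direction.
    gradient_angle = math.degrees(math.atan2(gy, gx))
    contour_angle = (gradient_angle + 90.0) % 180.0
    return magnitude, contour_angle

# Synthetic 5x5 image with a vertical edge: dark left, bright right.
vertical_edge = [[0, 0, 0, 1, 1] for _ in range(5)]
# Synthetic 5x5 image with a horizontal edge: dark top, bright bottom.
horizontal_edge = [[0] * 5, [0] * 5, [0] * 5, [1] * 5, [1] * 5]

if __name__ == "__main__":
    print(local_orientation(vertical_edge, 2, 2))
    print(local_orientation(horizontal_edge, 2, 2))
```

At the centre pixel, the vertical edge yields a contour angle of about 90 degrees and the horizontal edge about 0 degrees, each with a gradient magnitude of 0.5.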

Structural representation in object-based coordinates

Many theories of shape processing [27, 28] are based on the idea of structural representation, that is, shape description in terms of object parts and their positional and connectional relationships. Structural codes are compact, because even complex shapes comprise a manageable number of parts. Structural codes are highly generative, because even a limited basis set of different elements can be combined in a very large number of ways. Thus, a finite number of neurons that encode object parts can represent a
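The generative capacity of a structural code can be made concrete with a rough count. The sketch below is an illustration only: it assumes each object is an ordered set of parts, that each part independently takes one of a fixed number of part types and one of a fixed number of relative positions, and the numbers used are arbitrary rather than taken from any experiment. The function name `count_structural_codes` is hypothetical.

```python
def count_structural_codes(num_part_types, num_relations, num_parts):
    """Upper bound on distinct shapes under a simplified part-based code.

    Assumes each of `num_parts` slots independently picks one part type
    and one relative position, so orderings are counted separately.
    """
    choices_per_part = num_part_types * num_relations
    return choices_per_part ** num_parts

if __name__ == "__main__":
    # Arbitrary illustrative values: 20 part types, 10 positions, 5 parts.
    print(count_structural_codes(20, 10, 5))
```

With these assumed values, a vocabulary of only 200 part-position combinations yields 200^5, or 3.2 × 10^11, possible five-part configurations, which is the sense in which a small basis set is highly generative.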

Composite tuning dimensions for face representation

Further integration can produce composite tuning dimensions that summarize large amounts of geometric detail. For example, our exquisite expertise in discriminating faces must depend on high-level neurons that are sensitive to complex combinations of simpler structural parameters [34]. Face perception is so specialized that face-selective regions of cortex can be identified at the gross anatomical level. These regions seem to be truly specialized for faces, and not just for general

Long-term storage of object knowledge

Face perception is the extreme example of lifelong shape learning, but other aspects of object perception likewise depend on long-term memory and continual calibration. Object perception is highly inferential: the visual system learns inductive principles from the environment, taking advantage of its peculiar properties to optimize coding. For example, experience teaches that the most common lighting direction is from above, and we take advantage of that prior knowledge to derive

Concluding remarks

Transformation of object shape information in the ventral pathway is one of the most computationally complex tasks the brain performs. Correspondingly, it is one of the most difficult processes to understand. At present, we have only superficial knowledge of how object representations become compact, explicit and stable enough to support our remarkable perceptual abilities. At intermediate stages of processing, we know that transformations are geometric in nature, and include extraction of

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

This work was supported by the National Eye Institute and by the Pew Scholars Program in the Biomedical Sciences.

References (57)

  • T. Vetter et al.

    The importance of symmetry and virtual views in three-dimensional object recognition

    Curr Biol

    (1994)
  • Y. Naya et al.

    Forward processing of long-term associative memory in monkey inferotemporal cortex

    J Neurosci

    (2003)
  • L.G. Ungerleider et al.

    Two cortical visual systems

  • C.P. Hung et al.

    Fast readout of object identity from macaque inferior temporal cortex

    Science

    (2005)
  • N. Hadjikhani et al.

    Retinotopy and color sensitivity in human visual cortical area V8

    Nat Neurosci

    (1998)
  • R. Malach et al.

    Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex

    Proc Natl Acad Sci USA

    (1995)
  • N. Kanwisher et al.

    The fusiform face area: a module in human extrastriate cortex specialized for face perception

    J Neurosci

    (1997)
  • R. Epstein et al.

    A cortical representation of the local visual environment

    Nature

    (1998)
  • R.F. Schwarzlose et al.

    Separate face and body selectivity on the fusiform gyrus

    J Neurosci

    (2005)
  • P.E. Downing et al.

    A cortical area selective for visual processing of the human body

    Science

    (2001)
  • D.Y. Tsao et al.

    Faces and objects in macaque cerebral cortex

    Nat Neurosci

    (2003)
  • D.H. Hubel et al.

    Receptive fields and functional architecture of monkey striate cortex

    J Physiol

    (1968)
  • B.A. Olshausen et al.

    Emergence of simple-cell receptive field properties by learning a sparse code for natural images

    Nature

    (1996)
  • W.E. Vinje et al.

    Sparse coding and decorrelation in primary visual cortex during natural vision

    Science

    (2000)
  • A. Pasupathy et al.

    Responses to contour features in macaque area V4

    J Neurophysiol

    (1999)
  • J.L. Gallant et al.

    Selectivity for polar, hyperbolic, and Cartesian gratings in macaque visual cortex

    Science

    (1993)
  • J. Hegde et al.

    A comparative study of shape representation in macaque visual areas V2 and V4

    Cereb Cortex

    (2007)
  • M. Ito et al.

    Representation of angles embedded within contour stimuli in area V2 of macaque monkeys

    J Neurosci

    (2004)