Elsevier

Neural Networks

Volume 18, Issues 5–6, July–August 2005, Pages 620-627

2005 Special Issue
Modelling divided visual attention with a winner-take-all network

https://doi.org/10.1016/j.neunet.2005.06.015

Abstract

Experimental evidence on the distribution of visual attention supports the idea of a spatial saliency map, whereby bottom-up and top-down influences on attention are integrated by a winner-take-all mechanism. We implement this map with a continuous attractor neural network, and test the ability of our model to explain experimental evidence on the distribution of spatial attention. The majority of evidence supports the view that attention is unitary, but recent experiments provide evidence for split attentional foci. We simulate two such experiments. Our results suggest that the ability to divide attention depends on sustained endogenous signals from short term memory to the saliency map, stressing the interplay between working memory mechanisms and attention.

Introduction

Attention is an old concept in psychology correlated with enhanced processing of objects or regions in space (Posner, Snyder, & Davidson, 1980). While attention is a multi-modal phenomenon (Cherry, 1953; Zelano et al., 2004), the majority of research has focused on selective visual attention (SVA). The limited capacity of the visual system necessitates a mechanism to select stimuli from the visual field, and Tsotsos pointed out that attention solves the complexity problem of sensory processing (Tsotsos, 1992).

A distinction can be drawn between pre-attentive and attentive visual processing (Neisser, 1967). Pre-attentive processing refers to bottom-up (BU) feature saliency of visual stimuli whereby items that differ from their surroundings ‘pop out’ to the viewer. Attentive processing refers to top-down (TD) influences on perception of stimuli determined by object and locational bias such as task instructions or foreknowledge of stimulus characteristics. Determining saliency, then, is both a BU and TD requirement, and computational models of SVA include maps that integrate BU salience across object features (Koch & Ullman, 1985), TD bias (Treisman, 1998), and the interplay of both (Deco, Pollatos, & Zihl, 2002; Wolfe, 1994).

Koch and Ullman (1985) provide a neural network model of SVA in which topographic feature maps are integrated by a winner-take-all (WTA) saliency map of BU stimuli. In their model, inhibiting the selected location causes a shift to the next most salient location. Wolfe (1994) builds on Neisser's pre-attentive/attentive distinction (Neisser, 1967), integrating BU and TD saliency criteria in his Guided Search model. Treisman (1998) provides a model of spatial attention to solve the Binding Problem, in which a TD saliency map determines object features selected for further processing, and suggests parietal cortex as the biological correlate of her ‘master’ map. Deco et al. (2002) use inhibition to mediate BU and TD influences in an instantiation of Duncan and Humphreys' biased competition model (Duncan & Humphreys, 2002), simulating saliency in posterior parietal cortex (PP) with a Continuous Attractor Neural Network (CANN). Spatial saliency in PP interacts with BU feature maps to converge on a winning location. See Shipp (2004) and Itti & Koch (2001) for a review of these and other models.
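The shift of attention by inhibition of the selected location can be illustrated with a toy sketch. This is not Koch and Ullman's network implementation: the function name `scan_path`, the 1D layout, the unweighted sum of feature maps, and the `inhibition_radius` parameter are all assumptions made for illustration.

```python
import numpy as np

def scan_path(feature_maps, n_shifts, inhibition_radius=1):
    """Return the sequence of attended locations on a 1D saliency map."""
    # Assumption: saliency is the unweighted sum of the bottom-up feature maps.
    saliency = np.sum(feature_maps, axis=0).astype(float)
    visited = []
    for _ in range(n_shifts):
        winner = int(np.argmax(saliency))   # winner-take-all selection
        visited.append(winner)
        lo = max(0, winner - inhibition_radius)
        hi = min(saliency.size, winner + inhibition_radius + 1)
        saliency[lo:hi] = -np.inf           # inhibit the selected location
    return visited

# Two hypothetical feature maps over eight locations.
colour = np.array([0.1, 0.9, 0.1, 0.2, 0.1, 0.3, 0.1, 0.1])
orientation = np.array([0.1, 0.2, 0.1, 0.1, 0.1, 0.6, 0.1, 0.1])
print(scan_path([colour, orientation], n_shifts=2))  # prints [1, 5]
```

Once the most salient location (index 1) is suppressed, the maximum moves to the next most salient location (index 5), which is the qualitative behaviour described above.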

There is long-standing debate about the distribution of SVA. Many cognitive models propose a unitary focus of attention, likened to a roving spotlight over the visual field (Posner et al., 1980). Variants of the spotlight metaphor include gradient (Downing & Pinker, 1985; LaBerge & Brown, 1989) and zoom lens (Eriksen & James, 1986) models, suggesting that attention may be a graded phenomenon, attenuated around a central focus. A large body of evidence supports such unitary models (McCormick, Klein, & Johnston, 1998; Posner et al., 1980), but several more recent experiments have provided evidence for non-contiguous allocation of SVA (Awh & Pashler, 2000; Hahn & Kramer, 1998; Müller, Malinowski, Gruber, & Hillyard, 2003).

Here, we study how split attention can be achieved by a dynamic implementation of a WTA map. Despite their WTA nature, CANNs are able to account for split attention when network dynamics facilitate long transition states between regimes (Trappenberg & Standage, 2005) and when dominated by sustained inputs (Standage, Trappenberg, & Klein, 2005). We simulate the experiments of Müller et al. (2003) with a 1-dimensional (1D) CANN model. We build on simulations presented in Standage et al. (2005) that use a narrow weight profile, facilitating steeply sloped regions of activity that occupy a small portion of the network. Because we do not know the size of the active region of PP and its relation to coordinates in the visual field, we run similar experiments with a wide weight profile, resulting in activity that spans the majority of the network. We demonstrate that the ability of the model to account for divided attention does not depend on fine tuning this network parameter.

We simulate two experiments by Awh and Pashler (2000) with a 2-dimensional (2D) CANN model, demonstrating how the model accounts for their finding of divided attention in one experiment and unitary attention in the other. Preliminary simulations in 1D are reported in Standage et al. (2005). Our simulations are consistent with their experimental findings, but our model offers an alternative conclusion.

Section snippets

Methods

In 1D and 2D simulations, we use a fully connected recurrent rate model with $N$ nodes, where $N = N_x N_y$. We model only PP from the model by Deco et al. (2002). WTA is implemented by local cooperation and long distance competition in the laterally connected network. The average state $u_i$ of a node with index $i$ is given by

$$\tau \frac{du_i(t)}{dt} = -u_i(t) + \sum_j w_{ij}\, r_j(t)\, a + I_i^{\mathrm{ext}}(t),$$

where $\tau$ is a time constant, $I_i^{\mathrm{ext}}$ is external input to the network, $a = 2\pi/N_x$ is a scale factor, and $r_i$ is a normalized square of $u_i$ given by $g(u_i) = u$
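The rate dynamics described above can be sketched numerically for the 1D case. This is a minimal illustration under stated assumptions, not the authors' implementation: the leaky-integrator form with a decay term, the Gaussian-minus-constant weight profile, the Gaussian sustained input, the Euler integration step, all parameter values, and the function name `simulate_cann` are hypothetical. The exact form of the gain function is cut off in the text, so a rectified square with divisive normalization is assumed here.

```python
import numpy as np

def simulate_cann(n_nodes=100, sigma=0.3, inhibition=0.5, mu=1.0,
                  tau=10.0, dt=1.0, n_steps=500, stim_index=50):
    """Euler-integrate tau du/dt = -u + a * W @ r + I_ext on a 1D ring."""
    a = 2 * np.pi / n_nodes                  # scale factor a = 2*pi/N_x
    x = a * np.arange(n_nodes)               # node positions on the ring
    d = np.abs(x[:, None] - x[None, :])
    d = np.minimum(d, 2 * np.pi - d)         # circular distance between nodes
    # Assumed weight profile: local excitation minus uniform inhibition.
    w = np.exp(-d**2 / (2 * sigma**2)) - inhibition
    # Assumed sustained external input: Gaussian bump centred on stim_index.
    i_ext = np.exp(-d[stim_index]**2 / (2 * sigma**2))
    u = np.zeros(n_nodes)
    for _ in range(n_steps):
        u_pos = np.maximum(u, 0.0)
        # Assumed gain g: rectified square with divisive normalization.
        r = u_pos**2 / (1.0 + mu * a * np.sum(u_pos**2))
        u = u + (dt / tau) * (-u + a * (w @ r) + i_ext)
    return u

u = simulate_cann()
print(int(np.argmax(u)), float(u.max()))
```

In this sketch the `sigma` argument sets the width of the weight profile, so narrow and wide profiles can be compared by rerunning with different values.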

Simulations

Müller et al. (2003) provide evidence for sustained division of visual attention by recording steady state visual evoked potentials (SSVEP) while subjects viewed a horizontal array of four stimulus elements following instructions to attend to two locations. On separate blocks of trials, subjects attended to adjacent and separated positions. The SSVEP is the electrophysiological response in visual cortex to a rapidly flickering stimulus, and has been shown to increase in amplitude when attention

Discussion

In our simulations of Müller's experiments, we address a parametric issue raised in Standage et al. (2005). By increasing the width of the model's weight profile, and comparing network output with similar experiments in Standage et al. (2005), we show that under transient input, the model outputs a flatter bubble, predicting that the magnitude of subjects' attention in Müller's adjacent trials was more evenly distributed across attended locations than predicted in Standage et al. (2005). This

Conclusions

The model of SVA by Deco et al. (2002) implements a saliency map in PP with a CANN. This instantiation of biased competition (Duncan & Humphreys, 2002) integrates BU and TD influences in a biologically realistic computational architecture. Our simulations test this promising model's ability to explain behavioural and physiological evidence on the spatial distribution of SVA.

Our results demonstrate that CANNs provide a model of spatial attention in PP capable of explaining divergent

Acknowledgements

This work was supported in part by the NSERC grant RGPIN 249885-03.

References (27)

  • G. Deco et al. The time course of selective visual attention: Theory and experiments. Vision Research (2002)
  • S. Shipp. The brain circuitry of attention. Trends in Cognitive Sciences (2004)
  • E. Awh et al. Evidence for split attentional foci. Journal of Experimental Psychology, Human Perception and Performance (2000)
  • E. Cherry. Some experiments on the recognition of speech, with one and with two ears. The Journal of the Acoustical Society of America (1953)
  • C.L. Colby et al. Space and attention in parietal cortex. Annual Review of Neuroscience (1999)
  • N. Cowan. The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences (2001)
  • S. Deneve et al. Divisive normalization, line attractor networks and ideal observers. Advances in Neural Processing Systems (1999)
  • C. Downing et al. The spatial structure of visual attention
  • J. Duncan et al. Visual search and stimulus similarity. Psychological Review (2002)
  • C. Eriksen et al. Visual attention within and around the field of focal attention: A zoom lens model. Perception and Psychophysics (1986)
  • S. Funahashi et al. Mnemonic coding of visual space in the monkey's dorsolateral prefrontal cortex. Journal of Neurophysiology (1989)
  • S. Hahn et al. Further evidence for the division of attention between non-contiguous locations. Visual Cognition (1998)
  • L. Itti et al. Computational modelling of visual attention. Nature Reviews: Neuroscience (2001)

An abbreviated version of some portions of this article appeared in Standage, Trappenberg, and Klein (2005), published under the IEEE copyright.
