Modelling divided visual attention with a winner-take-all network

doi:10.1016/j.neunet.2005.06.015

Neural Networks

Volume 18, Issues 5–6, July–August 2005, Pages 620-627

https://doi.org/10.1016/j.neunet.2005.06.015

Abstract

Experimental evidence on the distribution of visual attention supports the idea of a spatial saliency map, whereby bottom-up and top-down influences on attention are integrated by a winner-take-all mechanism. We implement this map with a continuous attractor neural network, and test the ability of our model to explain experimental evidence on the distribution of spatial attention. The majority of evidence supports the view that attention is unitary, but recent experiments provide evidence for split attentional foci. We simulate two such experiments. Our results suggest that the ability to divide attention depends on sustained endogenous signals from short-term memory to the saliency map, stressing the interplay between working memory mechanisms and attention.

Introduction

Attention is an old concept in psychology correlated with enhanced processing of objects or regions in space (Posner, Snyder, & Davidson, 1980). While attention is a multi-modal phenomenon (Cherry, 1953; Zelano et al., 2004), the majority of research has focused on selective visual attention (SVA). The limited capacity of the visual system necessitates a mechanism to select stimuli from the visual field, and Tsotsos pointed out that attention solves the complexity problem of sensory processing (Tsotsos, 1992).

A distinction can be drawn between pre-attentive and attentive visual processing (Neisser, 1967). Pre-attentive processing refers to bottom-up (BU) feature saliency of visual stimuli whereby items that differ from their surroundings ‘pop out’ to the viewer. Attentive processing refers to top-down (TD) influences on perception of stimuli determined by object and locational bias such as task instructions or foreknowledge of stimulus characteristics. Determining saliency, then, is both a BU and TD requirement, and computational models of SVA include maps that integrate BU salience across object features (Koch & Ullman, 1985), TD bias (Treisman, 1998), and the interplay of both (Deco, Pollatos, & Zihl, 2002; Wolfe, 1994).

Koch and Ullman (1985) provide a neural network model of SVA in which topographic feature maps are integrated by a winner-take-all (WTA) saliency map of BU stimuli. In their model, inhibiting the selected location causes a shift to the next most salient location. Wolfe (1994) builds on Neisser's pre-attentive/attentive distinction (Neisser, 1967), integrating BU and TD saliency criteria in his Guided Search model. Treisman (1998) provides a model of spatial attention to solve the Binding Problem, in which a TD saliency map determines object features selected for further processing, and suggests parietal cortex as the biological correlate of her ‘master’ map. Deco et al. (2002) use inhibition to mediate BU and TD influences in an instantiation of Duncan and Humphreys' biased competition model (Duncan & Humphreys, 2002), simulating saliency in posterior parietal cortex (PP) with a Continuous Attractor Neural Network (CANN). Spatial saliency in PP interacts with BU feature maps to converge on a winning location. See Shipp (2004) and Itti and Koch (2001) for reviews of these and other models.
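The selection-then-inhibition scheme attributed to Koch and Ullman (1985) above can be illustrated with a toy example. This is only a minimal sketch, not their network: the function name `scan_order`, the list-based saliency map and its values are hypothetical, and inhibition is modelled crudely by removing the winner from further competition.

```python
def scan_order(saliency, n_shifts):
    """Return the order in which locations win a winner-take-all competition
    when each winner is inhibited after being selected (illustrative only)."""
    s = list(saliency)  # copy so the caller's map is left unchanged
    order = []
    for _ in range(n_shifts):
        # Winner-take-all step: the most salient remaining location wins.
        winner = max(range(len(s)), key=lambda i: s[i])
        order.append(winner)
        # Inhibit the selected location so the next most salient one can win.
        s[winner] = float("-inf")
    return order


# Hypothetical saliency values for four locations.
saliency_map = [0.2, 0.9, 0.4, 0.7]
print(scan_order(saliency_map, 3))  # [1, 3, 2]
```

Attention here visits locations in decreasing order of saliency, which is the serial shift of a single focus that the unitary models discussed below assume.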

There is long-standing debate about the distribution of SVA. Many cognitive models propose a unitary focus of attention, likened to a roving spotlight over the visual field (Posner et al., 1980). Variants of the spotlight metaphor include gradient (Downing & Pinker, 1985; LaBerge & Brown, 1989) and zoom lens (Eriksen & James, 1986) models, suggesting that attention may be a graded phenomenon, attenuated around a central focus. A large body of evidence supports such unitary models (McCormick, Klein, & Johnston, 1998; Posner et al., 1980), but several more recent experiments have provided evidence for non-contiguous allocation of SVA (Awh & Pashler, 2000; Hahn & Kramer, 1998; Muller, Malinowski, Gruber, & Hillyard, 2003).

Here, we study how split attention can be achieved by a dynamic implementation of a WTA map. Despite their WTA nature, CANNs are able to account for split attention when network dynamics facilitate long transition states between regimes (Trappenberg & Standage, 2005) and when dominated by sustained inputs (Standage, Trappenberg, & Klein, 2005). We simulate the experiments of Muller et al. (2003) with a 1-dimensional (1D) CANN model. We build on simulations presented in Standage et al. (2005) that use a narrow weight profile, facilitating steeply sloped regions of activity that occupy a small portion of the network. Because we do not know the size of the active region of PP and its relation to coordinates in the visual field, we run similar experiments with a wide weight profile, resulting in activity that spans the majority of the network. We demonstrate that the ability of the model to account for divided attention does not depend on fine tuning this network parameter.

We simulate two experiments by Awh and Pashler (2000) with a 2-dimensional (2D) CANN model, demonstrating how the model accounts for their finding of divided attention in one experiment and unitary attention in the other. Preliminary simulations in 1D are reported in Standage et al. (2005). Our simulations are consistent with their experimental findings, but our model offers an alternative conclusion.

Methods

In 1D and 2D simulations, we use a fully connected recurrent rate model with $N$ nodes, where $N = N_x N_y$. Of the components in the model by Deco et al. (2002), we model only PP. WTA is implemented by local cooperation and long-distance competition in the laterally connected network. The average state $u_i$ of a node with index $i$ is given by $\tau \frac{d u_i(t)}{dt} = -u_i(t) + \sum_j w_{ij} r_j(t)\, a + I_i^{ext}(t),$ where $\tau$ is a time constant, $I_i^{ext}$ is external input to the network, $a = 2\pi/N_x$ is a scale factor, and $r_i$ is a normalized square of $u_i$ given by $g (u_{i}) = u$
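The rate equation above can be integrated numerically with a simple Euler scheme. The following is a minimal 1D sketch under stated assumptions, not the authors' implementation: the weight profile (a Gaussian of ring distance minus a constant), the exact form of the normalized square (a rectified, divisively normalized square), the Gaussian input shape and all parameter values are hypothetical choices made for illustration, since none of them is specified in the text above.

```python
import math


def circ_dist(i, j, n):
    """Shortest distance between nodes i and j on a ring of n nodes, in radians."""
    d = abs(i - j)
    return min(d, n - d) * 2.0 * math.pi / n


def simulate(centres, n=60, steps=200, dt=0.1, tau=1.0,
             sigma=0.3, excite=4.0, inhibit=0.5, amp=1.0):
    """Euler-integrate tau*du/dt = -u + sum_j w_ij r_j a + I_ext on a 1D ring."""
    a = 2.0 * math.pi / n  # scale factor a = 2*pi/N_x, as in the text
    # ASSUMPTION: local cooperation (Gaussian) and long-distance competition
    # (constant inhibition); the weight profile is not given in the text.
    w = [[excite * math.exp(-circ_dist(i, j, n) ** 2 / (2 * sigma ** 2)) - inhibit
          for j in range(n)] for i in range(n)]
    # ASSUMPTION: sustained Gaussian input centred on each node in `centres`.
    ext = [sum(amp * math.exp(-circ_dist(i, c, n) ** 2 / (2 * sigma ** 2))
               for c in centres) for i in range(n)]
    u = [0.0] * n
    for _ in range(steps):
        # ASSUMPTION: rates are a rectified square of u, divisively normalized
        # by the summed squared activity.
        sq = [max(x, 0.0) ** 2 for x in u]
        norm = 1.0 + 0.5 * a * sum(sq)
        r = [s / norm for s in sq]
        rec = [a * sum(w[i][j] * r[j] for j in range(n)) for i in range(n)]
        u = [x + dt / tau * (-x + rec[i] + ext[i]) for i, x in enumerate(u)]
    return u


u = simulate(centres=[30])
peak = max(range(len(u)), key=lambda i: u[i])
print("peak node:", peak)
```

Passing two separated values in `centres` is one way to explore how sustained inputs shape the activity profile in this toy setting; whether one or two peaks result depends on the assumed parameters.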

Simulations

Muller et al. (2003) provide evidence for sustained division of visual attention by recording steady state visual evoked potentials (SSVEP) while subjects viewed a horizontal array of four stimulus elements following instructions to attend to two locations. On separate blocks of trials, subjects attended to adjacent and separated positions. The SSVEP is the electrophysiological response in visual cortex to a rapidly flickering stimulus, and has been shown to increase in amplitude when attention

Discussion

In our simulations of Müller's experiments, we address a parametric issue raised in Standage et al. (2005). By increasing the width of the model's weight profile, and comparing network output with similar experiments in Standage et al. (2005), we show that under transient input, the model outputs a flatter bubble, predicting that the magnitude of subjects' attention in Müller's adjacent trials was more evenly distributed across attended locations than predicted in Standage et al. (2005). This

Conclusions

The model of SVA by Deco et al. (2002) implements a saliency map in PP with a CANN. This instantiation of biased competition (Duncan & Humphreys, 2002) integrates BU and TD influences in a biologically realistic computational architecture. Our simulations test this promising model's ability to explain behavioural and physiological evidence on the spatial distribution of SVA.

Our results demonstrate that CANNs provide a model of spatial attention in PP capable of explaining divergent

Acknowledgements

This work was supported in part by the NSERC grant RGPIN 249885-03.

References (27)

G. Deco et al. (2002). The time course of selective visual attention: Theory and experiments. Vision Research.
S. Shipp (2004). The brain circuitry of attention. Trends in Cognitive Sciences.
E. Awh et al. (2000). Evidence for split attentional foci. Journal of Experimental Psychology: Human Perception and Performance.
E. Cherry (1953). Some experiments on the recognition of speech, with one and with two ears. The Journal of the Acoustical Society of America.
C.L. Colby et al. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience.
N. Cowan (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences.
S. Deneve et al. (1999). Divisive normalization, line attractor networks and ideal observers. Advances in Neural Information Processing Systems.
C. Downing et al. The spatial structure of visual attention.
J. Duncan et al. (2002). Visual search and stimulus similarity. Psychological Review.
C. Eriksen et al. (1986). Visual attention within and around the field of focal attention: A zoom lens model. Perception and Psychophysics.
S. Funahashi et al. (1989). Mnemonic coding of visual space in the monkey's dorsolateral prefrontal cortex. Journal of Neurophysiology.
S. Hahn et al. (1998). Further evidence for the division of attention between non-contiguous locations. Visual Cognition.
L. Itti et al. (2001). Computational modelling of visual attention. Nature Reviews: Neuroscience.


☆ An abbreviated version of some portions of this article appeared in Standage, Trappenberg, and Klein (2005), published under the IEEE copyright.


2005 Special Issue: Modelling divided visual attention with a winner-take-all network☆
