Visual–spatial attention aids the maintenance of object representations in visual working memory

Williams, Melonie; Pouget, Pierre; Boucher, Leanne; Woodman, Geoffrey F.

doi:10.3758/s13421-013-0296-7


Published: 31 January 2013

Volume 41, pages 698–715, (2013)
Memory & Cognition



Abstract

Theories have proposed that the maintenance of object representations in visual working memory is aided by a spatial rehearsal mechanism. In this study, we used two different approaches to test the hypothesis that overt and covert visual–spatial attention mechanisms contribute to the maintenance of object representations in visual working memory. First, we tracked observers’ eye movements while they remembered a variable number of objects during change-detection tasks. We observed that during the blank retention interval, participants spontaneously shifted gaze to the locations that the objects had occupied in the memory array. Next, we hypothesized that if attention mechanisms contribute to the maintenance of object representations, then drawing attention away from the object locations during the retention interval should impair object memory during these change-detection tasks. Supporting this prediction, we found that attending to the fixation point in anticipation of a brief probe stimulus during the retention interval reduced change-detection accuracy, even on the trials in which no probe occurred. These findings support models of working memory in which visual–spatial selection mechanisms contribute to the maintenance of object representations.


As we make eye movements to explore our world, visual working memory maintains a stable representation of several objects across these saccades (Carlson-Radvansky, 1999; Carlson-Radvansky & Irwin, 1995; Irwin, 1991, 1992, 1996), allowing us to behave adaptively in our surroundings (Hollingworth & Luck, 2009). Previous research has shown that humans have the ability to temporarily store an average of three to four object representations in visual working memory across brief retention intervals (Irwin & Andrews, 1996; Luck & Vogel, 1997; Vogel, Woodman, & Luck, 2001). Nevertheless, how we maintain these representations is still unclear. Models of visual working memory have proposed that spatial mechanisms play an important role in maintaining object representations. Specifically, Logie and Baddeley (Baddeley & Logie, 1999; Logie, 1995) proposed that object and spatial working memory are distinct components that work in a cooperative manner, with the spatial component enabling rehearsal of both spatial and object representations while information is maintained in a passive visual buffer. Guided by these previous theoretical proposals, the goal of the present study was to test the hypothesis that overt and covert mechanisms of visual–spatial attention are used to help maintain object representations in visual working memory.

The basic idea that visual working memory consists of separate mechanisms that specialize in handling spatial and object information is consistent with evidence from dual-task paradigms (Baddeley & Lieberman, 1980; Baddeley & Logie, 1999; Logie & Marchetti, 1991). Moreover, a large body of neurophysiological evidence has supported this idea. Specifically, a leading hypothesis is that spatial and object working memory functions are segregated in dorsal and ventral pathways for working memory, much as occurs for perception (Goldman-Rakic, 1996; Jonides & Smith, 1997; Ungerleider, Courtney, & Haxby, 1998). This hypothesis has received support from several sources. First, dorsal- and ventral-stream sensory processing areas project differentially to separate memory-related areas of the prefrontal cortex (Ungerleider & Mishkin, 1982). Second, single-unit recording studies in monkeys have found that neurons in the principal sulcus of dorsolateral prefrontal cortex generally maintain information about spatial locations (e.g., Funahashi, Bruce, & Goldman-Rakic, 1989), whereas neurons in the inferior convexity of dorsolateral prefrontal cortex tend to maintain information about object identity (e.g., Wilson, O’Scalaidhe, & Goldman-Rakic, 1993). Third, neuroimaging studies in humans have found that object and spatial working memory tasks activate different networks of brain areas (Courtney, Ungerleider, Keil, & Haxby, 1996; Jonides et al., 1993). Thus, this body of evidence is consistent with the proposals that distinct mechanisms maintain object representations and handle spatial information in working memory. However, do the neurophysiological data rule out the idea that these mechanisms might interact to maintain object representations?

Findings from several studies appear to provide neurophysiological support for the hypothesis that spatial and object mechanisms in visual working memory interact to store and maintain object representations across time. Rainer, Asaad, and Miller (1998) found that prefrontal neurons that code for spatial location were interspersed with neurons that coded for object identity, indicating a lack of anatomical segregation of spatial and object information. In addition, they found that a majority of the neurons that they sampled in prefrontal cortex provided information about both an object’s identity and its spatial location during memory retention intervals (see also Rao, Rainer, & Miller, 1997). Similarly, some neuroimaging studies have found frontal areas that appear to be activated by both spatial and object information. For example, Postle and D’Esposito (1999) found that spatial and object versions of a visual working memory task activated the same areas in prefrontal cortex, although different posterior areas were active during the two different tasks. More recently, Vogel and colleagues (Vogel & Machizawa, 2004; Vogel, McCollough, & Machizawa, 2005) have shown that when people are remembering objects during the short retention intervals of a change-detection task, a contralateral event-related potential is observed while these objects are maintained. The lateralization of this signal indicates an inherent spatial specificity of the process of maintaining these object representations. Thus, although the behavioral and physiological data tend to support the notion of separate object and spatial working memory systems, a number of results appear to support the proposal that these distinct mechanisms interact to maintain representations of objects. The next question that we consider is whether mechanisms of visual–spatial attention are the source of these interactions.

Abundant evidence has revealed that covert and overt mechanisms of visual attention are used to help maintain representations of spatial locations. Eye and head movements are considered measures of overt attentional selection (e.g., shifting the high-resolution fovea to an important part of the visual scene), whereas covert attentional selection is the enhanced processing of certain items to the detriment of others in the absence of movements of the eyes or body (Posner, 1980). To examine the role of attention mechanisms in maintaining representations of spatial locations in visual working memory, Awh and colleagues (Awh & Jonides, 2001; Awh, Jonides, & Reuter-Lorenz, 1998) presented probe stimuli during the retention interval of a spatial working memory task. This allowed them to test the hypothesis that observers direct their spatial attention to the remembered locations to aid memory maintenance. They found that probe detection reaction times (RTs) were faster when the probes appeared at the remembered location rather than elsewhere in the display. Moreover, when the probe stimulus was presented at the original memorized location, spatial memory performance was significantly better than when the memorized and probe locations did not overlap. These experiments show that visual attention is deployed to help maintain spatial information in working memory.

Another line of work has shown that irrelevant eye movements (i.e., overt shifts of attention) interfere with the ability to remember spatial locations (Lawrence, Myerson, & Abrams, 2004; Pearson & Sahraie, 2003; Smyth, 1996). Baddeley and colleagues (Baddeley, 1986) examined whether irrelevant eye movements impaired the maintenance of spatial information in working memory. They found that irrelevant eye movements interfered with memory for a specific spatial location relative to when observers were not induced to make irrelevant eye movements (see also Postle, Idzikowski, Della Sala, Logie, & Baddeley, 2006). One explanation for these findings is that eye movements create spatial representations that interfere with the memory representations of locations that the participants are trying to maintain. Alternatively, it is possible that moving gaze away from a remembered location prevents the maintenance of the location information because attention cannot stay locked there. Regardless of which account best explains these effects on working memory for spatial location, it is clear that overt and covert mechanisms of visual–spatial attention play an important role in the storage of spatial locations in working memory, either by aiding their maintenance or by preventing interference from new information.

In sum, much is known about the relationship between mechanisms of visual–spatial attention and the visual working memory storage of spatial locations, but the theoretical proposals that visual–spatial selection contributes to the storage of objects in visual working memory have not been as thoroughly tested. The goal of the present study was to test some specific predictions that grow out of the theoretical proposals that we have described. For example, do goal-driven eye movements actually aid in the maintenance of object representations? If this is the case, we predict that observers should spontaneously make eye movements to the locations of previously presented objects while these object representations are held in visual working memory. In addition, preventing eye movements during memory retention intervals should decrease change-detection accuracy. A similar logic has been applied to eye-movement studies of long-term memory (e.g., Spivey & Geng, 2001).

In Experiments 1 and 2, we examined overt measures of visual–spatial attention by tracking saccadic eye movements during the memory retention intervals of an object change-detection task. Experiment 3 tested the hypothesis that drawing spatial attention away from the locations of objects being maintained would disrupt object maintenance. Thus, for this study we used a multipronged approach to test the proposal that visual–spatial attention mechanisms are involved in the maintenance of object representations in visual working memory.

Experiment 1

To test the hypothesis that visual–spatial attention mechanisms participate in the maintenance of object representations in visual working memory, we had participants perform a change-detection task while we measured where their gaze was directed. As is shown in Fig. 1a, the participants in Experiment 1A were required to remember simple colored squares while we compared their change-detection accuracies across two conditions. In the fixation condition, participants were required to fixate a central point throughout each trial and to remember one, three, or six colored stimuli shown in a memory array presented for 500 ms. After a 5,000-ms retention interval a test array was presented, and participants reported with a buttonpress whether a change had occurred in one of the items. In the eye-movement condition, participants performed exactly the same task, except that they were told that they did not need to fixate the cross in the middle of the screen and should move their eyes as they naturally would. Experiment 1B was identical to Experiment 1A, except that a different group of participants performed the change-detection task only under the eye-movement condition. This allowed us to sample more trials from each individual than was possible in Experiment 1A. With this larger sample, we focused on the behavior of the observers in the critical eye-movement condition and asked specific questions about the consequences of the fixations during the retention intervals on change-detection accuracy.

The design of Experiments 1A and 1B allowed us to address three questions about the relationship between the deployment of overt visual attention (i.e., patterns of fixation) and visual working memory maintenance. These questions progressed from general empirical observations of eye-movement behavior to more detailed examinations of the impact of that behavior.

First, by analyzing the pattern of fixations during the eye-movement conditions of Experiments 1A and 1B, we were able to test the hypothesis that participants overtly selected the spatial locations that were occupied by the objects in the memory array during the maintenance period (i.e., when the objects were no longer visible). If movements of the eyes are sensitive to the deployment of visual attention mechanisms to perform rehearsal of the object representations, then we should find that during the retention intervals, observers would make eye movements to fixate the locations that the memory items had previously occupied. If we were to observe such behavior, we could then ask more specific questions about its impact.

Second, we tested the hypothesis that fixating the locations of the objects during the change-detection task would improve the accuracy of the memory representations. Specifically, if overt mechanisms of visual attention participate in the active rehearsal of objects, we should observe that participants in Experiment 1A would be more accurate at detecting changes between the memory and test arrays when they were free to move their eyes (i.e., the eye-movement condition) as compared to when the task conditions prevented this (i.e., the fixation condition). Even if we found that people moved their eyes during the retention interval, there might still be no link between these measures of overt visual–spatial attention and the maintenance of object representations in visual working memory. If so, we expected that performance would not be better when observers were allowed to freely saccade during the retention interval, as compared to when fixating centrally. Indeed, a plausible competing prediction is that change-detection performance would be better in the fixation condition than when eye movements were allowed, because the visual transients caused by saccades might interfere with visual working memory maintenance. For example, movements of the eyes shift the retinotopic reference frame away from the allocentric reference frame, which might interfere with the maintenance of the object representations. Given this hypothesis, participants might spontaneously avoid making eye movements during the retention interval, but when they did make eye movements, performance would be worse than when the instructions required fixation. This hypothesis is a plausible contender because when people are remembering spatial locations, eye movements interfere with the maintenance of that visual–spatial feature (Lawrence et al., 2004; Pearson & Sahraie, 2003; Smyth, 1996).

Third, in Experiment 1B we tested the more specific prediction that if overt selection aids maintenance, we should find that change detection was superior when a given object was fixated during the retention interval. That is, would participants be more accurate at detecting a change of an object between the memory and test arrays if they had fixated its location during the blank retention interval? If deployments of overt visual attention to previously occupied object locations improved the retention of the information in visual working memory, we should observe an item-specific benefit at the individual-object level of analysis.

Method

Participants

A group of 26 volunteers (18–32 years of age) from Vanderbilt University and the surrounding community participated in both the eye-movement and fixation conditions of Experiment 1A. A different group of 16 individuals participated in Experiment 1B. All of the participants reported having normal or corrected-to-normal visual acuity and normal color vision. They provided informed consent and were compensated either monetarily or with course credit.

Stimuli

The stimuli consisted of solid-colored squares (each 1.2° × 1.2°) presented on a gray background (48.5 cd/m²) and centered approximately 7.5° from fixation (a black plus sign, 0.3° × 0.3°, < 0.01 cd/m²) with a minimum interitem spacing of 7.5°. On trials of set sizes 1 and 3, stimuli were randomly placed (sampled from a square distribution of the six possible stimulus locations with 7.5° interitem spacing) along the edge of a virtual annulus surrounding the center fixation. On trials with a set size of 6, the stimuli were distributed across the six possible locations. The color of each square was randomly selected with replacement from a set of seven colors: white (95.0 cd/m²), black (< 0.01 cd/m²), red (chromaticity coordinates of the CIE 1931 color space: x = .633, y = .334), blue (x = .144, y = .065), green (x = .278, y = .614), yellow (x = .420, y = .503), and magenta (x = .291, y = .146). The three different set sizes were randomly interleaved, and the one, three, or six squares were presented in both the memory and test arrays. The articulatory-suppression stimuli were two white numbers (95.0 cd/m², randomly selected from the digits 1 to 9 without replacement) centered 3.4° above the black fixation point (0.3° × 0.3°, < 0.1 cd/m²), with one number being centered 1.7° to the right and one the same distance to the left of the horizontal meridian.
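The layout described above can be checked with a short sketch. It assumes that the six possible locations are equally spaced on a circle of radius 7.5° around fixation, which is consistent with the stated 7.5° interitem spacing (the side of a regular hexagon equals its radius) but is not stated explicitly in the text. The starting angle and the function names are hypothetical and chosen only for illustration.

```python
import math
import random

ECCENTRICITY_DEG = 7.5   # distance of each square's center from fixation (from the text)
N_LOCATIONS = 6          # six possible stimulus locations (from the text)
COLORS = ["white", "black", "red", "blue", "green", "yellow", "magenta"]


def possible_locations(start_angle_deg=0.0):
    """Return (x, y) centers, in degrees of visual angle, of six equally
    spaced positions around fixation. Equal spacing and the starting
    angle are assumptions, not details given in the text."""
    locations = []
    for i in range(N_LOCATIONS):
        theta = math.radians(start_angle_deg + i * 360.0 / N_LOCATIONS)
        locations.append((ECCENTRICITY_DEG * math.cos(theta),
                          ECCENTRICITY_DEG * math.sin(theta)))
    return locations


def make_memory_array(set_size, rng=random):
    """Pick distinct locations and draw colors with replacement."""
    locations = rng.sample(possible_locations(), set_size)
    colors = [rng.choice(COLORS) for _ in range(set_size)]
    return list(zip(locations, colors))


def min_spacing(locations):
    """Smallest center-to-center distance between any two locations."""
    return min(math.dist(a, b)
               for i, a in enumerate(locations)
               for b in locations[i + 1:])
```

Under these assumptions, `min_spacing(possible_locations())` comes out at 7.5° (up to floating-point error), matching the minimum interitem spacing reported in the text.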

Apparatus

Eye movements were measured using an EyeLink II infrared eyetracker (SR Research Ltd., Ontario, Canada) with eye position sampled at a rate of 250 Hz. Saccades were detected automatically using a velocity-criterion algorithm (35°/s) that SR Research had created for use with the EyeLink II tracker. Participants made all responses using two buttons on a hand-held gamepad.
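To illustrate what a velocity criterion does, the following is a minimal sketch of threshold-based saccade detection. It is not the SR Research algorithm, whose details are not given in the text; it is a simple stand-in that flags runs of samples whose sample-to-sample gaze speed exceeds 35°/s at a 250-Hz sampling rate. The function name and the input format (gaze position in degrees of visual angle) are assumptions.

```python
import math

SAMPLE_RATE_HZ = 250.0      # sampling rate reported in the text
VELOCITY_THRESHOLD = 35.0   # deg/s, velocity criterion reported in the text


def detect_saccades(x_deg, y_deg, rate_hz=SAMPLE_RATE_HZ,
                    threshold=VELOCITY_THRESHOLD):
    """Return (start_sample, end_sample) index pairs for each run in which
    gaze speed between consecutive samples exceeds the threshold.
    Simplified stand-in, not the eyetracker vendor's parser."""
    dt = 1.0 / rate_hz
    # speeds[i] is the speed of the movement from sample i to sample i + 1
    speeds = [math.hypot(x_deg[i + 1] - x_deg[i], y_deg[i + 1] - y_deg[i]) / dt
              for i in range(len(x_deg) - 1)]
    saccades = []
    start = None
    for i, speed in enumerate(speeds):
        if speed > threshold and start is None:
            start = i                      # last sample before the fast movement
        elif speed <= threshold and start is not None:
            saccades.append((start, i))    # sample i is where the eye has landed
            start = None
    if start is not None:                  # trace ended while still moving
        saccades.append((start, len(speeds)))
    return saccades
```

The landing position of each detected saccade is then the gaze sample at `end_sample`, which is the quantity a later analysis would compare against the object locations.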

Procedure

In Experiment 1A, each participant was fitted with the head-mounted eyetracker cameras and given the instructions for the condition that each would perform in that session. During the fixation condition, we instructed participants to keep their gaze on the fixation point and to move their eyes as little as possible while performing the task. In the eye-movement condition, participants were told to move their eyes naturally while performing the task. All observers performed the fixation and eye-movement conditions during different sessions, with the order of the conditions counterbalanced across participants. Each condition consisted of 60 experimental trials and one 12-trial practice block. The researcher sat adjacent to the participant, although out of view, to ensure that participants were engaging in the articulatory-suppression task on each trial and that the eyetracker was continuously calibrated. Experiment 1B was similar to Experiment 1A, except that observers only participated in the eye-movement condition and we increased the number of trials to 120. The concurrent articulatory-suppression task was required to prevent participants from verbally recoding the object identities and storing them in verbal working memory.

Once the eyes were calibrated and drift correction was performed, each trial began with the articulatory-suppression task (repeating approximately three to four numbers per second) as soon as the numbers appeared. The digits were presented for 500 ms with a 1,500-ms stimulus onset asynchrony (SOA) between the articulatory-suppression stimuli and the memory array. The memory array was then presented for 500 ms, followed by a 5,000-ms blank retention interval, and then a 2,000-ms test array presentation. The set size of the memory array varied randomly across trials in the session. The test array remained visible for 2,000 ms or until the observer made the buttonpress response on the gamepad indicating whether the test array was the same as or different from the sample array. When the color of an item changed, it always changed to a color not present in the initial memory array (i.e., colors were sampled without replacement). The probability of a color change of one of the objects was 50 %, and participants were instructed to remember only the color of the objects because their spatial locations would never change. These instructions stressed the accuracy of the manual change-detection response, not its speed.
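The event timing described above can be laid out as a small worked example. The durations are taken from the text; the dictionary keys and function names are hypothetical, and time zero is assumed to be the onset of the articulatory-suppression digits.

```python
DIGITS_MS = 500                 # duration of the articulatory-suppression digits
DIGITS_TO_MEMORY_SOA_MS = 1500  # onset-to-onset interval, digits to memory array
MEMORY_MS = 500                 # duration of the memory array
RETENTION_MS = 5000             # blank retention interval
TEST_MAX_MS = 2000              # test array ends at response or after 2,000 ms


def trial_timeline():
    """Event times in ms, relative to digit onset (an assumed time zero)."""
    memory_on = DIGITS_TO_MEMORY_SOA_MS
    retention_on = memory_on + MEMORY_MS
    test_on = retention_on + RETENTION_MS
    return {
        "digits_on": 0,
        "digits_off": DIGITS_MS,
        "memory_on": memory_on,
        "retention_on": retention_on,
        "test_on": test_on,
        "test_off_latest": test_on + TEST_MAX_MS,
    }


def in_retention_interval(t_ms):
    """True if a time stamp falls in the blank retention interval."""
    timeline = trial_timeline()
    return timeline["retention_on"] <= t_ms < timeline["test_on"]
```

With these values the retention interval runs from 2,000 to 7,000 ms after digit onset, which is the window that matters for the saccade analyses of the maintenance period.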

Data analysis

Mean change-detection accuracy was analyzed using an analysis of variance (ANOVA) with the within-subjects factors Condition (fixation or eye movement) and Set Size (one, three, or six items). Eye-movement data were analyzed using custom MATLAB scripts. An eye movement counted as being directed to the object location if it fell within a 2.0° imaginary window centered on the location of an object in the memory-sample array. This allowed us to measure the number of saccades made to the objects during each trial and to determine which object locations were fixated. For analyses focusing on the maintenance period, we only counted saccades made during the 5,000-ms retention interval, when no physical stimuli were being presented. Data from a participant were discarded from the analysis if the number of saccades made during the fixation condition was greater than two standard deviations above the mean number of saccades made throughout the experiment. This criterion led to the replacement of one participant from Experiment 1A. For the Experiment 1B analyses, we focused primarily on the maintenance period. One observer who did not saccade during the retention interval was removed from the analysis, as well as a second observer who withdrew from the experiment before completing all of the trials because of boredom and fatigue. However, removal of these outliers was not necessary to obtain the pattern of results that we observed. When we entered all available data into the statistical analyses, the same results were obtained.
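The window criterion can be sketched as follows. The original analysis used custom MATLAB scripts that are not reproduced in the text, so this Python version is only an illustration. It assumes that the 2.0° window is a square with a full width of 2.0° centered on each object; the text does not say whether the window was square or circular. All function names are hypothetical.

```python
WINDOW_DEG = 2.0  # window size from the text; square shape and full width are assumptions


def fixated_object(landing_xy, object_centers, window_deg=WINDOW_DEG):
    """Return the index of the object whose window contains the saccade
    landing point, or None if the landing point is outside every window."""
    half = window_deg / 2.0
    x, y = landing_xy
    for index, (ox, oy) in enumerate(object_centers):
        if abs(x - ox) <= half and abs(y - oy) <= half:
            return index
    return None


def proportion_to_objects(landing_points, object_centers):
    """Proportion of saccades that landed inside any object window."""
    if not landing_points:
        return 0.0
    hits = sum(fixated_object(point, object_centers) is not None
               for point in landing_points)
    return hits / len(landing_points)
```

Because the object locations are at least 7.5° apart, windows of this size cannot overlap, so returning the first matching object is unambiguous.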

Results

The memory accuracies from the fixation and eye-movement conditions of Experiment 1A are summarized in Fig. 1b. As expected, change-detection accuracy decreased as the memory set size increased. Of primary importance, accuracy was consistently higher in the eye-movement condition (94 % correct, collapsed across set sizes) than in the fixation condition (92.3 % correct). These findings resulted in significant main effects of condition, F(1, 25) = 9.85, p < .05, and set size, F(2, 50) = 42.95, p < .01. However, the interaction of these factors was not significant, F < 1.0, p = .46. These results are what would be predicted if being free to direct the fovea, as an overt spatial selection mechanism, to the object locations enhances the accurate maintenance of the objects (see Footnote 1).

Our first observation while examining the eye-movement data was that when participants were free to make eye movements during the retention interval of the change-detection task in Experiments 1A and 1B, they spontaneously fixated the spatial locations that had been occupied by the objects in the memory array. This is illustrated with an example trial in Fig. 2a. To quantify this behavior, we measured the numbers of saccades made during the retention interval to object locations and to other locations on the screen. We found that 58.8 % of the saccades were made to the object locations during the memory retention intervals using our conservative, 2° measurement window centered on the objects (which spanned 1.2° × 1.2°). As is shown in Table 1, approximately 1–2 objects were fixated during the 5,000-ms retention interval of each trial. This eye-movement behavior was characterized by saccades to object locations interspersed with saccades back to the fixation point and to locations in the direction of the object locations, but outside our measurement windows. Note that after the saccades to other onscreen locations, the eyes returned to the same couple of object locations during the retention interval. This observation demonstrates that overt eye movements do visit the previous locations of objects during a working memory task, similar to the natural eye-movement behavior reported by Spivey and Geng (2001) in a long-term memory task.

Table 1 Eyetracking metrics measured during the 5,000-ms retention interval of Experiments 1 and 2


Figure 2b shows the latency histogram of saccades to the object locations and to nonobject locations during the memory retention intervals in Experiment 1B. We wanted to be sure that the saccades that we interpreted as being due to maintenance were not simply due to participants fixating the locations of the objects immediately before the test array, as this might indicate that such eye movements were in preparation for the comparison of items in the test array to those in memory. Alternatively, the saccades could have occurred almost exclusively in the short interval after the offset of the memory array, as would be expected if the saccades that we observed during the retention interval were residual effects of encoding into working memory. Although there are slight increases in the number of saccades made to object locations at the beginning and end of the maintenance period, we found that fixations of the object locations occurred throughout the 5-s retention interval, consistent with the idea that these acts of overt attentional selection were performed to help maintain the object representations.

Figure 3 shows change-detection accuracy in Experiment 1B as a function of whether the item that changed was fixated during the retention interval. Saccades on the half of the trials with changes were classified as either fixating or not fixating the item that would change. The trials were then sorted accordingly. Change-detection accuracies were similar during both trial types: 79.8 % on the change-fixated trials and 80.7 % on the change-not-fixated trials. We found neither a significant main effect of trial type, F(1, 15) = 0.14, p > .7, nor an interaction of trial type and set size, F(2, 30) = 1.63, p > .2, but a significant effect of set size did emerge, F(2, 30) = 33.6, p < .01. Planned comparisons revealed that change-detection accuracy was significantly higher at set size 3 when the item that would ultimately change was fixated during the retention interval, F(1, 15) = 13.82, p < .01. These findings do not clearly support the prediction that fixations of specific objects result in an individual-item benefit. If this were the case, we should have observed a significant main effect of change-detection accuracy based on whether or not the changed item was fixated. In addition, it is unclear why such an item-specific benefit would be evident at set size 3 but not at set size 6, when visual working memory was more heavily taxed. Unlike the general benefit that we observed on change-detection accuracy when participants made eye movements, we did not see clear evidence for an item-specific benefit, an issue to which we returned in Experiment 2.
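The sorting of change trials described above can be sketched in a few lines. The trial record format used here (the keys `changed_index`, `fixated_indices`, and `correct`) is hypothetical and chosen only for illustration; it is not the format of the original data or scripts.

```python
def accuracy_by_fixation(change_trials):
    """Split change trials by whether the item that changed was fixated
    during the retention interval, and return accuracy for each type.
    Each trial is a dict with hypothetical keys:
      'changed_index'   - index of the item that changed at test
      'fixated_indices' - set of item indices fixated during retention
      'correct'         - True if the change was detected"""
    tallies = {"change-fixated": [0, 0], "change-not-fixated": [0, 0]}
    for trial in change_trials:
        if trial["changed_index"] in trial["fixated_indices"]:
            key = "change-fixated"
        else:
            key = "change-not-fixated"
        tallies[key][0] += int(trial["correct"])   # number correct
        tallies[key][1] += 1                       # number of trials
    return {key: (n_correct / n_trials if n_trials else None)
            for key, (n_correct, n_trials) in tallies.items()}
```

In the analysis reported in the text, the resulting per-participant accuracies would additionally be broken down by set size before being entered into the ANOVA; that step is omitted here for brevity.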

Discussion

The findings of Experiment 1 supported the hypothesis that overt visual–spatial attention is used to aid the maintenance of object representations in visual working memory. Support for this hypothesis came in two forms. First, the participants were better at detecting changes in the colors of memoranda when they were allowed to make eye movements during the retention interval. Second, when participants were instructed that they were free to move their eyes naturally, they fixated the spatial locations previously occupied by objects in the memory array during the blank retention intervals. Our individual-item analysis suggested that fixating a particular item during the memory retention interval could result in better memory for that specific colored square in Experiment 1, at least at set size 3. However, because this potential effect was not systematic or strong, we refrained from drawing conclusions but returned to this possibility in Experiment 2.

In Experiment 1, we required participants to remember simple colored squares across the retention intervals. We wondered whether the fixation behavior found in Experiment 1 would pale in comparison to when participants had to maintain more complex stimuli based on a conjunction of features. Previous work had suggested that attention (covert or overt) plays a special role in the maintenance of multifeature objects in visual working memory (Wheeler & Treisman, 2002). Thus, we wanted to determine the generality of the findings in Experiment 1 and to test the hypothesis that the reliance upon overt visual–spatial attention would increase when participants had to remember objects composed of a conjunction of features.

Experiment 2

In Experiment 2, we tested the hypothesis that visual–spatial attention is primarily used during visual working memory tasks to maintain conjunctions of object features (Wheeler & Treisman, 2002). If visual–spatial attention serves a role in the maintenance of feature conjunctions, above and beyond the basic role in maintaining simple feature representations that we demonstrated in Experiment 1, then we should see that eye movements to the items would be even more critical for correctly remembering the multifeature objects in Experiment 2. The design of Experiment 2 was essentially identical to that of Experiment 1, except that participants were required to remember objects that were composed of a conjunction of features. Figure 4a shows that each object was a colored Landolt square with a gap on one side. Participants had to remember both of these features to accurately perform the change-detection task, because either the color or the shape of the object could change between the memory and test arrays. We again tracked participants’ eyes during both fixation and eye-movement conditions. This allowed us to further test the hypothesis that the overt deployment of visual–spatial attention during the retention interval aids memory performance, using the same metrics that we had used in Experiment 1. In addition, a comparison of the utility of eye movements between Experiments 1 and 2 would allow us to test the hypothesis that overt selection is particularly important for maintaining conjunctions of object features. Although some have proposed that attention is used to maintain feature conjunctions in visual working memory (Wheeler & Treisman, 2002), other recent work has challenged this proposal (Johnson, Hollingworth, & Luck, 2008; Zhang, Johnson, Woodman, & Luck, 2012). Thus, the most recent empirical work suggested that the fixation behavior during the retention interval in Experiment 2 should be essentially identical to that found in Experiment 1, because maintaining feature conjunctions is not particularly reliant upon attention.