Spatial attention is crucial for successful cognitive functioning, allowing us to select important objects in visual scenes for deeper cognitive processing, memory encoding, and action. It works so well most of the time that we take it for granted. And then occasionally it fails us, sometimes in dramatic fashion. In the laboratory, attentional failures contribute to robust phenomena like change blindness and inattentional blindness (Simons & Rensink, 2005). In the real world, failures range from humorous (such as failing to notice a close friend standing next to you in the grocery store line) to disastrous (such as failing to notice that a stoplight has changed from green to red).

Successful allocation of spatial attention requires a delicate balance between focus on the current goals and rapid response to unanticipated opportunities or dangers. This balance is achieved by the interaction of at least two opposing forces: filtering and capture. Spatial filtering allows observers to enhance visual processing of relevant objects at relevant regions of visual space while ignoring objects at other locations (e.g., Broadbent, 1958; Lachter, Forster, & Ruthruff, 2004). Attentional capture allows stimuli to automatically grab spatial attention. Researchers debate whether attentional capture is based on stimulus salience (e.g., Theeuwes, 1992; Gaspelin, Ruthruff, & Lien, 2016) and/or task-relevance (e.g., Folk, Remington, & Johnston, 1992).

The present study addresses how these opposing forces – spatial filtering and attentional capture – work in concert to support efficient human performance. Does focusing on a particular location, or set of locations, prevent capture by all stimuli outside this focus, no matter how potent? On the one hand, a main function of attentional capture is presumably to allow rapid orienting toward objects that, although important, were not anticipated in advance, such as when an errant baseball is about to strike us in the head (see, e.g., Lin, Franconeri, & Enns, 2008, for a study of capture by looming objects). Blindness to such objects could be hazardous. So, it might seem natural, or even critical, that capture should occur for objects outside the current focus of spatial attention. On the other hand, the purpose of a spatial filter is to allow sharp focus on locations relevant to the current task goals, while ignoring irrelevant locations. This filtering minimizes cross-talk while also preventing overload of limited capacity processing resources. So, it might be advantageous to be able to “turn off” capture at irrelevant locations, at least temporarily. In short, we have two powerful and opposing mechanisms for the control of spatial attention – spatial filtering and automatic capture – and it is unclear from first principles which should dominate the other.

A few early studies supported the view that spatial filtering dominates capture. For example, Yantis and Jonides (1990) had participants search among four letters for a target letter (E vs. H) and report its identity. Whereas most of the search display letters were offsets, revealed by removing segments of a premask, one randomly selected letter on each trial was an onset, appearing abruptly against a blank background. If this abrupt onset letter captured attention, then participants should respond especially quickly when it happens to be the target and especially slowly when it is a distractor. This is in fact what happened in many conditions. However, the effect disappeared when a central precue pointed to the upcoming target location on 100 % of trials, allowing participants to focus tightly on only a single spatial location. Thus, the data suggest that capture cannot occur at to-be-ignored locations.

Theeuwes (1991) reported similar results. When a 100 % valid cue (a centrally-presented arrow) appeared between 300 ms and 600 ms prior to the target display, the presence of irrelevant onsets and offsets in other locations had no impact on target response time (RT). Accordingly, Theeuwes and colleagues have argued for an attentional window account in which salient stimuli cannot capture attention if they appear outside the current focus of spatial attention (Belopolsky & Theeuwes, 2010; Belopolsky, Zwaan, Theeuwes, & Kramer, 2007; Theeuwes, 1994a; Theeuwes, 2004; but see Leber & Egeth, 2006; Gaspelin, Ruthruff, Lien, & Jung, 2012). Importantly, the size of the attentional window is under voluntary control, allowing participants to effectively avoid capture by shrinking the attentional window. Simply put, this attentional window account assumes that participants can avoid capture via spatial filtering.

Several other experimental paradigms, however, suggest the opposite conclusion: capture dominates spatial filtering. One prominent example is the spatial blink paradigm (Folk, Leber, & Egeth, 2002; Fukuda & Vogel, 2009, Experiment 4). Folk et al. (2002), for example, had participants search a rapid serial visual presentation (RSVP) stream for a target letter defined by color (e.g., red) and report its identity. A peripheral character preceding the target reduced target detection accuracy – an effect dubbed the spatial blink – when it possessed the color participants were searching for. The authors reasoned that the peripheral distractor captured spatial attention, despite appearing at an ignorable location that could never contain a target. However, just because the location could be ignored does not mean that it actually was ignored. When ignorable locations contain relatively few distractors, participants might not expend the effort to ignore them and instead spread spatial attention across a broad region of space. Relatedly, Leonard, Balestreri, and Luck (2015) found that the spatial blink effect declines as the target-distractor distances increase (but see Folk et al., 2002, Exps. 3 and 4). Another possible explanation is that participants attend the central location but do not fully engage there because the RSVP stream contains not only the target but also many distractors (cf. Lachter, Remington, & Ruthruff, 2009; Remington & Folk, 2001). Consistent with this account, Folk, Ester, and Troemel (2009) found that a central cue possessing the target color – presumed to trigger selection/engagement – eliminated capture effects from a subsequent peripheral cue.

Evidence of capture at to-be-ignored locations has also been found using a variant of the spatial cuing paradigm. Folk and Remington (1996) asked participants to search for an onset target (the abrupt appearance of an X or = sign) inside one of four boxes arranged in a cross formation. On each trial of Experiment 1, participants were told exactly where an onset cue – four dots surrounding a placeholder box – would appear and were assured that the target would not also appear there. Despite knowing that location could be safely ignored, the abrupt onset still produced robust capture effects. In Experiment 2, they reported similar results even when the location of the onset dots was fixed throughout a block, rather than varying randomly from trial-to-trial, making them even easier to ignore. In Experiment 3, onset cues were presented only at locations that never contained a target; that is, whereas the four target locations formed an imaginary cross, the four possible distractor locations formed an imaginary square. Capture effects by onsets were again observed. Altogether, this study suggests that people cannot easily avoid attentional capture via spatial filtering.

A similar pattern of results has been obtained from various investigations of the effect of cuing distractor locations (e.g., Munneke, Van der Stigchel, & Theeuwes, 2008; Chao, 2010). For example, Munneke et al. showed that an arrow precue indicating the likely location of a distractor reduced both the cost of presenting a distractor (Experiment 1) and target-distractor compatibility effects (Experiment 2). Although the main finding of this study was evidence for distractor suppression, the key for present purposes is that distractor interference was not eliminated by cuing (see also Chao, 2010). Taken at face value, this finding suggests that spatial filtering cannot eliminate attentional capture. On the other hand, one can question whether participants had sufficient incentive to strongly filter out distractor locations. In this line of research, the cues are often not 100 % reliable and/or the distractors do not always appear and/or do not resemble targets.

Reconciling the empirical discrepancy

In summary, previous studies paint a mixed picture regarding whether spatial filtering dominates attentional capture or vice versa. Whereas studies cuing a single target location have reported the absence of capture (e.g., Theeuwes, 1991; Yantis & Jonides, 1990), other studies have reported some residual capture (e.g., Folk & Remington, 1996; Folk et al., 2002; Johnston, Ruthruff, & Lien, 2015).

Although it might seem logical to propose that cuing a single target location (as in Yantis & Jonides, 1990) is the key ingredient for eliminating capture at to-be-ignored locations, we propose an alternative hypothesis. Specifically, failures to eliminate capture might simply reflect weak incentives to set up a complete spatial filter. In Folk and Remington (1996), for example, excluding just one of four possible target locations (in Experiments 1 and 2; see also Munneke et al., 2008) might not have benefitted search enough to warrant the costs of setting up the spatial filter (a possibility noted by the authors themselves). Even exclusion of four of eight positions (Experiment 3) might have provided insufficient benefit to warrant the more complicated attentional set. Furthermore, it might be relatively difficult to set up a cross-shaped filter that excludes the locations between the end points. An attentional set that simply included all possible locations might very well have sufficed. Thus, capture might have occurred at ignorable locations only because they were not actually ignored.

The present study

In the present study, we tested the hypothesis that a strong spatial filter can eliminate capture, even when there is more than just a single target location. Figure 1 shows an example stimulus display. We asked participants to look for a target letter defined by a conjunction of color and location and then report the target letter’s identity (E or H). For counterbalancing purposes, half of the participants looked for a red letter and the other half looked for a green letter. These sub-groups were further split in half, assigned to search the two vertical “top-bottom” positions or to search the two horizontal “left-right” positions.

Fig. 1

The modified spatial cuing paradigm used in Experiment 1. Each participant was assigned to a conjunction of one target color (green in this example) and one set of locations (e.g., top/bottom in this example). Participants searched for a target letter defined by this conjunction of color and location, then reported its identity (E or H). In this example, the target would be the green H at the bottom location. To complete the task with high accuracy, participants had to ignore the other two locations (e.g., left/right in this example), which always contained a distractor letter in the target color. The search display was preceded by a nonpredictive onset cue (four white dots)

Importantly, the to-be-ignored positions always contained exactly one green letter (E or H) and one red letter (E or H). If participants merely searched for the assigned target color without excluding irrelevant locations, their search would turn up two possible targets and they would not know which was the real target. Likewise, participants could not merely search by location, because the two relevant positions contained one red letter and one green letter. Only a conjunction search (color and location) would yield high accuracy. Thus, unlike Folk and Remington (1996), this refined experimental design forces participants to establish a strong spatial filter. Failure to filter would result in unacceptably high error rates.

It is also worth mentioning that the current design discourages overt shifts of visual attention (i.e., eye movements) in response to instructions regarding target locations. Because the target letter can appear at one of two locations on opposite sides of the fixation point, participants should have no incentive to make an overt eye movement to a specific location in advance of the target display. Thus, any effect of spatial filtering in this paradigm cannot be attributed merely to foveation.

Experiment 1 investigated capture by abrupt onsets. We expected capture effects for abrupt onsets appearing at the attended locations based on the findings of Gaspelin et al. (2016) that capture effects from abrupt onsets can be latent in very easy visual searches but emerge in more difficult searches. The main question was whether capture would also occur for abrupt onsets appearing at to-be-ignored locations. Experiment 2 replicated the findings of Experiment 1 using a different neutral baseline against which to calculate capture effects. Experiment 3 then extended the investigation to capture by relevant cues that possess the target color (e.g., red), which normally produce very large capture effects. As will be seen, we found no evidence of capture at to-be-ignored locations in any of the present experiments, supporting the strong conclusion that truly ignored locations are immune to attentional capture.

Experiment 1

Participants searched a target display for a conjunction of color (red or green) and location (either top/bottom or left/right). Prior to two-thirds of the target displays, four white dots abruptly appeared around one of the four peripheral placeholder boxes (stimulus-onset asynchrony [SOA] = 150 ms; see Fig. 1). This cue location was chosen randomly, so that it was non-predictive of target location and participants would have no incentive to use it to find the target. The other one-third of trials had no cue.

The cue is considered valid when it appears in the same location as the subsequent target (one-quarter of cue present trials), and invalid when it appears in a different location (three-quarters of cue present trials). If the abrupt onset cue captures attention, then RT should be faster for valid cues than invalid cues, because the latter require a shift of spatial attention whereas the former do not. This phenomenon, called the cue validity effect, provides one index of whether the abrupt onset captured attention. It reflects both the benefit of a valid cue plus the cost of an invalid cue, relative to a hypothetical baseline condition with no attentional shift.

In the present design, we never allowed the target to appear at to-be-ignored locations (if we had, then participants might have begun attending to these “ignored” locations). Thus, cues at to-be-ignored locations could never be “valid” and it was not possible to measure the corresponding RT benefit. However, we can measure the cost of capture by invalid cues at ignored locations relative to a neutral baseline. Accordingly, we indexed capture by measuring the presence-absence cost (cf. Theeuwes, 1991): we compared RT on cue-present trials to RT on cue-absent trials. If the onset cue is completely ignored by the attentional system, then there should be no presence-absence cost. But, if the abrupt onset cue captures attention automatically, even at to-be-ignored locations, then there should be an RT cost for cue-present trials relative to cue-absent trials.
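The two indices described above can be sketched in a few lines of Python. This is only an illustration: the condition labels, the dictionary keys, and the RT values are hypothetical and are not taken from the experiment.

```python
from statistics import mean

def capture_indices(trials):
    """Compute two capture indices from a list of trial dicts.

    Each trial is a dict with hypothetical keys:
      'cue': 'absent', 'attended_valid', 'attended_invalid', or 'ignored_invalid'
      'rt' : response time in ms
    """
    by_condition = {}
    for trial in trials:
        by_condition.setdefault(trial["cue"], []).append(trial["rt"])
    m = {cond: mean(rts) for cond, rts in by_condition.items()}
    return {
        # Cue validity effect: invalid minus valid RT at attended locations.
        "validity_effect": m["attended_invalid"] - m["attended_valid"],
        # Presence-absence cost: cue at an ignored location minus no cue.
        "ignored_presence_absence_cost": m["ignored_invalid"] - m["absent"],
    }

# Made-up RTs, for illustration only.
example_trials = [
    {"cue": "absent", "rt": 500},
    {"cue": "absent", "rt": 520},
    {"cue": "attended_valid", "rt": 490},
    {"cue": "attended_valid", "rt": 510},
    {"cue": "attended_invalid", "rt": 520},
    {"cue": "attended_invalid", "rt": 540},
    {"cue": "ignored_invalid", "rt": 505},
    {"cue": "ignored_invalid", "rt": 515},
]
indices = capture_indices(example_trials)
```

With these made-up values, the validity effect is positive (capture at attended locations) while the presence-absence cost at ignored locations is zero, which is the pattern predicted if ignored locations are immune to capture.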

Methods

Participants

Participants were 39 undergraduate students from the University of New Mexico, who received partial course credit in exchange for their participation. Three participants had abnormally high error rates (more than 2 SDs from the group mean) and were excluded. Of the final sample of 36 participants, 20 were female and 16 were male, with a mean age of 20.0 years. All participants had normal color vision, as assessed by the Ishihara color vision test, and self-reported normal or corrected-to-normal visual acuity.

Stimuli

Stimuli were presented on 19-in. CRT monitors, controlled by PCs using E-Prime software (Psychology Software Tools Inc., Sharpsburg, PA, USA). Stimuli were the letters E and H in red (RGB: 255, 0, 0) or green (RGB: 0, 151, 0). Based on a typical viewing distance of 60 cm, the letters subtended a viewing angle of 2° in height and width. Placeholder boxes were gray, unfilled boxes, 2.5° in width and height. There were five placeholder boxes: four peripheral boxes in a cross formation and one central fixation box in the middle (see Fig. 1). The abrupt onset cue consisted of four white dots surrounding one of the peripheral boxes (above, below, left, and right).

Target identity and target location were chosen randomly, as were the identities of each distractor letter. The color of each distractor letter was chosen randomly with the constraint that each axis (horizontal and vertical) contain exactly one red letter and one green letter. The location of the onset cue was also chosen at random and therefore was non-predictive of target location.
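The color constraint can be made concrete with a short sketch. The function and variable names below are hypothetical, and the sampling order is an assumption; only the constraints themselves (one red and one green letter per axis, with the target on the attended axis in the assigned color) come from the description above.

```python
import random

AXES = {"vertical": ("top", "bottom"), "horizontal": ("left", "right")}

def make_display(target_color, attended_axis, rng):
    """Return (display, target_location) for one search display.

    display maps each location to a (color, letter) pair. Each axis
    holds exactly one red and one green letter.
    """
    other_color = "green" if target_color == "red" else "red"
    ignored_axis = "horizontal" if attended_axis == "vertical" else "vertical"
    display = {}

    # Attended axis: the target takes the assigned color at a random
    # attended position; the other attended position gets the other color.
    attended = list(AXES[attended_axis])
    rng.shuffle(attended)
    target_location, nontarget_location = attended
    display[target_location] = (target_color, rng.choice("EH"))
    display[nontarget_location] = (other_color, rng.choice("EH"))

    # Ignored axis: one letter of each color, randomly placed, so a
    # target-colored letter always appears at a to-be-ignored location.
    colors = [target_color, other_color]
    rng.shuffle(colors)
    for location, color in zip(AXES[ignored_axis], colors):
        display[location] = (color, rng.choice("EH"))
    return display, target_location

rng = random.Random(0)  # seeded only to make the example reproducible
display, target_location = make_display("green", "vertical", rng)
```

Because every display contains a target-colored letter on the ignored axis, searching by color alone yields two candidates, and searching by location alone yields two candidates as well.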

Procedure

Each participant was randomly assigned to attend either the top and bottom positions (vertical) or the left and right positions (horizontal) and to attend either red targets or green targets, with the restriction that each of the four possible conjunctions be used equally often. Participants were instructed to report the identity of the target letter (E or H) by pressing the keys labelled E and H (actual keys were Z and M, respectively). They were warned that, to achieve high accuracy, they would need to look for the instructed conjunction of location and color.

Each trial began with empty placeholder boxes for 1,000 ms. When the abrupt onset cue was present (two-thirds of trials), it appeared for 50 ms, followed by the empty placeholder boxes for another 100 ms, followed by the target display for 100 ms. The stimuli then disappeared, leaving only the empty placeholder boxes until response. If the participant made an incorrect response, a low-pitched error tone sounded for 500 ms. Cue absent trials (one-third of all trials) followed the same sequence, except that the 50-ms cue presentation was replaced with another 50-ms view of the placeholder boxes (i.e., there was no change relative to the preceding display).
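The event sequence can be written out as a list of frames to check that it yields the stated 150-ms cue-target SOA. The frame labels and function names are hypothetical; the durations are those given above.

```python
def trial_timeline(cue_present):
    """Frames of one trial as (label, duration in ms) pairs."""
    cue_frame = "onset cue" if cue_present else "placeholders (no change)"
    return [
        ("placeholders", 1000),
        (cue_frame, 50),
        ("placeholders", 100),
        ("target display", 100),
    ]

def cue_target_soa(timeline):
    """Time from the onset of the cue frame to the onset of the target frame."""
    onsets = []
    elapsed = 0
    for _label, duration in timeline:
        onsets.append(elapsed)
        elapsed += duration
    return onsets[3] - onsets[1]

soa_cue_present = cue_target_soa(trial_timeline(cue_present=True))
soa_cue_absent = cue_target_soa(trial_timeline(cue_present=False))
```

Both trial types have the same timing; cue-absent trials differ only in that nothing changes on screen during the 50-ms cue slot.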

Participants first completed a practice block of 64 trials, followed by 12 experimental blocks of 64 trials. Between blocks, they received performance feedback (average RT and accuracy) for the preceding block and were allowed to rest.

Results

We excluded trials with abnormal RTs, less than 200 ms or greater than 1,500 ms (0.37 % of trials). In addition, errors were excluded from RT analyses. The resulting means are shown in Fig. 2 and Table 1.
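A minimal sketch of this two-step exclusion follows, assuming trial records with hypothetical `rt` and `correct` fields. Whether RTs of exactly 200 ms or 1,500 ms were retained is not stated; the sketch keeps them.

```python
def trim_trials(trials, lo=200, hi=1500):
    """Apply the RT cutoffs, then drop errors for the RT analysis.

    Returns (kept, rt_trials, pct_excluded):
      kept         - trials with lo <= rt <= hi (used for error rates)
      rt_trials    - the correct trials among kept (used for mean RT)
      pct_excluded - percentage of all trials removed by the RT cutoffs
    """
    kept = [t for t in trials if lo <= t["rt"] <= hi]
    pct_excluded = 100 * (len(trials) - len(kept)) / len(trials)
    rt_trials = [t for t in kept if t["correct"]]
    return kept, rt_trials, pct_excluded

# Made-up trials, for illustration only.
example = [
    {"rt": 150, "correct": True},    # too fast: excluded
    {"rt": 450, "correct": True},
    {"rt": 1600, "correct": True},   # too slow: excluded
    {"rt": 520, "correct": False},   # error: kept for accuracy, not for RT
    {"rt": 610, "correct": True},
]
kept, rt_trials, pct_excluded = trim_trials(example)
```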

Fig. 2

Response times (ms) in Experiment 1 by cue location and cue validity. Error bars represent the within-subject standard error of the mean (Cousineau, 2005; Morey, 2008)
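For readers unfamiliar with these error bars, the following sketch shows one common way to compute them: Cousineau's (2005) normalization followed by Morey's (2008) correction. The function name and the data are hypothetical, and this is not necessarily the exact procedure used to draw the figure.

```python
from math import sqrt
from statistics import mean, stdev

def within_subject_sem(data):
    """Within-subject SEM per condition.

    data[i][j] is the mean RT of participant i in condition j.
    Step 1 (Cousineau): remove each participant's overall mean and add
    back the grand mean. Step 2 (Morey): scale by sqrt(J / (J - 1)),
    where J is the number of conditions.
    """
    n_conditions = len(data[0])
    grand_mean = mean(mean(row) for row in data)
    normalized = [[x - mean(row) + grand_mean for x in row] for row in data]
    correction = sqrt(n_conditions / (n_conditions - 1))
    return [
        correction * stdev(column) / sqrt(len(column))
        for column in zip(*normalized)
    ]

# Two made-up participants, two conditions.
sems = within_subject_sem([[500, 520], [600, 640]])
```

The normalization removes between-participant differences in overall speed, so the bars reflect only the variability relevant to within-subject comparisons.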

Table 1 Mean response time (ms) and percent error by cue condition in Experiment 1

Response time

The overall pattern of RTs suggests capture by onsets at attended locations (a 20-ms cue validity effect) but not at ignored locations (no presence-absence cost). To formally analyze this trend, we conducted a one-way, within-subject ANOVA on mean RT for absent, attended-invalid, attended-valid, and ignored-invalid trials. This ANOVA revealed a main effect of cue type, F(3, 105) = 16.37, p < .001, η² = .319. Preplanned t-tests then compared mean RTs for the attended locations and then for the ignored locations. At attended locations, participants were slower on invalid trials (531 ms) than on valid trials (511 ms), t(35) = 5.196, p < .001, 95 % confidence interval (CI) [12.3–28.1], indicating a 20-ms cue validity effect. This finding is consistent with previous studies showing attentional capture by onsets (e.g., Gaspelin et al., 2016) and demonstrates the general sensitivity of our paradigm to detect attentional capture effects.

The central question, though, is whether onset cues at ignored locations also captured attention. Mean RT was similar for ignored-invalid trials and cue-absent trials, t(35) = 0.886, p = .382; the 95 % CI [-2.8–7.1] argues against even a modest cost of the onset. Furthermore, mean RTs were faster for ignored-invalid cues than for attended-invalid cues, t(35) = 2.710, p = .010, 95 % CI [1.7–12.1]. These results suggest that onset cues captured attention less effectively when appearing at ignored locations, and perhaps did not capture attention at all.
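The preplanned comparisons are paired t-tests on per-participant condition means. A minimal sketch using only the standard library is given below. The function names and data are hypothetical, and the critical t value for the confidence interval must be supplied by the caller (for df = 35 it is approximately 2.03).

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(cond_a, cond_b):
    """Paired t-test on per-participant means; returns (t, df, mean_diff, se)."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    return mean_diff / se, n - 1, mean_diff, se

def confidence_interval(mean_diff, se, t_crit):
    """CI for the mean difference, given a critical t value from a table."""
    return mean_diff - t_crit * se, mean_diff + t_crit * se

# Three made-up participants (invalid vs. valid mean RT in ms).
invalid = [510, 540, 560]
valid = [500, 520, 530]
t, df, mean_diff, se = paired_t(invalid, valid)
```

A CI on the difference that is narrow and spans zero, as reported for ignored-invalid versus cue-absent trials, bounds how large any undetected capture cost could plausibly be.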

Error rates

The pattern of error rates was similar to that for RT. The same analyses reported above for RTs were also conducted on error rates. The overall one-way ANOVA was significant, F(3, 105) = 5.414, p = .003, η² = .134. Follow-up t-tests showed that error rates were higher for attended-invalid cues (6.1 %) than for attended-valid cues (4.9 %), t(35) = 3.158, p = .003. However, error rates were similar for cue absent trials (4.8 %) and ignored-invalid trials (5.1 %), t(35) = 0.999, p = .325.

Discussion

In this experiment, abrupt onsets could appear at either attended or to-be-ignored locations. We ensured that to-be-ignored locations really were ignored by presenting target-like stimuli there on every trial. For attended locations, we found that RT was slower by 20 ms for invalid onset cues than for valid onset cues (a cue validity effect). Furthermore, invalid cues slowed overall RT compared to cue absent trials (a presence-absence cost). Altogether, this suggests that the onset cues at attended locations captured attention. For ignored locations, however, the picture was quite different. The presence of an invalid onset cue did not cause any noticeable slowing relative to cue absent trials. These data are consistent with the hypothesis that ignored locations are immune to capture.

Experiment 2

In Experiment 1, mean RT when the cue invalidly pointed to a to-be-ignored location was the same as when no cue was presented at all. Taken at face value, this failure to slow responses suggests that salient cues at to-be-ignored locations failed to capture attention. However, one can question whether cue absent trials represent the ideal baseline. Participants might use the precue as a kind of warning signal, alerting them that the target display is about to appear. Because cue-absent trials lack this warning signal, they might be artificially slow. Such an artificial slowing on cue absent trials could mask slowing due to capture on ignored-invalid cue trials.

It is unclear whether people actually do use the cue as an alerting signal, but it is a logical possibility that deserves investigation. In Experiment 2, therefore, we attempt to remedy this potential concern by including a different neutral baseline, suggested by a reviewer, in which the onset cue appeared at fixation. Because a cue was present, this neutral condition should produce an alerting benefit.

An additional change was made in Experiment 2 in the hopes of increasing the power to detect any capture effects from to-be-ignored cues. Specifically, we used a somewhat more difficult search task: looking for the target circle and rejecting the oval distractor. Previous research has shown that increasing search difficulty increases cue validity effects from abrupt onsets (Gaspelin et al., 2016, Experiment 7). An added benefit would be demonstrating that the findings of Experiment 1 generalize to a different search task.

Methods

The methods were identical to those of Experiment 1, except as noted below.

Participants

Thirty-eight undergraduates from the University of California, Davis participated in exchange for partial course credit. Of the final sample of 38 participants, three were male and 35 were female. The mean age was 19.9 years.

Stimuli

Stimuli were presented using Psychophysics Toolbox (Brainard, 1997) for Matlab on 24-in. LCD monitors with a black background, viewed from a distance of 70 cm. The target was a perfect circle (a diameter of 1.75°) whereas the distractor was an ellipse (1.9° × 1.6°). All shapes were gray (RGB: 119, 119, 119). Within each circle and ellipse was a black dot (0.1°) placed 0.3° from either the left or right edge. All other stimulus parameters (e.g., eccentricity, box placeholder dimensions, etc.) were identical to Experiment 1.

Procedure

As in Experiment 1, half of the participants were instructed to search only the left and right positions (i.e., the horizontal row) and the other half were instructed to search only the top and bottom positions (i.e., the vertical column). Participants were to indicate whether the black dot inside the target circle was located on the left or right side of the circle. They pressed the ‘Z’ key for ‘left’ and the ‘M’ key for ‘right’. On two-sevenths of the trials, the cue was absent. On the remaining five-sevenths of the trials, the cue was equally likely to occur in one of five possible positions (left, right, top, bottom, center).
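The resulting proportion of trials in each analysis condition can be worked out with exact fractions. The sketch assumes, as in Experiment 1, that cue location is independent of target location and that the target is equally likely to appear at either attended position. The function name and condition labels are hypothetical.

```python
from fractions import Fraction

def condition_proportions(p_absent, n_attended, n_ignored, has_center):
    """Expected proportion of all trials falling in each cue condition."""
    n_positions = n_attended + n_ignored + (1 if has_center else 0)
    p_position = (1 - p_absent) / n_positions  # cue at one given position
    proportions = {
        "absent": p_absent,
        # A cue at an attended position is valid only if the target is there.
        "attended_valid": p_position,
        "attended_invalid": (n_attended - 1) * p_position,
        "ignored_invalid": n_ignored * p_position,
    }
    if has_center:
        proportions["center"] = p_position
    return proportions

exp2 = condition_proportions(Fraction(2, 7), 2, 2, has_center=True)
exp1 = condition_proportions(Fraction(1, 3), 2, 2, has_center=False)
```

Under these assumptions, Experiment 2 has two-sevenths cue-absent, two-sevenths ignored-invalid, and one-seventh each of center, attended-valid, and attended-invalid trials. The same calculation for Experiment 1 gives valid cues on one-sixth of all trials, that is, one-quarter of cue-present trials, matching the description given there.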

Results

We excluded trials with abnormal RTs, less than 200 ms or greater than 1,500 ms (0.36 % of trials). In addition, errors were excluded from RT analyses. The resulting means are shown in Fig. 3 and Table 2.

Fig. 3

Response times (ms) in Experiment 2 by cue location and cue validity. Error bars represent the within-subject standard error of the mean (Cousineau, 2005; Morey, 2008)

Table 2 Mean response time (ms) and percent error by cue condition in Experiment 2

Response time

The RT data showed capture effects by onsets at attended locations (a 23-ms cue validity effect). Yet there was virtually no evidence of capture by cues at ignored locations: mean RTs for ignored-cue, cue-absent, and central-cue trials were indistinguishable. We analyzed these data with a one-way, within-subjects ANOVA on mean RT for absent, center-cued, attended-invalid, attended-valid, and ignored-invalid trials. This ANOVA revealed a main effect of cue type, F(4, 148) = 7.93, p < .001, η² = .177. Preplanned t-tests then compared mean RTs for the attended locations and then for the ignored locations. At attended locations, participants were slower on invalid trials (653 ms) than on valid trials (630 ms), t(37) = 4.454, p < .001, 95 % CI [12.5–33.5], indicating a significant 23-ms cue validity effect. Participants were also slowed on attended-invalid trials relative to center-cue trials, t(37) = 2.192, p = .035, 95 % CI [0.8–19.8], and relative to cue-absent trials, t(37) = 3.236, p = .003, 95 % CI [5.7–24.6]. Regardless of which baseline condition is used, invalid cues at attended locations significantly disrupted target detection in this task.

The central question, though, is whether onset cues at ignored locations also captured attention. Mean RT was 4 ms slower for ignored-invalid trials than cue-absent trials, although this modest difference was not statistically significant, t(37) = 1.25, p = .220, 95 % CI [-2.5–10.9]. Meanwhile, mean RT on ignored-invalid trials was very similar to that obtained on center-cued trials, differing non-significantly by only 1 ms, t(37) = .204, p = .839, 95 % CI [-7.6–6.2]. Furthermore, mean RTs were faster for ignored-invalid cues than attended-invalid cues, t(37) = 2.684, p = .011, 95 % CI [2.7–19.3]. All of these results suggest that the onset cues at ignored locations did not disrupt target detection.

Error rates

The overall one-way ANOVA on error rates was non-significant, F(4, 148) = 1.83, p = .148, η² = .047. Error rate trends were consistent with the RT effects.

Discussion

Experiment 2 replicated the findings of Experiment 1. Because there is no universally agreed upon “neutral” condition, we included two different neutral conditions: a cue-absent condition (as in Experiment 1) and a new central-cue condition. Regardless of which neutral condition is used, onset cues at to-be-ignored locations failed to produce a detectable cost on RT to the target. Incidentally, the current data provide no evidence that cues produce an alerting benefit (mean RT was very similar for cue absent and center cue trials). We conclude that salient onset cues at ignored locations not only produce less capture cost than those at attended locations, but might not produce any noticeable cost at all.

Experiment 3

Abrupt onsets, despite being generally regarded as the most potent type of salient stimulus (Franconeri & Simons, 2003; Jonides & Yantis, 1988), had little or no effect when presented in a to-be-ignored region of visual space. To provide an even stricter test of the ability of spatial filtering to suppress capture, Experiment 3 examined an even more potent type of stimulus: those that match the observer’s attentional set. These stimuli, sometimes called relevant cues, consistently capture spatial attention even more strongly than salient-but-irrelevant cues such as abrupt onsets (e.g., Folk et al., 1992).

Therefore, instead of presenting abrupt onsets as cues, we presented color cues that could match either the target color or the distractor color. More specifically, the cue consisted of a change in one peripheral box from white to either green or red. For attended locations, we expected to replicate previous research showing contingent capture: target-colored cues should strongly capture attention, producing a large cue validity effect, but distractor-colored cues should not.

The key question is whether target-colored cues will also capture attention when presented at to-be-ignored locations. As in Experiments 1 and 2, the target never appeared at to-be-ignored locations. Thus, these to-be-ignored locations can never have a valid cue and therefore we cannot calculate the benefit of a valid cue. Instead, we compared RT for ignored target-colored cues to RT for ignored distractor-colored cues. This is a variant of the presence-absence cost used in Experiments 1 and 2, except that here it is the target color that is either present or absent. If the target-colored cues do capture attention at ignored locations, then we should observe an RT cost for target-colored cues relative to distractor-colored cues. In fact, RTs for invalid target-color cues might be just as long at ignored locations as at attended locations. But, if cues at to-be-ignored locations cannot capture attention, and instead are successfully ignored, then it should make little or no difference whether these cues have the target color or distractor color (i.e., no RT cost).

Methods

The methods were identical to those of Experiment 1, except as noted below.

Participants

Thirty-seven undergraduates from the University of New Mexico participated in exchange for partial course credit. One participant had an abnormally high error rate (more than 2 SDs from the group mean; more than 25 %) and was excluded. Of the final sample of 36 participants, seven were male and 29 were female. Their mean age was 20.5 years.

Stimuli

Whereas the cues in Experiment 1 were four abruptly onsetting white dots around one of the possible target locations, here the cues consisted of a color change in one of the peripheral placeholder boxes. Specifically, one of the boxes changed to either the target color or the distractor color, and the other three boxes changed to either blue or yellow (i.e., neutral colors). By changing the colors at all four target locations, we ensure that capture by the cue is due specifically to the color (relevant or irrelevant) and not merely to color change. The new colors were presented for 100 ms, before changing back to white for 50 ms (i.e., the cue-target SOA was once again 150 ms). Then, the target display appeared for 100 ms, as in Experiment 1.
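The cue frame can be sketched as follows. How blue and yellow were assigned to the three remaining boxes is not specified above, so the sketch simply assumes an independent random choice for each box. All names are hypothetical.

```python
import random

LOCATIONS = ("top", "bottom", "left", "right")
NEUTRAL_COLORS = ("blue", "yellow")
CUE_MS, BLANK_MS = 100, 50
SOA_MS = CUE_MS + BLANK_MS  # cue onset to target onset

def make_color_cue(cue_location, cue_color, rng):
    """Box colors during the cue frame.

    The cued box takes cue_color (red or green). Every other box takes
    a neutral color, chosen independently here (an assumption).
    """
    frame = {}
    for location in LOCATIONS:
        if location == cue_location:
            frame[location] = cue_color
        else:
            frame[location] = rng.choice(NEUTRAL_COLORS)
    return frame

def cue_type(cue_color, target_color):
    """Classify a cue relative to the participant's assigned target color."""
    return "target-colored" if cue_color == target_color else "distractor-colored"

rng = random.Random(1)  # seeded only to make the example reproducible
cue_frame = make_color_cue("left", "red", rng)
```

Because all four boxes change color at once, the cued box is not unique in changing; it is unique only in taking a task-related color.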

Results

We excluded trials with abnormal RTs, less than 200 ms or greater than 1,500 ms (0.53 % of trials). In addition, errors were excluded from RT analyses. The resulting means are shown in Fig. 4 and Table 3.

Fig. 4 Response time (ms) in Experiment 3 by cue type. Error bars represent the within-subject standard error of the mean (Cousineau, 2005; Morey, 2008)
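
For readers unfamiliar with within-subject error bars, the following is a minimal sketch of the commonly described Cousineau (2005) normalization with the Morey (2008) correction. It assumes one mean RT per participant per condition; the function name and data layout are hypothetical and may differ from the procedure actually used for the figure.

```python
import math
import statistics


def within_subject_sem(data):
    """data: list of per-participant lists, one mean RT per condition.

    Returns one within-subject SEM per condition.
    """
    n = len(data)        # number of participants
    c = len(data[0])     # number of conditions
    grand = statistics.mean(x for row in data for x in row)
    # Cousineau normalization: remove each participant's own mean, add the grand mean.
    normed = [[x - statistics.mean(row) + grand for x in row] for row in data]
    # Morey correction: scale the variance by c / (c - 1), i.e. the SEM by its square root.
    correction = math.sqrt(c / (c - 1))
    return [correction * statistics.stdev(col) / math.sqrt(n) for col in zip(*normed)]
```

Because between-participant differences in overall speed are removed before the SEM is computed, these error bars reflect only the variability relevant to within-subject comparisons.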

Table 3 Mean response time (ms) and percent error by cue condition in Experiment 3

Response time

Overall, the data indicate capture only for target-colored cues in attended regions of visual space. For attended locations, the cue validity effect was 33 ms for target-colored cues, but -1 ms for distractor-colored cues. A two-way within-subjects ANOVA (cue validity × cue relevance) on mean RT for attended cues confirmed that the cue validity effect was significant overall, F(1, 35) = 30.781, p < .001, \( {\eta}_p^2 \) = .468, and interacted significantly with cue relevance, F(1, 35) = 39.306, p < .001, \( {\eta}_p^2 \) = .529. Beyond modulating the effect of cue validity, cue relevance had no main effect, F(1, 35) = 1.824, p = .185, \( {\eta}_p^2 \) = .050. Follow-up t-tests confirmed that the 33-ms cue validity effect for target-colored cues was statistically significant, t(35) = 7.267, p < .001, 95 % CI [24.1–42.9], but the -1 ms cue validity effect for distractor-colored cues was not, t(35) = .200, p = .843, 95 % CI [-7.4–6.1].
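
As a consistency check on the effect sizes reported above, partial eta squared can be recovered from an F ratio and its degrees of freedom. This is the standard identity rather than anything specific to the present analysis. Applied to the main effect of cue validity, it reproduces the reported value up to rounding:

\[ \eta_p^2 = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}} = \frac{30.781 \times 1}{30.781 \times 1 + 35} \approx .468 \]

The same calculation for the interaction gives \( 39.306 / (39.306 + 35) \approx .529 \).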

The main question was whether target-colored cues can capture attention even when appearing at ignored regions of visual space. The data indicate that they did not. At to-be-ignored locations, there was no detectable slowing for target-colored cues (541 ms) versus distractor-colored cues (542 ms), t(35) = .440, p = .662, 95 % CI [-7.3–4.7]. Similarly, there was no detectable slowing for invalid target-colored cues at ignored locations (541 ms) versus invalid distractor-colored cues at attended locations (544 ms), t(35) = .881, p = .384, 95 % CI [-9.9–3.9]. Just as distractor-colored cues fail to capture attention (regardless of whether they are at attended or ignored locations), so do target-colored cues at ignored locations. Meanwhile, RT for target-colored cues was 23 ms faster when presented in ignored-invalid locations than in attended-invalid locations, t(35) = -6.648, p < .001, 95 % CI [-30.3– -16.1], confirming greater capture effects at attended locations. In summary, there was no hint that cues presented at ignored locations could capture spatial attention, even when they were target-colored (relevant).
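
The pairwise comparisons above are paired (within-subject) t-tests on per-participant mean RTs. The sketch below shows the computation under stated assumptions: the function name `paired_t` is hypothetical, and the two-tailed critical value `t_crit` must be supplied by the caller because the Python standard library has no t quantile function (for 35 degrees of freedom it is approximately 2.03).

```python
import math
import statistics


def paired_t(cond_a, cond_b, t_crit):
    """Paired t-test on per-participant means from two conditions.

    Returns (mean difference, t statistic, degrees of freedom, confidence interval).
    t_crit is the two-tailed critical t value for n - 1 degrees of freedom.
    """
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    t = mean_diff / se
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    return mean_diff, t, n - 1, ci
```

A confidence interval that is narrow and centered near zero, as for the ignored-location comparisons, is what licenses arguing against even a modest capture effect rather than merely failing to detect one.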

Error rates

Error rates (see Table 3) varied across a narrow range (4.8–5.8 %) but the pattern was generally consistent with the RT data. A two-way within-subjects ANOVA (cue validity × cue relevance) for attended cues revealed no significant overall main effect of cue validity, F(1, 35) = 1.632, p = .210, \( {\eta}_p^2 \) = .045, though the interaction with cue relevance nearly reached significance, F(1, 35) = 3.322, p = .077, \( {\eta}_p^2 \) = .087. Cue relevance had no main effect, F(1, 35) = .299, p = .588, \( {\eta}_p^2 \) = .008. Follow-up t-tests showed that the cue validity effect on error rates was significant for target-colored cues, t(35) = 2.399, p = .022, 95 % CI [-2.1– -0.2], but not for distractor-colored cues, t(35) = .384, p = .703, 95 % CI [-0.9–1.4]. For ignored cues, error rates for target-colored cues (5.3 %) were nearly identical to those for distractor-colored cues (5.4 %), t(35) = .316, p = .754, 95 % CI [-0.7–0.9].

Discussion

In this experiment, we presented color cues that could either match or mismatch the target color. Relevant cues possessing the target color produced a substantial cue validity effect (33 ms) when appearing at attended locations, indicating capture of spatial attention. Distractor-color cues, meanwhile, produced no detectable cue validity effect. This is the classic contingent capture effect, replicating many previous studies (e.g., Folk et al., 1992). Despite the apparent potency of target-colored cues when presented at attended locations, they produced no discernable cost on overall RT or error rates when presented at ignored locations. We conclude that spatial filtering can override attentional capture, even for relevant cues that normally produce large capture effects.

General discussion

Many previous studies have provided evidence that stimuli can rapidly and involuntarily capture spatial attention based on either salience (e.g., Gaspelin et al., 2016; Theeuwes, 1992; Yantis & Jonides, 1984) or relevance (e.g., Folk et al., 1992; Folk et al., 2002; Lien, Ruthruff, & Johnston, 2010). The present study investigated whether salient stimuli and relevant stimuli can also capture attention when appearing at ignored locations. In other words, which is the more potent force for guiding visual attention – attentional capture or spatial filtering?

A few previous studies have examined this issue, with mixed results. Some have found capture at to-be-ignored locations (Folk & Remington, 1996; Folk et al., 2002), whereas others did not (Theeuwes, 1991; Yantis & Jonides, 1990). We proposed that strong spatial filtering provides immunity to capture. However, strong filtering occurs only when people have sufficient incentive to establish a spatial filter. The present experiments created such an incentive by presenting target-like stimuli at both the to-be-attended and to-be-ignored locations during a visual search. A failure to instantiate a strong spatial filter would have resulted in unacceptably high error rates.

Experiment 1 provided evidence that abrupt onsets – a classic example of a salient stimulus – captured attention at attended locations but had little or no impact at ignored locations. At ignored locations, we found no presence-absence cost: RT when abrupt onsets cued an ignored location (524 ms) was indistinguishable from RT when the cue was absent (522 ms). The 95 % CI placed the effect between -2.6 ms and 7.0 ms, allowing us to argue against even a modest effect of abrupt onsets. This lack of an effect was not due to insensitivity of our paradigm to capture effects – when onsets cued attended locations, they produced substantial capture effects. Experiment 2 replicated these findings with a different search task – searching for a perfect circle rather than an oval – and an additional “neutral” baseline condition in which the center box was cued.

Experiment 3 examined an even more potent kind of stimulus – one that possesses the critical feature used to locate the target (in this case, the color of the target). When presented at attended locations, relevant cues strongly captured attention. RT was 33 ms faster when this target-colored cue appeared at the location of the upcoming target (i.e., valid cues) than when it appeared at the other attended location (invalid cues). However, these relevant cues lost their potency when presented at ignored locations. Here, RT was indistinguishable between target-color cues (541 ms) and distractor-color cues (542 ms) – task relevance no longer produced any discernable cost.

Compatibility effects

If cues capture attention to their location, then one would expect enhanced processing of the distractor character that subsequently appears there. Enhanced processing of the cued distractor would then increase the effect of distractor-target compatibility on RT. As shown in Table 4, compatibility effects were large for cued distractors in attended locations: 36 ms for onset cues in Experiment 1 and 31 ms for target-colored cues in Experiment 3. (Note that the same compatibility effect does not apply to the circle vs. oval search task of Experiment 2.) Compatibility effects were negligible, meanwhile, for cued distractors in ignored locations: 8 ms for onset cues in Experiment 1 and 1 ms for target-colored cues in Experiment 3. This lack of compatibility effect provides converging evidence for the conclusion that attention was not captured effectively by cues in ignored locations.

Table 4 Mean response time (ms) by target-distractor compatibility, cue condition, and cue type in Experiments 1 and 3

The attentional window account

The present findings are consistent with the attentional window account (Belopolsky et al., 2007; Belopolsky & Theeuwes, 2010; Theeuwes, 1994a; Theeuwes, 2004), which assumes that bottom-up attention capture occurs only when the salient object falls within the observer’s attentional window. That account was designed in part to explain capture by color singletons under parallel search but not serial search. The idea is that, in a serial search, the attentional window is narrow and therefore does not include the salient object (unless the search happens to come across the salient object by accident).

For the case of abrupt onsets, however, the actual pattern of results goes in the other direction (Gaspelin et al., 2012; Gaspelin et al., 2016): strong capture under difficult (arguably serial) search but minuscule capture under easy (arguably parallel) search. This pattern was obtained even when search difficulty varied randomly from trial to trial, so that participants could not adjust their attentional set for different levels of search difficulty. It has also been replicated using several different types of visual search (colors, letters, and shapes). Gaspelin et al. (2016) proposed that onsets generally capture attention, even under difficult (arguably serial) search, and that the costs of that capture scale up with search difficulty. In sum, the present findings are consistent with the core assumption of the attentional window account, although there is reason to question the additional assumption that capture cannot occur during serial search.

Can reward stimuli overpower the spatial filter?

Although we conclude that salient stimuli (abrupt onsets) and relevant stimuli appearing outside the focus of attention cannot capture attention, there is some evidence that reward-associated stimuli can override the spatial filter. Munneke, Belopolsky, and Theeuwes (2016) indicated the target location using a 100 % valid line cue. In blocks without reward, an abrupt onset simultaneous with the target display failed to produce any cost relative to cue-absent trials, consistent with Yantis and Jonides (1990) as well as the present findings. With rewards, however, all onsets – high, low, and no reward – produced a substantial presence-absence cost (~22 ms in Experiment 1). To account for the emergence of capture even by non-rewarded onsets, the authors proposed that rewards induce participants to strategically attend to non-cued locations. To deter this strategy, Experiment 3 shortened time deadlines and presented rewards on only 12.5 % of trials. This modification appeared to help, as no-reward onsets produced only a 3-ms presence-absence cost (n.s.), whereas high-reward onsets produced an ~8 ms cost (p < .05). If one assumes that participants no longer attended non-cued positions, then the results suggest that rewarded stimuli might have more power to break through the spatial filter than merely salient or relevant stimuli.

Relation to previous research

Our results clearly demonstrate that spatial filtering can dominate attentional capture. The present results support the original Yantis and Jonides (1990) finding and show that it is not limited to the case of a known target location, but rather extends to visual search as well. We also extended this finding from salient stimuli (abrupt onsets) to an even more potent type of capture cue: relevant stimuli. We therefore propose that capture by salient and relevant stimuli is prevented at excluded spatial locations and that previous findings to the contrary reflect weak (incomplete) spatial filtering due to insufficient incentives. This pattern – that processing of ignored items approaches zero as the incentives and opportunities for spatial filtering increase – is a recurring theme in the attention literature (for example, see Gaspelin, Ruthruff, & Jung, 2014; Lachter et al., 2004; Ruthruff & Miller, 1995).

It might seem paradoxical to claim that an object cannot capture attention unless it is already attended. The paradox fades, however, if spatial attention is assumed to be graded rather than all-or-none or, similarly, if there are two kinds of attention (diffuse vs. focused). Objects in locations that are completely filtered out cannot capture attention, but objects already subject to diffuse attention can capture a larger share of spatial attentional resources (focused attention). This view is consistent with the attentional window hypothesis of attention capture (Belopolsky et al., 2007; Leonard, Lopez-Calderon, Kreither, & Luck, 2013; Leonard et al., 2015).

It is unclear in the present experiments whether spatial filtering consists of enhancement (boosting) of processing at attended locations or suppression of processing at to-be-ignored locations. In line with the latter position is a new hybrid model of attentional capture called the signal suppression model (Gaspelin, Leonard, & Luck, 2015, 2017; Sawaki & Luck, 2010), which proposes that people can avoid capture by salient items via an active suppression mechanism. Research on signal suppression models has focused exclusively on suppression of salient features such as color. Future research might explore whether similar evidence of suppression of processing, below baseline levels, also occurs for spatial locations.

Suppression could prevent onsets and relevant cues from ever capturing attention at ignored locations. A related possibility is that capture occurred at both attended and ignored locations, but attention was then very rapidly repelled away from ignored locations (see, e.g., Theeuwes, 1994b). Kiss, Grubert, and Eimer (2013) reported a case in which behavioral capture effects were not statistically significant, yet N2pc effects (an electrophysiological measure thought to index attentional allocation) were still observed (see also Grubert & Eimer, 2016). For many practical purposes, it might matter little whether attention never goes to ignored locations, or is merely repelled very quickly. But, for theoretical purposes, this distinction does matter, so it deserves further investigation.

The present findings also help explain the phenomenon of inattentional blindness (e.g., Simons & Rensink, 2005). If salient and surprising events fail to capture attention while people are focusing attention on other locations, objects or streams, these salient events might not be processed sufficiently to be noticed and remembered.

Concluding remarks

The present data suggest that when observers have incentive to strongly filter out irrelevant spatial locations, those locations can become immune to capture by both salient stimuli and relevant stimuli. We propose that previous evidence of capture at to-be-ignored locations (e.g., Folk & Remington, 1996) was obtained only because those locations were not in fact filtered out much of the time, due to insufficient incentives.

The present conclusion is somewhat surprising given that capture is often beneficial. The main goal of capture would seem to be to allow observers to rapidly orient to unexpected but important events. Turning off this beneficial function at ignored locations could be hazardous, causing an organism to fail to avoid dangerous objects and events (e.g., flying rocks and spears), or to miss out on unexpected opportunities (e.g., fruit or prey). One possible reconciliation is that people typically maintain strong spatial filters only for brief periods of time, only under strong incentives, and only when deemed safe to do so.