Numerous studies have provided evidence that salient stimuli can capture attention against our will (e.g., Jonides & Yantis, 1988; Theeuwes, 1992; Yantis & Jonides, 1984). These findings support stimulus-driven theories, which assert that visual attention is at the whim of the most salient feature in the environment. The fact that Internet advertisements and marketing billboards routinely feature colorful, moving, or otherwise salient visual images suggests widespread belief that salience guides attention. Traditionally, research on attentional capture has focused on two types of salience: a uniquely colored item among homogeneously colored items (called color singletons) and an object that appears suddenly within a visual scene (called abrupt onsets).

Puzzlingly, some laboratory studies have found evidence of capture by salient visual objects while many others have not (e.g., Folk, Remington, & Johnston, 1992; Lien, Ruthruff, Goodin, & Remington, 2008; Lien, Ruthruff, & Cornett, 2010a; Lien, Ruthruff, & Johnston, 2010b). Resolving much of this empirical discrepancy, a recent study by Gaspelin, Ruthruff, and Lien (2016) showed that—for the case of abrupt onsets—studies typically obtain robust capture effects from abrupt onsets only when the visual search is relatively difficult. To explain this pattern of results, they proposed the attentional dwelling hypothesis: abrupt onsets tend to capture attention, but the costs and benefits of capture depend critically on how long spatial attention dwells on the nontarget items during visual search (which is a function of search difficulty). On this view, onsets routinely capture attention but capture effects become noticeable only under difficult search. The present study investigated whether a similar experimental approach would also reveal evidence of capture by color singletons under difficult search. Relatedly, we also investigated whether attentional dwelling applies generally, to any salient stimulus that captures attention.

Background: The attentional dwelling hypothesis

Gaspelin et al. (2016) pointed out that the literature on capture by abrupt onsets contains an odd—and theretofore unexplained—mixture of strong effects and null effects. After surveying this literature, they noticed an important trend: Capture by abrupt onsets had been reported in 88% of studies using visual search for a letter (a relatively difficult search), but in only 17% of studies using visual search for a color (a relatively easy search). They therefore hypothesized that abrupt onsets captured attention in all studies, but the impact of capture on visual search depended strongly on search difficulty. In an easy search, the nontarget items within the search display (which we henceforth refer to as distractors) are dissimilar from the target. Accordingly, there might be very little benefit of capturing attention to the target location, because the target would have been found quickly even without capture. Likewise, there might be very little cost of capturing attention to a nontarget location, because that nontarget item would be rejected quickly (as would any other nontargets searched on the way to finding the target). In other words, easy search suffers from a kind of floor effect that makes capture by abrupt onsets difficult to detect (latent).

To support this theory, Gaspelin et al. (2016) first needed to confirm the importance of search difficulty in determining capture effects, because no previous study had ever directly compared the different kinds of visual searches side by side. They relied primarily upon a spatial cuing paradigm (Folk et al., 1992), in which four white dots appeared abruptly 150 ms prior to the search display. These dots are the cue, defined as the item whose ability to capture attention is being assessed. We will use the more specific term precue whenever the cue appears prior to the target display. Importantly, the precue was presented at a random location in the search display, so that it was nonpredictive of the upcoming target location, giving participants no incentive to attend the precue location over any other location. The critical index of capture was the cue validity effect: the degree to which response time (RT) is slower when the precue is invalid (i.e., not at the target location) rather than valid (at the target location). The underlying assumption is that, when a precue captures spatial attention, it shortcuts the search on valid trials and prolongs the search on invalid trials (see Folk et al., 1992).
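
As an illustration of the cue validity effect defined above, the following minimal Python sketch computes mean invalid RT minus mean valid RT. The trial record layout, the function name, and the example values are hypothetical and are not taken from the original studies.

```python
from statistics import mean


def cue_validity_effect(trials):
    """Return mean invalid RT minus mean valid RT (in ms).

    `trials` is an iterable of (rt_ms, cue_location, target_location)
    tuples; this layout is a hypothetical one chosen for illustration.
    """
    valid = [rt for rt, cue, target in trials if cue == target]
    invalid = [rt for rt, cue, target in trials if cue != target]
    if not valid or not invalid:
        raise ValueError("need at least one valid and one invalid trial")
    return mean(invalid) - mean(valid)


# Made-up example data; locations are indexed 0-3.
example_trials = [
    (520, 0, 0),  # valid: cue at the target location
    (540, 2, 2),  # valid
    (560, 1, 3),  # invalid: cue at a nontarget location
    (580, 3, 0),  # invalid
]
effect = cue_validity_effect(example_trials)
```

A positive value indicates slower responses on invalid than on valid trials, the pattern expected if the cue captured attention.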

Gaspelin et al. (2016) confirmed that, despite using identical search displays of colored letters across search types, letter search consistently produced much larger capture effects than did color search (e.g., 26 ms for letter search but only 2 ms for color search in Experiment 1). Importantly, they further showed that the critical variable was not search type (letter vs. color) per se, but rather search difficulty (easy vs. difficult). Across a series of experiments using different stimulus categories (letter, color, and shape), difficult search consistently yielded larger capture effects by abrupt onsets than did easy searches. Furthermore, the relationship across search conditions between cue validity effects and search difficulty (as indexed by overall RT) was remarkably linear (r = .97).

Importantly, this finding held even when difficulty varied randomly from trial to trial. Thus, at the time of the abrupt onset precue, the upcoming search difficulty for that trial was as yet unknown to the participant. This finding is crucial because it shows that the search difficulty effect cannot be attributed to strategic adjustments to the attentional set based on anticipated search difficulty. More generally, it implies that search difficulty did not modulate the probability of capture at the time of the precue. Instead, search difficulty must have modulated cognitive processes during the visual search itself. Specifically, Gaspelin et al. (2016) proposed that search difficulty influenced how long attention dwelled on distractors prior to locating the target. When visual search is difficult, capture costs are high on invalid trials because the cued distractor closely resembles the target and takes additional time to reject. Even after the cued distractor item is rejected, it also takes additional time to reject any other distractors searched along the way to locating the target. Conversely, capture benefits are large on valid trials because the target would be the first item searched, avoiding what could otherwise have been a very time-consuming search.
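
The logic of this argument can be made concrete with a toy model. The sketch below is our own illustration and is not a model proposed by Gaspelin et al. (2016): it assumes a serial, self-terminating search in random order, a fixed dwell time per rejected distractor, and a fixed probability of capture. All function names, parameter names, and default values are hypothetical.

```python
def expected_rt(dwell_ms, n_items=8, p_capture=1.0, valid=False, base_ms=400.0):
    """Expected RT (ms) under a toy serial, self-terminating search.

    Illustrative assumptions: items are inspected one at a time in random
    order, each rejected distractor costs `dwell_ms`, and `base_ms` covers
    all other processing.
    """
    # Without capture, the target is equally likely at any serial position,
    # so (n_items - 1) / 2 distractors are rejected on average.
    uncaptured = (n_items - 1) / 2
    if valid:
        # Capture lands on the target: no distractor is inspected.
        captured = 0.0
    else:
        # Capture lands on a distractor: reject it, then search the other
        # n_items - 1 items (one target and n_items - 2 distractors).
        captured = 1 + (n_items - 2) / 2
    rejected = p_capture * captured + (1 - p_capture) * uncaptured
    return base_ms + dwell_ms * rejected


def validity_effect(dwell_ms, **kwargs):
    """Invalid minus valid expected RT for a given dwell time."""
    invalid_rt = expected_rt(dwell_ms, valid=False, **kwargs)
    valid_rt = expected_rt(dwell_ms, valid=True, **kwargs)
    return invalid_rt - valid_rt
```

Under these assumptions the validity effect equals p_capture × dwell_ms × n_items / 2, so it grows in proportion to dwell time even though the probability of capture is held constant. This is the qualitative pattern that the attentional dwelling hypothesis attributes to search difficulty.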

The attentional dwelling hypothesis assumes that attention largely remains at the cued location until the search display appears. Attention would eventually disengage completely, but does not do so over the short 150-ms stimulus-onset asynchrony (SOA) commonly used in spatial cuing studies. Even if very rapid disengagement were possible (see Theeuwes, Atchley, & Kramer, 2000), there is arguably no great incentive to do so during the SOA given that the cued location is just as likely as any other location to contain the future target. Once the search display appears, and it becomes clear that the cued location contains a distractor rather than a target, then disengagement can begin.

The authors reached several conclusions. First, abrupt onsets frequently do capture attention. Second, spatial attention does not fully disengage from that location prior to the onset of search display. Third, variation in capture effects between experiments in the literature might not reflect differences in the probability of capture (as is often assumed), but rather differences in the costs of capture. Fourth, once search difficulty is accounted for, the literature on capture by abrupt onsets, which once seemed puzzlingly inconsistent, actually turns out to be remarkably consistent. Fifth, it can be difficult to detect capture effects under easy visual search, so an important practical recommendation is that capture studies should employ difficult visual search. The present study adopts this advice to determine whether color singletons can capture attention.

Capture by color singletons?

In addition to abrupt onsets, another widely studied class of salient stimuli is the color singleton, a uniquely colored object presented among homogeneously colored background objects. A real-world example would be a lone yellow daisy in a field of green grass, or a yellow “wet floor” sign in a white hallway. These singletons do seem to “pop out” of a display and are extremely easy to find when one actively looks for them (e.g., Jonides & Yantis, 1988; Theeuwes, 1990; Yantis & Egeth, 1999). But do they capture attention when one is not looking for them (i.e., involuntarily)?

Early studies by Theeuwes (1991, 1992) showed that when participants searched for a shape target among homogeneous distractors (e.g., a target circle among many diamond distractors), RT was slowed by the presence of a seemingly irrelevant color singleton at a nontarget location. This additional singleton cost has been taken as evidence that color singletons can involuntarily capture spatial attention.

Bacon and Egeth (1994), however, later questioned whether this capture was truly stimulus driven. They pointed out that participants might have located the shape singleton target not by looking for a particular shape feature but rather by looking for any singleton (following a suggestion by Pashler, 1988). If participants did use this shortcut strategy, known as singleton-detection mode, then the color singleton might have captured attention based upon task relevance rather than based on pure salience, and therefore the findings would be consistent with goal-driven models. Thus, to isolate capture induced purely by salience, it was necessary to discourage singleton-detection mode. After replicating the results of Theeuwes (1992), Bacon and Egeth then tried to eliminate singleton-detection mode in two different ways. Experiment 2 presented multiple target circles on most trials, so that the target was no longer defined by being a singleton. Experiment 3, meanwhile, made the search distractors heterogeneous (replacing some of the background diamonds with triangles and squares) rather than homogeneous. Both attempts to deter singleton-detection mode nearly eliminated capture effects by color singletons. Hence, Bacon and Egeth concluded that color singletons do not have the inherent power to capture attention against our will (see also Leber & Egeth, 2006).

Overall, the literature supports the conclusion that heterogeneity of the search display is an important determinant of whether color singletons produce capture costs. Experiments with homogeneous distractors consistently report capture by color singletons (e.g., Bacon & Egeth, 1994, Experiment 1; Gaspelin, Leonard, & Luck, 2015, Experiment 1; Gaspelin, Leonard, & Luck, 2017, Experiment 1; Kim & Cave, 1999; Lamy, Carmel, Egeth, & Leber, 2006, mixed singleton condition; Lamy, Tsal, & Egeth, 2003, Experiment 1; Leber & Egeth, 2006, singleton group; Moher, Abrams, Egeth, Yantis, & Stuphorn, 2011; Theeuwes, 1991, 1992, 1994), whereas those with heterogeneous distractors consistently report much smaller capture costs (e.g., Bacon & Egeth, 1994, Experiments 2–3; Folk & Annett, 1994; Franconeri & Simons, 2003, Experiment 1; Gaspelin et al., 2015, Experiments 2–4; Gaspelin et al., 2017, Experiments 2–3; Gibson & Jiang, 1998; Gibson & Peterson, 2001; Jonides & Yantis, 1988; Lamy et al., 2006, multiple target condition; Lamy & Tsal, 1999; Lamy, Tsal, & Egeth, 2003, Experiment 2; Leber & Egeth, 2006, feature group; Lien et al., 2008; Lien, Ruthruff, & Cornett, 2010a; Lien, Ruthruff, & Johnston, 2010b; Theeuwes, 2004, Experiment 2; von Mühlenen & Conci, 2009, 2016).

The present study

The present study varied search difficulty by parametrically manipulating target–distractor similarity, with both homogeneous distractors (Experiments 1 and 2) and heterogeneous distractors (Experiment 3) to answer two questions. First, do color singletons show a pattern of increasing capture costs with search difficulty, as already observed with abrupt onsets? As mentioned above, the attentional dwelling hypothesis asserts that RT-based capture effects reflect both (a) the probability of capture by the salient stimulus, and (b) the magnitude of the costs/benefits following capture. Accordingly, capture costs should scale up with search difficulty because, as search becomes more difficult, distractors look more like a target and thus take more time to reject as a potential target. This applies not only to rejecting the cued distractor, but also to rejecting every other distractor searched on the way to locating the target. Looked at the other way, the benefit of being captured to the target location on valid trials is much larger under difficult search, where a prolonged search can be avoided. We have previously shown this pattern of increasing capture costs with increasing search difficulty for the case of abrupt onsets (Gaspelin et al., 2016). So, at stake is whether the attentional dwelling hypothesis applies generally to all salient objects or reflects something peculiar about abrupt onsets.

Second, can color singletons capture spatial attention based purely on salience? As noted above, only with heterogeneous distractors can we be confident that capture effects reflect salience rather than task relevance (Bacon & Egeth, 1994). Heterogeneous distractors have typically yielded small capture effects. However, a key insight from Gaspelin et al. (2016) is that difficult search provides the most sensitive test of capture. Interestingly, a few studies that did employ relatively difficult searches showed hints of capture effects (e.g., Lamy & Tsal, 1999; Todd & Kramer, 1994; Yantis & Egeth, 1999). Lamy and Tsal (1999), for example, observed significant singleton costs (51 ms) at the largest set size (10 items) on target-absent trials, which had by far the longest average RT. However, the finding is ambiguous because target-absent trials might be qualitatively different from target-present trials; the authors themselves proposed that such effects merely reflected postperceptual processes.

Figure 1 illustrates predictions from two different accounts. Both accounts assume attentional dwelling after initial capture, but differ in whether color singletons capture attention based on salience (Fig. 1b) or not (Fig. 1a).

Fig. 1

Predictions for two competing accounts of color singleton capture, when combined with the attentional-dwelling account: a contingent capture. b salience-based capture

Contingent capture with attentional dwelling

Following Bacon and Egeth (1994), color singletons might have no inherent power to capture spatial attention, except when the distractors are homogeneous, promoting use of singleton-detection mode (see also Gaspelin et al., 2015, 2017; Gaspelin & Luck, 2018b). According to this account, capture effects from color singletons should increase with search difficulty when distractors are homogeneous (promoting singleton-detection mode). However, capture costs should be negligible with heterogeneous distractors (promoting feature-search mode) and should not increase much with search difficulty. This prediction is shown in Fig. 1a.

Salience-based capture with attentional dwelling

Following Theeuwes (1992, 2004, 2010), color singletons might generally capture attention based purely on salience. According to this account, capture costs should increase with search difficulty for both kinds of searches. However, we might expect a steeper slope with homogeneous distractors because they capture attention based on both salience and relevance (due to use of singleton-detection mode). This prediction is shown in Fig. 1b.

Other possible outcomes

In addition to the predictions shown in Fig. 1, there are many other possible outcomes. For example, difficult search might actually inhibit capture and thus cause capture effects to decrease with difficulty, rather than increase. As a concrete example of how this could happen, consider the attentional window account (Belopolsky, Zwaan, Theeuwes, & Kramer, 2007; Theeuwes, 2004, 2010). Belopolsky et al. (2007, p. 935) proposed that “in the case of a serial search task, the window does not encompass the whole display . . . the unique element is not included in the salience computations and does not capture attention.” Theeuwes (2004) suggested that even a modest search slope of 10+ ms could reflect at least partially serial search. We cannot draw firm predictions from this model, because we do not know whether especially difficult (i.e., time-consuming) searches necessarily involve serial search. However, if more difficult (inefficient) searches are serial, then this model would predict declining capture effects with search difficulty (see also Barras & Kerzel, 2017). This would yield the opposite pattern to that shown in Fig. 1a–b: a negative slope rather than a positive slope.

Previous studies

As far as we know, no previous study has systematically manipulated both search difficulty and distractor heterogeneity, so it is unclear which of the above predictions is correct. Barras and Kerzel (2017) compared two levels of search difficulty with homogeneous distractors and found that capture costs increased with difficulty (consistent with both Fig. 1a and b), but they did not study heterogeneous distractors.

Experiment 1: Homogeneous distractors

Our search difficulty manipulation (see Fig. 2) was modeled after Gaspelin et al. (2016, Experiment 7), in which participants searched for the perfect circle among ovals, with three levels of target–distractor similarity (see also Duncan & Humphreys, 1989). Once participants found the target circle, they reported whether the dot within that circle was located on the left or the right. Search difficulty was manipulated by parametrically varying the shape similarity between the distractor ovals and the target circle. Importantly, the three difficulty levels were mixed randomly within blocks, which ensured that the preparatory state (i.e., attentional set) was equivalent for each difficulty level.

Fig. 2

Examples of the search displays used in Experiment 1. Participants searched for the perfect circle (indicated by the white arrow, which was not visible to participants) and reported the location (left vs. right) of the small dot inside. The color singleton location was chosen randomly on each trial, as was the level of search difficulty

The paradigm shown in Fig. 2 resembles the additional singleton paradigm (e.g., Theeuwes, 2010) in that the color singleton appears within the target display itself. However, this paradigm also resembles the spatial cuing paradigm in that a color singleton appeared on every trial and could appear in any location, including the target location. The color singleton was nonpredictive of the target location. Participants were told that the color singletons “will usually point to the wrong location. Try to ignore them.” Our primary index of capture was the cue validity effect: RT on invalid trials minus RT on valid trials (Folk et al., 1992). Because the distractor ovals were homogeneous, we expected participants to employ singleton-detection mode to find the target circle (at least to some degree), causing the color singleton to capture attention. The key question is whether cue validity effects will increase with search difficulty (Gaspelin et al., 2016).

Method

Participants

Thirty University of New Mexico students participated for course credit. G*Power (Faul, Erdfelder, Lang, & Buchner, 2007) indicated that, for a paired-sample two-tailed t test, a sample size of 30 would provide adequate power (.8) to detect a 20-ms cue validity effect with an alpha level of .05. For this calculation, we used the within-subjects error variance obtained at the medium search difficulty level of the comparable experiment of Gaspelin et al. (2016, Experiment 7, which had an N of 29). No participants failed to meet our inclusion criterion of normal accuracy (>85% correct) and normal mean RT (not more than 2.5 standard deviations above the group mean). The mean age was 20.5 years; 22 were female, and eight were male. All participants in this experiment (and the subsequent ones) demonstrated normal color vision via an Ishihara color vision test and reported normal or corrected-to-normal visual acuity.

Apparatus and stimuli

Stimuli were presented using Psychtoolbox (Brainard, 1997) for MATLAB on 19-in. Dell M993 CRT monitors at an average viewing distance of 60 cm. The fixation display consisted of eight gray (RGB: 119, 119, 119) unfilled placeholder boxes 2.0° (width) × 2.0° (height) arranged along an imaginary circle (radius = 3.3°), centered on the screen center, plus a ninth “fixation” box in the screen center. The background color was black (RGB: 0, 0, 0). The target display consisted of eight colored stimuli within the eight peripheral placeholder boxes: seven distractor ovals and one target circle (diameter of 1.0°). Each stimulus contained a square black dot (0.05° by 0.05°) on either the left or right side. To achieve different levels of search difficulty, distractor ovals (elongated horizontally) varied in their similarity to the target shape (circular), as shown in Fig. 2. Specifically, on each trial distractor ovals had one of three shapes, labelled based on how easy it was to distinguish them from the target: easy (1.6° × 0.4°), medium (1.4° × 0.6°), or difficult (1.2° × 0.8°).

On every trial, seven of the eight target display elements had the same color, and one element (the color singleton cue) was uniquely colored. We used the colors pink (RGB: 186, 70, 187), green (RGB: 0, 135, 16), blue (RGB: 20, 115, 230), and orange (RGB: 194, 84, 19). On each trial, the colors of the singleton and its neighbors were chosen to maximize contrast within color space: green–pink, pink–green, blue–orange, or orange–blue. One of these color pairs was chosen at random for each trial.
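
The color assignment described above can be sketched in Python as follows. This is an illustration only, not the original Psychtoolbox code. The RGB values and the four pairings come from the text; the function name, the seeded random generator, and the reading of each pair as (singleton color, color of the other seven items) are our assumptions.

```python
import random

# RGB values as listed in the text.
COLORS = {
    "pink": (186, 70, 187),
    "green": (0, 135, 16),
    "blue": (20, 115, 230),
    "orange": (194, 84, 19),
}
# Assumed reading of each pair: (singleton color, color of the other items).
COLOR_PAIRS = [
    ("green", "pink"),
    ("pink", "green"),
    ("blue", "orange"),
    ("orange", "blue"),
]
N_LOCATIONS = 8


def make_display_colors(rng):
    """Return (color_names, singleton_index) for one eight-item display."""
    singleton_color, background_color = rng.choice(COLOR_PAIRS)
    singleton_index = rng.randrange(N_LOCATIONS)  # any location, incl. the target's
    names = [background_color] * N_LOCATIONS
    names[singleton_index] = singleton_color
    return names, singleton_index


rng = random.Random(0)  # seeded only so that the sketch is reproducible
names, singleton_index = make_display_colors(rng)
rgb_values = [COLORS[name] for name in names]
```

Because the singleton location is drawn independently of the target location, the singleton coincides with the target on one eighth of trials in expectation.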

Procedure

Participants were instructed to locate the target circle within the search display and report whether the dot within that target circle was on the left or right side by pressing the key labeled L or R (actual keys: Z and M, respectively). Participants were told to respond as quickly and accurately as possible, without making too many mistakes (more than 5%). The color singleton location was nonpredictive of target location and thus validly cued the target circle on only one out of every eight trials. Because stimulus locations and colors were random, participants needed to find the target circle based on its shape.

Each trial began with the fixation display for 1,000 ms followed by the search display until the participant responded. If no response was registered within 2,000 ms, a 200-Hz error beep sounded for 500 ms and a “too slow” message was displayed for 1,000 ms. If participants made an error, a 200-Hz error beep sounded for 300 ms.
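
The feedback rules above can be summarized in a short Python sketch. The durations and the beep frequency come from the text; the data structure, the function name, and the treatment of a response at exactly 2,000 ms as on time are hypothetical choices made for illustration. The sketch records durations only and makes no claim about whether the beep and the message overlapped in time.

```python
from dataclasses import dataclass
from typing import Optional

DEADLINE_MS = 2000  # response deadline stated in the text
BEEP_HZ = 200       # frequency of the error beep stated in the text


@dataclass(frozen=True)
class Feedback:
    beep_ms: int            # duration of the error beep (0 = no beep)
    message: Optional[str]  # on-screen message, if any
    message_ms: int         # duration of the message (0 = none)


def feedback_for(rt_ms: Optional[float], correct: bool) -> Feedback:
    """Map a trial outcome to feedback; rt_ms=None means no response."""
    if rt_ms is None or rt_ms > DEADLINE_MS:
        return Feedback(beep_ms=500, message="too slow", message_ms=1000)
    if not correct:
        return Feedback(beep_ms=300, message=None, message_ms=0)
    return Feedback(beep_ms=0, message=None, message_ms=0)
```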

Participants completed two blocks of practice trials (not analyzed) followed by 14 blocks of regular trials. Each block consisted of 48 trials, for a total of 672 regular trials. After each block, participants received feedback on mean RT and accuracy for that block. If accuracy within a block fell below 80%, a “low-accuracy” screen was displayed with instructions for the participant to notify the experimenter, who would then reinforce the task instructions.

Analysis

Exclusion criteria were adopted a priori from Gaspelin et al. (2016). Trials with an RT less than 200 ms or greater than 2,000 ms (1.6% of trials in Experiment 1; 1.3% in Experiment 2; 0.8% in Experiment 3; 0.7% in Experiment 4) were excluded. Additionally, trials with an error were excluded from RT analyses. For each experiment, we conducted a 3 × 2 within-subjects analysis of variance (ANOVA) on both mean RTs and error rates with the factors search difficulty (easy vs. medium vs. difficult) and cue validity (invalid vs. valid). ANOVAs were Greenhouse–Geisser corrected for potential violations of sphericity. We used ds for between-subjects comparisons and dz for within-subject comparisons (for the exact formulas, see Lakens, 2013).
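
A minimal Python sketch of these exclusion steps for a single participant is given below. The RT cutoffs come from the text. The dictionary keys and function name are hypothetical, and we assume that trials outside the RT window were dropped from both the RT and the error-rate analyses, which the text does not state explicitly.

```python
from collections import defaultdict
from statistics import mean

RT_MIN_MS, RT_MAX_MS = 200, 2000  # cutoffs stated in the text


def cell_means(trials):
    """Return {(difficulty, validity): (mean_rt_ms, error_rate)}.

    Each trial is a dict with hypothetical keys 'difficulty', 'validity',
    'rt' (in ms), and 'correct' (bool).
    """
    # Step 1: drop trials outside the RT window (assumed for both measures).
    kept = [t for t in trials if RT_MIN_MS <= t["rt"] <= RT_MAX_MS]
    cells = defaultdict(list)
    for t in kept:
        cells[(t["difficulty"], t["validity"])].append(t)
    summary = {}
    for key, cell_trials in cells.items():
        # Step 2: error trials count toward the error rate but not mean RT.
        correct_rts = [t["rt"] for t in cell_trials if t["correct"]]
        error_rate = 1 - len(correct_rts) / len(cell_trials)
        mean_rt = mean(correct_rts) if correct_rts else float("nan")
        summary[key] = (mean_rt, error_rate)
    return summary


# Made-up trials for one design cell.
demo = [
    {"difficulty": "easy", "validity": "valid", "rt": 150, "correct": True},   # too fast
    {"difficulty": "easy", "validity": "valid", "rt": 500, "correct": True},
    {"difficulty": "easy", "validity": "valid", "rt": 600, "correct": True},
    {"difficulty": "easy", "validity": "valid", "rt": 900, "correct": False},  # error
]
summary = cell_means(demo)
```

The resulting per-participant cell means would then serve as input to the 3 × 2 ANOVA.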

Results

Mean RTs are shown in Fig. 3a, and cue validity effects are shown in Fig. 3b. The error bars are within-subjects standard errors, calculated using the normalization method outlined by Cousineau (2005) and Morey (2008). The resulting mean error rates are provided in Table 1.
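
For readers unfamiliar with this normalization, the sketch below shows one common way to compute such error bars in Python. It is an illustration rather than the analysis code used here. The function name and data layout are hypothetical, and in a factorial design we assume that the number of conditions is the total number of within-subject cells (six in a 3 × 2 design).

```python
from math import sqrt
from statistics import mean, stdev


def within_subject_se(data):
    """Within-subject standard error for each condition.

    `data[p][c]` is participant p's mean in condition c (rows are
    participants, columns are conditions).
    """
    n_participants = len(data)
    n_conditions = len(data[0])
    grand_mean = mean(x for row in data for x in row)
    # Cousineau (2005): remove each participant's overall level.
    normalized = [[x - mean(row) + grand_mean for x in row] for row in data]
    # Morey (2008): correct the bias introduced by the normalization.
    correction = sqrt(n_conditions / (n_conditions - 1))
    return [
        correction * stdev([row[c] for row in normalized]) / sqrt(n_participants)
        for c in range(n_conditions)
    ]
```

When participants differ only in overall speed, with identical condition effects, the normalized scores coincide and the function returns zero for every condition. Between-participant variability in overall RT therefore does not inflate these error bars.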

Fig. 3

Results from Experiment 1: a Mean response time in milliseconds (ms) by search difficulty (easy vs. medium vs. difficult) and cue validity (invalid vs. valid). b Cue validity effects as a function of mean response time for each level of search difficulty. Error bars represent within-subject standard errors

Table 1. Error rates by search difficulty and cue validity for each experiment

RT analysis

The data confirm that, by parametrically varying target–distractor similarity, we successfully manipulated visual search difficulty across a very wide range. Mean RTs differed significantly between the easy (610 ms), medium (732 ms), and difficult conditions (1,041 ms), F(2, 58) = 701.96, p < .001, ηp2 = 0.960.
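
As a consistency check, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies this identity to the main effect of search difficulty reported above; the function and variable names are ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of search difficulty: F(2, 58) = 701.96.
difficulty_effect = partial_eta_squared(701.96, 2, 58)
```

Rounded to three decimals, the result matches the reported value of 0.960.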

One of the main assumptions when interpreting cue validity effects is that, when attention capture occurs, participants should respond faster when the cue is in the same position as the target (valid trial) than when the cue is in a different position (invalid trial). Consistent with capture by color singletons, overall mean RT differed significantly between the valid (761 ms) and invalid (828 ms) conditions, F(1, 29) = 38.04, p < .001, ηp2 = 0.567.

The attentional dwelling account predicts that, if color singletons capture attention, cue validity effects should increase with search difficulty. Indeed, there was a strong increase in cue validity effects from easy (18 ms) to medium (56 ms) to difficult search (128 ms), F(2, 58) = 21.91, p < .001, ηp2 = 0.430. The overall trend is remarkably linear with mean RT (see Fig. 3b), just as it was in Gaspelin et al. (2016). We also conducted preplanned t tests on the cue validity effects at each difficulty level. These revealed significant effects for easy, t(29) = 3.26, p = .003, d = .271, medium, t(29) = 4.99, p < .001, d = .614, and difficult search, t(29) = 5.80, p < .001, d = 1.081.
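
The linearity of this trend can be illustrated with the three condition means reported above. The Python sketch below computes the Pearson correlation and the least-squares slope relating cue validity effects to mean RT. It is a descriptive illustration based on group means only, not an inferential test, and the function and variable names are ours.

```python
from math import sqrt

# Condition means reported for Experiment 1 (easy, medium, difficult).
mean_rt_ms = [610, 732, 1041]
validity_effect_ms = [18, 56, 128]


def pearson_and_slope(xs, ys):
    """Return (Pearson r, least-squares slope of y on x)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy), sxy / sxx


r, slope = pearson_and_slope(mean_rt_ms, validity_effect_ms)
```

With these values the correlation is close to 1 and the slope is roughly 0.25, that is, about 25 ms of additional cue validity effect for every 100 ms of additional mean RT. With only three points, the correlation should be read as a description of the pattern in Fig. 3b rather than as a test of linearity.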

Error rate analysis

Consistent with the RT results, error rates differed significantly between easy (2.5%), medium (2.4%), and difficult (8.2%) visual searches, F(2, 58) = 28.30, p < .001, ηp2 = .494. Also, participants committed more errors on invalid trials (5.3%) than valid trials (3.4%), F(1, 29) = 13.54, p = .001, ηp2 = .318. The interaction of search difficulty and cue validity was nonsignificant, F(2, 58) = 2.23, p = .138, ηp2 = .071.

Discussion

Experiment 1 examined attentional capture by color singletons with homogeneous distractors. We parametrically varied target–distractor similarity to test competing predictions about how difficulty would influence cue validity effects (see Fig. 1). We found that cue validity effects increased dramatically with search difficulty, just as in Gaspelin et al.’s (2016) study of abrupt onsets, reaching a value of 128 ms at the highest difficulty level (see Fig. 3b). Thus, the large impact of search difficulty applies not only to abrupt onsets but also to color singletons. This finding replicates the pattern reported by Barras and Kerzel (2017) for color singletons, albeit over a narrower range of search difficulty. Our interpretation is that RT-based capture effects are modulated not only by the probability of capture but also by the costs of capture (i.e., by the time that it takes to reject a salient distractor item as a potential search target and locate the target; the attentional dwelling hypothesis).

Note that, because the distractors were homogeneous, they might have encouraged participants to adopt singleton-detection mode (Bacon & Egeth, 1994). This may have caused the salient color singleton to become task relevant, thereby capturing attention based on its relevance rather than its salience. Thus, the results of Experiment 1 are consistent both with contingent capture plus attentional dwelling (Fig. 1a) and with salience-based capture plus attentional dwelling (Fig. 1b). Experiment 3 will later test between these two accounts.

Experiment 2: Homogeneous distractors and spatial precues

Experiment 1 presented color singletons within the search display itself, as in the additional singleton paradigm (Bacon & Egeth, 1994; Gaspelin et al., 2015; Theeuwes, 1992). An advantage of this approach is that the short SOA of 0 ms between cue and target eliminates the opportunity for rapid disengagement from the cue prior to target onset (e.g., Theeuwes, 1994; but see also Anderson & Folk, 2012; Chen & Mordkoff, 2007). A disadvantage of this approach, however, is that the color singleton must compete for attention with the simultaneously presented target, a problem noted by von Mühlenen, Rempel, and Enns (2005). This observation regarding competition raises the possibility, noted by Barras and Kerzel (2017), that difficulty effects could reflect changes in the probability of capture rather than changes in the costs of capture (as assumed by attentional dwelling). As search difficulty decreased, the target might have become more salient (due to standing out better against highly dissimilar distractors), causing it to more often win the battle for attention with the color singleton.

Experiment 2 addressed the issues mentioned above by presenting the color singleton as a precue (see Fig. 4). Thus, there was an extended period of time (SOA = 150 ms) during which the singleton did not directly compete with the target for attention. Therefore, changes in search difficulty could no longer modulate the probability of capture. Instead, difficulty presumably modulated only the cost of capture via increased attentional dwell time on distractors during visual search.

Fig. 4

Example stimulus displays from Experiment 2, with the color singleton presented as a precue, prior to the target display. Participants again searched for the perfect circle (depicted in the 12 o’clock position) among oval distractors, all presented in gray. Search difficulty (easy, medium, or difficult) varied randomly from trial to trial

In Experiment 1, the target color changed randomly from trial to trial (blue, pink, orange, or green), which discourages participants from establishing a strong attentional set for or against any one specific color feature. By using precues in Experiment 2, we were able to hold constant the target color (now gray), thus allowing participants to establish an attentional set exclusively for gray. This might have made it easier for them to ignore singletons in other colors. However, if participants are using singleton-detection mode, then any color singleton would likely capture attention because of its task relevance. Note that previous studies finding no evidence of capture by color singletons (and sometimes even finding suppression) have typically used fixed target colors (see Gaspelin & Luck, 2018a; Graves & Egeth, 2015; Kerzel & Barras, 2016; Vatterott & Vecera, 2012). However, these studies have typically also used a fixed singleton color (which permits suppression of color singletons via suppression of that specific color feature), as well as inducing feature-search mode rather than singleton-detection mode (for a review, see Gaspelin & Luck, 2018c). In the present Experiment 2, the distractors were homogeneous in shape, just as in the present Experiment 1 (see Fig. 4).

Method

Participants

Thirty University of New Mexico students participated for course credit. One participant was replaced for failing to meet our accuracy criterion (>85%). Of the final sample of participants, 24 were female and six were male; their mean age was 20.4 years.

Apparatus and stimuli

The methods were identical to those of Experiment 1, with two exceptions. First, the color singleton was presented within a precue display by filling the eight boxes with colors. The same contrasting colors from Experiment 1 were used here as well (i.e., orange vs. blue or green vs. pink). This precue was displayed for 100 ms, followed by the fixation display for 50 ms. Thus, the SOA was 150 ms, which is a typical value in the precuing paradigm. Second, the target display was now set to gray (RGB: 119, 119, 119, the same as the placeholder boxes).

Results

Mean RTs are shown in Fig. 5a, and cue validity effects are shown in Fig. 5b. Error rates are provided in Table 1.

Fig. 5

a Mean response time in milliseconds (ms) by search difficulty (easy vs. medium vs. difficult) and cue validity (invalid vs. valid) in Experiment 2. b Cue validity effects as a function of mean response time for each level of search difficulty in Experiments 1 and 2. Error bars represent within-subject standard errors

RT analysis

The pattern of results was very similar to that of Experiment 1. RT again increased sharply with search difficulty. Mean RT differed significantly between the easy (608 ms), medium (703 ms), and difficult searches (1,038 ms), F(2, 58) = 1,138.44, p < .001, ηp2 = .975.

The results again indicated that attention capture occurred: Overall RT differed significantly between the valid (764 ms) and invalid (802 ms) conditions, F(1, 29) = 43.07, p < .001, ηp2 = .598. Furthermore, there was again a consistent increase in cue validity effects from easy (11 ms) to medium (31 ms) to difficult search (70 ms). The interaction between search difficulty and cue validity was significant, F(2, 58) = 10.74, p = .001, ηp2 = .270. Preplanned t tests revealed significant cue validity effects for easy, t(29) = 3.16, p = .004, d = .181, medium, t(29) = 5.13, p < .001, d = .367, and difficult visual searches, t(29) = 4.81, p < .001, d = .732.
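As a consistency check on the effect sizes reported above, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp2 = (F × df1) / (F × df1 + df2). The following is a minimal sketch of that arithmetic only; the function name is hypothetical and is not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    Hypothetical helper for illustration; not the authors' analysis code.
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df values positive")
    scaled_f = f_value * df_effect  # equals SS_effect / MS_error
    return scaled_f / (scaled_f + df_error)


# F ratios reported for the Experiment 2 response times
reported = {
    "difficulty": (1138.44, 2, 58),
    "validity": (43.07, 1, 29),
    "difficulty x validity": (10.74, 2, 58),
}
for label, (f_value, df1, df2) in reported.items():
    print(label, round(partial_eta_squared(f_value, df1, df2), 3))
```

Applied to the three F ratios above, the identity reproduces the reported values of .975, .598, and .270 to three decimal places.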

Error-rate analysis

The error-rate pattern was consistent with the RT data. Error rates differed significantly between easy (2.9%), medium (2.6%), and difficult (5.7%) trials, F(2, 58) = 5.42, p = .022, ηp2 = .158. Also, participants committed more errors on invalid trials (4.3%) than valid trials (3.2%), F(1, 29) = 9.43, p = .005, ηp2 = .245. The interaction of search difficulty and cue validity was significant, F(2, 58) = 7.75, p = .002, ηp2 = .211.

Discussion

Rather than placing the color singleton in the target display itself, as in Experiment 1, here we presented the color singleton as a precue, using the standard SOA of 150 ms (Folk et al., 1992; Lien et al., 2008). Cue validity effects again increased sharply as search difficulty increased. Although cue validity effects (70 ms) in the difficult search condition were reduced relative to Experiment 1 (128 ms), t(58) = 2.17, p < .05, d = .396, the overall pattern remained the same. Because the color singleton was presented as a precue in Experiment 2 (150 ms prior to the target display), these results cannot be explained in terms of modulating target–cue competition (Barras & Kerzel, 2017). Rather, we propose that the probability of capture was approximately constant across difficulty levels, but the impact of capture during visual search depended critically on search difficulty (i.e., how long attention dwelled on distractors).
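The logic of this proposal can be illustrated with a toy calculation. The sketch below assumes a strictly serial, self-terminating search in random order, a fixed dwell time per rejected distractor, and a capture probability that does not depend on search difficulty. These assumptions, along with all function names and parameter values, are illustrative only and were not fitted to the present data.

```python
def expected_rt(dwell_ms, p_capture, valid, n_items=8,
                base_ms=400.0, target_ms=100.0):
    """Expected RT (ms) under a toy serial, self-terminating search.

    All parameter values are hypothetical. dwell_ms is the time spent
    rejecting one distractor; it stands in for search difficulty.
    """
    n_distractors = n_items - 1
    # No capture: random search order, so on average half of the
    # distractors are inspected before the target is found.
    no_capture = base_ms + dwell_ms * n_distractors / 2 + target_ms
    if valid:
        # Capture at the target location: the target is inspected first.
        captured = base_ms + target_ms
    else:
        # Capture at a distractor: reject it, then search the remaining
        # items (n_items - 2 distractors plus the target) in random order.
        captured = (base_ms + dwell_ms
                    + dwell_ms * (n_items - 2) / 2 + target_ms)
    return p_capture * captured + (1 - p_capture) * no_capture


def cue_validity_effect(dwell_ms, p_capture, n_items=8):
    """Invalid minus valid expected RT for a given dwell time."""
    invalid = expected_rt(dwell_ms, p_capture, valid=False, n_items=n_items)
    valid = expected_rt(dwell_ms, p_capture, valid=True, n_items=n_items)
    return invalid - valid


# Same capture probability, three hypothetical dwell times
for dwell in (10, 20, 40):
    print(dwell, cue_validity_effect(dwell, p_capture=0.5))
```

Under these assumptions the cue validity effect equals p_capture × dwell_ms × n_items / 2, so it scales linearly with dwell time even though the probability of capture never changes.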

Experiment 3: Heterogeneous distractors and spatial precues

Because the homogeneous distractors of Experiments 1 and 2 potentially encouraged singleton-detection mode, the observed capture effects could be caused by either contingent capture (Fig. 1a) or salience-based capture (Fig. 1b). To test between these accounts, we need to prevent singleton-detection mode by using heterogeneous distractor shapes within the search displays. This was the goal of Experiment 3.

Whereas difficulty level was fixed for all distractors within a search display in Experiments 1 and 2, here, each search display contained at least two distractors from each difficulty level. Additionally, we rotated half of the distractor ovals 90 degrees (i.e., vertical rather than horizontal), as shown in Fig. 6. This heterogeneity increased overall search difficulty. More importantly, it ensured that the target circle was no longer a shape singleton, which made singleton-detection mode a very impractical search strategy. According to the contingent capture account, therefore, capture effects should disappear (see Fig. 1a).

Fig. 6

Example event sequence in Experiment 3. All methods were identical to Experiment 2, except that the distractors were heterogeneous, discouraging use of singleton-detection mode. Participants again searched for the perfect circle, indicated here by the red arrow (arrow not visible to participants). There were three different types of distractors within each search display (easy, medium, and difficult to reject). The difficulty level on a given invalid trial now refers only to the difficulty of the precued distractor (easy, in the example shown). (Color figure online)

As in Experiments 1 and 2, we manipulated search difficulty in Experiment 3 via target–distractor similarity, though in a much subtler manner. Each search display contained a target and seven distractors. Thus, there was always one extra distractor from one of the difficulty levels, which (on invalid trials) was always the distractor that appeared at the location cued by the color singleton. The attentional dwelling account specifically predicts that if the color singletons still capture attention, the difficulty of the cued distractor should modulate RT. The reason is that, when capture occurs, the cued item will always be attended and rejected, whereas the other distractors are searched only some of the time.

Method

Participants

Thirty University of New Mexico students participated for course credit. One was replaced due to high error rate. In the final sample, 17 were female and 13 were male; their mean age was 20.1 years.

Apparatus and stimuli

The methods were identical to those of Experiment 2, with two exceptions. First, half of the distractors were vertically skewed (instead of horizontally skewed). Second, the three distractor difficulty levels were intermixed within the search display. On trials labelled as “easy,” there were three easy distractors, two medium distractors, and two difficult distractors. “Medium” trials contained two easy distractors, three medium distractors, and two difficult distractors. “Difficult” trials contained two easy distractors, two medium distractors, and three difficult distractors. On invalid trials, the extra (i.e., third) distractor was always placed at the location previously occupied by the color singleton precue.
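The display composition described above can be sketched as follows. This is an illustrative reconstruction, not the original experiment code: the function name and the location indexing (0 to 7) are hypothetical, distractor orientation is omitted, and the placement of the extra distractor at a random location on valid trials is an assumption, because the text specifies its placement only for invalid trials.

```python
import random
from collections import Counter

LEVELS = ("easy", "medium", "difficult")


def make_display(extra_level, cue_loc, target_loc, n_locations=8, rng=None):
    """Return {location: item} for one heterogeneous search display.

    Hypothetical sketch: two distractors per difficulty level plus one
    extra distractor of `extra_level` (the trial's labelled difficulty).
    """
    if extra_level not in LEVELS:
        raise ValueError(f"unknown difficulty level: {extra_level}")
    rng = rng or random.Random()
    display = {target_loc: "target"}
    free = [loc for loc in range(n_locations) if loc != target_loc]
    distractors = [level for level in LEVELS for _ in range(2)]
    if cue_loc != target_loc:
        # Invalid trial: the extra distractor occupies the cued location.
        display[cue_loc] = extra_level
        free.remove(cue_loc)
    else:
        # Valid trial: extra distractor goes to a random location (assumed).
        distractors.append(extra_level)
    rng.shuffle(distractors)
    for loc, level in zip(free, distractors):
        display[loc] = level
    return display


# Example: an invalid "easy" trial with the cue at location 3
example = make_display("easy", cue_loc=3, target_loc=0, rng=random.Random(1))
print(example[3], Counter(example.values()))
```

Whatever the random seed, an "easy" trial built this way contains three easy, two medium, and two difficult distractors, and on invalid trials the cued location holds an easy distractor.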

Results

Mean RTs are shown in Fig. 7a and cue validity effects are shown in Fig. 7b. Error rates are provided in Table 1.

Fig. 7

a Mean response time in milliseconds (ms) by search difficulty (easy vs. medium vs. difficult) and cue validity (invalid vs. valid) in Experiment 3. b Cue validity effects as a function of mean response time for each level of search difficulty in Experiments 1–3. Error bars represent within-subject standard errors

RT analysis

Mean RT once again increased with search difficulty, although the effect was less dramatic because (as explained above) the difficulty manipulation was far more subtle, affecting only one of the seven heterogeneous distractors: mean RT was 877 ms in the easy condition, 891 ms in the medium condition, and 946 ms in the difficult condition, F(2, 58) = 59.53, p < .001, ηp2 = .672.

Unlike the previous experiments, mean RT did not differ significantly between the valid (901 ms) and invalid (908 ms) conditions, F(1, 29) = .834, p = .369, ηp2 = .028, consistent with little or no capture of spatial attention. Furthermore, cue validity effects did not vary significantly between the easy (9 ms), medium (10 ms), and difficult (1 ms) searches, F(2, 58) = .42, p = .658, ηp2 = .014. Preplanned t tests for each search difficulty level revealed nonsignificant cue validity effects for easy, t(29) = 0.83, p = .411, d = .084, medium, t(29) = 1.12, p = .271, d = .136, and difficult trials, t(29) = 0.06, p = .954, d = .007.

A JZS Bayes factor ANOVA (Love et al., 2015; Morey & Rouder, 2015; Rouder, Morey, Speckman, & Province, 2012) with default prior scales revealed that a model with only the difficulty effect was preferred over a model with both difficulty and validity by a Bayes factor of 3.38. The data therefore provide substantial evidence against the hypothesis that cue validity has an effect. The model with only the difficulty effect was preferred over the model with a difficulty effect, a validity effect, and a Validity × Difficulty interaction by a Bayes factor of 26.2.

Error-rate analysis

Error rates differed significantly between easy (4.2%), medium (5.4%), and difficult searches (5.8%), F(2, 58) = 3.51, p = .050, ηp2 = .108. Consistent with the RT data, there was little difference in errors between invalid trials (4.9%) and valid trials (5.3%); the nonsignificant trend was opposite in direction to that predicted by capture, F(1, 29) = 1.50, p = .231, ηp2 = .049. The interaction of search difficulty and cue validity was nonsignificant, F(2, 58) = .57, p = .555, ηp2 = .019.

Discussion

Experiment 3 was specifically designed to prevent singleton-detection mode by making the distractors heterogeneous within each search display (see Fig. 6). Accordingly, salience-based accounts predict capture (Fig. 1b), but the contingent capture account (Bacon & Egeth, 1994; Folk et al., 1992) does not (Fig. 1a). Confirming the contingent-capture prediction, capture effects were negligible, even when the precued item was a difficult-to-reject distractor (cue validity effect of 1 ms; see Fig. 7).

Some previous authors have found that cue validity effects were greatest early in the experiment, or early within blocks, then declined (e.g., Vatterott & Vecera, 2012). We found no evidence of such a trend in Experiment 3. The cue validity effects (averaging across difficulty levels) for the first, second, third, and fourth parts of the session were −22, 12, 14, and 7 ms, respectively. Similarly, the cue validity effects for the first, second, third, and fourth segments within each block of Experiment 3 were 7, 12, −1, and 13 ms, respectively. Note that we chose our singleton and background colors at random from a set of four possibilities (green–pink, pink–green, blue–orange, and orange–blue). We might have observed more evidence of learning across or within blocks had we fixed the colors, as in many previous studies (e.g., Gaspelin, Gaspar, & Luck, 2019, Experiment 3; Gaspelin & Luck, 2018a, Experiment 4; Graves & Egeth, 2015; Kerzel & Barras, 2016; Vatterott & Vecera, 2012; see Gaspelin & Luck, 2018c, for a review).

The contrast between this lack of capture effects with heterogeneous distractors and the strong capture effects obtained with homogeneous distractors in Experiment 2 is striking; despite using identical precue displays and identical targets, the cue validity effect for the most difficult searches shrank from 70 ms to 1 ms. We conclude that the only reason we obtained evidence of capture in Experiments 1 and 2 is that the homogeneous displays encouraged singleton-detection mode, making the color singletons task relevant (Bacon & Egeth, 1994).

Experiment 4

We have argued here that color singletons capture attention based on relevance (i.e., only with homogeneous distractors that encourage singleton-detection mode) rather than on pure salience. Meanwhile, we have previously argued (Gaspelin et al., 2016) that abrupt onsets capture attention based purely on salience. However, the experiment from that study with the largest difficulty effect (Experiment 7) actually used a variant of the homogeneous search displays employed here in Experiments 1 and 2. The proposed dissociation between color singletons and onsets would be much more compelling if we could demonstrate that abrupt onsets do produce capture effects with the exact same heterogeneous distractor displays that yielded little or no capture effects from color singletons.

The present experiment therefore replicated Experiment 3, but with abrupt onset precues rather than color singleton precues (see Fig. 8). Our goal was to choose a precue that was very dissimilar from the target, so that it would not be task relevant. Whereas the target was a small, filled, gray circle, the onset precue was a large, unfilled, purple diamond. If abrupt onsets can capture spatial attention based purely on salience, then capture effects from onsets should remain robust despite distractor heterogeneity.

Fig. 8

Example event sequence in Experiment 4. Participants searched for the perfect circle, indicated here by the red arrow (arrow not visible to participants). The precue frame contained an unfilled purple diamond, which was the only abrupt onset within the precue display. The difficulty level on a given invalid trial now refers only to the difficulty of the precued distractor (easy, in the example shown). (Color figure online)

Method

Participants

Thirty University of New Mexico students participated for course credit. Three were replaced due to accuracy below our criterion of 85% correct. Of the final sample, 19 were female and 11 were male; their mean age was 23.1 years.

Apparatus and stimuli

The methods were identical to those of Experiment 3, except for the precue. Specifically, a purple (RGB: 255, 0, 255) diamond was used as an abrupt onset precue instead of the color singleton precue (see Fig. 8). The color (purple) and shape (diamond) of the precue were chosen to ensure that the precue would be maximally dissimilar from the target (a gray circle) and therefore any capture would presumably be due to salience rather than relevance. The diamond precue (2.25° × 2.25°) was presented for 50 ms followed by a 100-ms display of the empty frames.

Results

Mean RTs are shown in Fig. 9a and cue validity effects are shown in Fig. 9b. Error rates are provided in Table 1.

Fig. 9

a Mean response time in milliseconds (ms) by search difficulty (easy vs. medium vs. difficult) and cue validity (invalid vs. valid) for Experiment 4. b Cue validity effects as a function of mean response time for each level of search difficulty in Experiments 3 and 4 (both of which used heterogeneous distractor displays). Error bars represent within-subject standard errors

RT analysis

Once again, the data confirmed that our manipulation of search difficulty was successful. RT varied significantly between the easy (840 ms), medium (870 ms), and difficult searches (905 ms), F(2, 58) = 42.09, p < .001, ηp2 = .592.

Mean RTs also differed significantly between the valid (813 ms) and invalid (930 ms) conditions, F(1, 29) = 200.12, p < .001, ηp2 = .873, suggesting that attention was captured by the abrupt onset precues. This cue validity effect was largest for difficult searches (148 ms), as expected, but was similar for easy (105 ms) and medium searches (95 ms); the interaction between search difficulty and cue validity was statistically significant, F(2, 58) = 9.52, p < .001, ηp2 = .247. Preplanned t tests for each difficulty level revealed significant cue validity effects for easy, t(29) = 8.63, p < .001, d = .84, medium, t(29) = 9.25, p < .001, d = .75, and difficult searches, t(29) = 12.32, p < .001, d = 1.24.

Error-rate analysis

The difference in error rate for easy (2.7%), medium (2.9%), and difficult (2.7%) search was nonsignificant, F(2, 58) = .316, p = .728, ηp2 = .011. Furthermore, there was a nonsignificant difference between invalid trials (3.1%) and valid trials (2.4%), F(1, 29) = 2.66, p = .114, ηp2 = .084. The interaction between search difficulty and cue validity was significant, F(2, 58) = 4.01, p = .025, ηp2 = .121.

Discussion

The large cue validity effects obtained from abrupt onsets in Experiment 4 stand in stark contrast with the negligible cue validity effects obtained from color singletons in Experiment 3, using the exact same search displays. The straightforward conclusion is that whereas abrupt onsets capture attention based on salience, color singletons do not. The contrasting findings between abrupt onsets and color singletons support previous conclusions that onsets are more powerful attractors of spatial attention than are color singletons (Franconeri & Simons, 2003; Jonides & Yantis, 1988).

Experiment 4 afforded an additional test of the attentional dwelling account. This account predicts that the time to reject the precued item on invalid trials (i.e., the difficulty level) should strongly influence RT (see also Lamy, Darnell, Levi, & Bublil, 2018). We selectively manipulated how difficult it was to reject the distractor appearing at the precued location (easy vs. medium vs. difficult), whereas the other six noncued distractors always contained the same mixture of two easy, two medium, and two difficult distractors. The data confirmed the predicted difficulty effect on invalid trials and showed that it was greater than the difficulty effect for the identical display configuration on valid trials (when attention was not routinely directed to that manipulated distractor), producing a significant Difficulty × Validity interaction (p < .001; see also Lamy et al., 2018). Note that, in Experiment 3, in which color singletons failed to capture attention, there was not even a trend toward an interaction between difficulty and validity (p = .7).

General discussion

The goal of the present study was to jointly determine (a) whether the impact of search difficulty on attentional capture effects (which supports the attentional dwelling account) is specific to abrupt onsets or applies generally (to any stimulus that captures attention) and (b) whether color-singleton capture is based on relevance (i.e., contingent capture; see Fig. 1a) or salience (Fig. 1b). To do so, we manipulated both target–distractor similarity and distractor heterogeneity.

In Experiment 1, participants searched for a target circle among homogeneous distractor ovals (perhaps encouraging participants to adopt singleton-detection mode). We manipulated search difficulty trial by trial by varying target–distractor similarity across three widely spaced levels. We embedded the color singleton within the search display itself on every trial (see Fig. 2). Even though search difficulty level was unknowable during the precue display, cue validity effects increased strongly with search difficulty. This key finding supports the generality of the attentional dwelling hypothesis (Gaspelin et al., 2016). When target–distractor similarity is high (difficult search), attention dwells longer on the distractors while rejecting them as possible targets, greatly amplifying capture costs on invalid trials as well as capture benefits on valid trials.

Experiment 2 replicated this finding when the color singleton was presented as a precue (i.e., in the spatial cuing paradigm), 150 ms prior to the target display (see Fig. 4), rather than being embedded within the target display (as in Experiment 1). In both experiments, the increase in cue validity effects was remarkably linear with mean RT, as it was in Gaspelin et al. (2016). The cue validity effects were also very large, reaching 128 ms in Experiment 1 and 70 ms in Experiment 2.

The above findings with homogeneous distractors are consistent both with contingent capture—assuming the use of singleton-detection mode—combined with attentional dwelling (see Fig. 1a) and also with salience-based capture combined with attentional dwelling (see Fig. 1b). However, these accounts make divergent predictions for heterogeneous distractor displays (see also Bacon & Egeth, 1994; Gaspelin et al., 2015). The salience-based account predicts that capture effects should remain large, but the contingent capture account predicts that they should disappear. Experiment 3 verified the latter prediction. Cue validity effects were large in Experiments 1 and 2 (with homogeneous distractors), but plummeted to only 1 ms in the difficult search condition of Experiment 3 (with heterogeneous distractors). In other words, we found negligible capture effects even under conditions (i.e., relatively difficult search) that are highly sensitive to capture. The straightforward conclusion is that color singletons cannot generally capture attention based purely on salience, but the common practice of using homogeneous distractors promotes at least occasional use of singleton-detection mode, making the color singletons task relevant (Bacon & Egeth, 1994; but see also Lamy et al., 2006). This conclusion fits with several previous reports that task-irrelevant color singletons cannot capture attention, but instead are suppressed (for a review, see Gaspelin & Luck, 2018c).

To put the lack of capture effects by color singletons in Experiment 3 into context, Experiment 4 compared it against the capture effect produced by abrupt onsets, using the exact same heterogeneous displays. Here, cue validity effects reemerged. The difference in capture effects between abrupt onsets and color singletons under the most difficult search condition is remarkable: 148 versus 1 ms, t(58) = 8.98, p < .001, d = 2.32. Note that the 148-ms capture effect by abrupt onsets reported here, with heterogeneous distractors, is very similar to the 141-ms capture effect reported by Gaspelin et al. (2016) with homogeneous distractors. The apparent insensitivity to distractor heterogeneity for abrupt onsets presumably occurs because they capture spatial attention based on pure salience. The proposed dissociation between abrupt onsets and color singletons is consistent with several previous reports, supporting the theory that behaviorally urgent events receive attentional priority (Franconeri & Simons, 2003; see also Jonides & Yantis, 1988).
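For readers who wish to check the between-experiment effect size, one common conversion from an independent-samples t statistic to Cohen's d is d = t × sqrt(1/n1 + 1/n2). The sketch below assumes that this conversion applies here, with 30 participants per experiment; the function name is hypothetical.

```python
import math


def cohens_d_from_t(t_value, n1, n2):
    """Convert an independent-samples t statistic to Cohen's d.

    Assumes equal-variance (pooled) t test; hypothetical helper for
    illustration, not the authors' analysis code.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    return t_value * math.sqrt(1.0 / n1 + 1.0 / n2)


# Experiment 4 versus Experiment 3, difficult search: t(58) = 8.98
print(round(cohens_d_from_t(8.98, 30, 30), 2))
```

With t = 8.98 and two groups of 30, this conversion gives d = 2.32, matching the value reported above.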

The attentional window account

To explain why capture effects tend to disappear with heterogeneous distractors (see the present Experiment 3), Theeuwes (2004, 2010) has proposed the attentional window account. He argued that the key determinant of capture is not the use of singleton-detection mode, but rather whether search is parallel or serial. Heterogeneous distractors could promote serial search, in which case the color singleton might fall outside the so-called attentional window and thus fail to capture attention. Consistent with this attentional window account, capture effects have been found to disappear when a salient cue appears in a region of space that participants are selectively ignoring (Ruthruff & Gaspelin, 2018; Yantis & Jonides, 1990). However, it is questionable whether serial search would necessarily involve filtering out all locations outside of the one being searched; instead, attention might spread diffusely over the search display to locate the most promising candidate to be searched next.

Even if the attentional window model were correct about serial searches preventing capture, it is not clear that this would apply to the present Experiment 3, given that the precue display was presented prior to the start of serial search for the target. If it did apply, one possible prediction from this account is that capture effects would decline as search difficulty increases, on the assumption that search is increasingly likely to be serial and thus to prevent capture. With homogeneous distractors (Experiments 1 and 2), the exact opposite was found: Cue validity effects only increased with search difficulty, reaching very large values (see Fig. 7b; see also Barras & Kerzel, 2017). With heterogeneous distractors (Experiment 3), meanwhile, cue validity effects were negligible even though the search was noticeably easier (if mean RT is any indication) than it was in the difficult search condition with homogeneous distractors. Nevertheless, it has proven difficult to conclusively determine whether any given search is truly parallel or serial, which makes the model difficult to test. The attentional window model could assert that our difficult homogeneous searches were actually parallel searches despite being even slower than the heterogeneous searches labelled as serial.

Shedding new light on old findings

The present findings shed new light on a few issues in the color-singleton literature. When a researcher observes greater RT-based capture effects in one condition compared with another, it is commonly assumed that capture was greater (i.e., stronger or more frequent). In many cases, this is probably the correct interpretation. For example, Matusz and Eimer (2011) found that a tone increased the cue validity effects from a simultaneous color singleton. Because the tone actually decreased overall RT, one could not easily argue that it increased search difficulty and increased capture costs. Instead, the boost likely reflects increased probability of capture.

However, not all increases in RT-based capture effects are due to increases in the probability of capture. The present study demonstrates that capture costs can vary sharply even when the probability of capture is presumably held constant. Note that, in the present experiments, the difficulty of the upcoming visual search was unknown at the time of the precue display. Therefore, the probability of capture at the time of the precue must have been constant across difficulty levels, despite the great differences in cue validity effects.

As a concrete example, several studies (e.g., Lamy et al., 2006) have reported that capture effects increased with display set size and attributed this effect to increased cue salience (see also Barras & Kerzel, 2017). Although that interpretation is certainly plausible, increasing capture costs with increasing search difficulty (i.e., attentional dwelling) offers an attractive alternative explanation. Note, meanwhile, that increased salience cannot explain the present findings, because there is no obvious reason why target–distractor similarity would strongly influence the salience of the color singleton. This is especially true in our spatial cuing experiments (Experiments 2 and 3), given that the precue displays were actually identical across the three levels of search difficulty.

Selection history

Several authors have pointed out that capture is determined not only by current task goals and salience but also by selection history (e.g., Anderson, 2016; Awh, Belopolsky, & Theeuwes, 2012; Becker, 2007; Gaspelin et al., 2019; Graves & Egeth, 2015; Kerzel & Barras, 2016). For example, even though target locations are randomly chosen on each trial, the previous target location appears to be searched preferentially. We found the same pattern here. In Experiment 1, for example, the RT speedup for repeated versus nonrepeated target locations was 50, 85, and 204 ms in the easy, medium, and difficult search conditions, respectively (see Table 2). The other experiments showed similarly large effects. The sharp increase in these target location repetition effects with search difficulty mirrors the increases in cue validity effects, as would be expected (the costs and benefits of altering the search process should scale with the overall search time). Note that the cue validity effects reported earlier remain more-or-less unchanged after removing all target location repetitions (one-eighth of trials); so, it is certainly not the case that color singleton cues captured attention only when appearing in the same location as the previous target.
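The two intertrial computations described above (the speedup for repeated target locations, and the removal of repetition trials before recomputing cue validity effects) can be sketched as follows. The data format, function names, and example values are hypothetical; the sketch also assumes that the first trial of a sequence counts as a nonrepetition and ignores exclusions such as error trials.

```python
from statistics import mean


def repetition_speedup(trials):
    """Mean RT on nonrepeated minus repeated target-location trials.

    `trials` is a list of (target_location, rt_ms) tuples in presentation
    order (hypothetical format). The first trial has no predecessor and
    is left out of both averages.
    """
    repeated, nonrepeated = [], []
    for (prev_loc, _), (loc, rt) in zip(trials, trials[1:]):
        (repeated if loc == prev_loc else nonrepeated).append(rt)
    return mean(nonrepeated) - mean(repeated)


def drop_location_repetitions(trials):
    """Keep only trials whose target location differs from the previous one."""
    kept = trials[:1]  # first trial cannot be a repetition (assumption)
    for prev, cur in zip(trials, trials[1:]):
        if cur[0] != prev[0]:
            kept.append(cur)
    return kept


# Made-up example sequence, not data from the experiments
example = [(0, 700), (0, 600), (3, 720), (5, 700), (5, 610)]
print(repetition_speedup(example))
print(drop_location_repetitions(example))
```

With eight equally likely target locations, the second function would be expected to discard roughly one-eighth of trials, in line with the proportion noted above.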

Table 2. Speedup in response time (in milliseconds) due to repeating the target location from one trial to the next

We also observed effects of intertrial relationships for color, even though color was always task irrelevant. To analyze these data, we first eliminated the target location repetitions mentioned above. The results, averaged across difficulty levels, are shown in Table 3. In Experiment 1, the color singleton appeared within the target display itself, so that the selected target always had a color. Here, cue validity effects (averaged across difficulty levels) were much greater (p < .01) when the color singleton matched the previous target color (89 ms) than when it matched the previous singleton color (31 ms). These data support previous suggestions that even irrelevant features of the selected target become primed, and can subsequently attract attention. Note that even when there was no overlap in colors between consecutive trials (e.g., pink–green then blue–orange), the cue validity effects were still substantial (60 ms; p < .001).

Table 3. Cue validity effect (in milliseconds) for each experiment as a function of the relationship between the colors of the singleton and the background objects on the current trial relative to the previous trial

In Experiments 2 and 3, in contrast, we presented color singletons within a precue display and the search display was always gray, so participants never actually selected a colored target. Both of these experiments failed to yield a significant effect of intertrial color relationship. This finding reinforces the conclusion that it is target selection, specifically, that produces attentional biases, not merely whether the singleton color repeats or switches.

Capture versus priority accumulation

Researchers routinely assume that salient cues produce capture costs because they trigger a shift in spatial attention (Folk et al., 1992; Gaspelin et al., 2016; Lien et al., 2008). A recently proposed alternative, however, is that cues merely increase the priority weighting at the cued location, without necessarily triggering a shift in attention to the cue (Lamy et al., 2018). When the cue and target are in the same location (i.e., valid trials), the priority boost from the cue helps the target to recruit spatial attention especially fast. But when the cue and target are in different locations (invalid trials), the boosted priority at the cued location could create strong competition with the target, slowing the decision about where to shift attention. This account can potentially explain the sharp increase in capture costs with search difficulty reported by Gaspelin et al. (2016). In easy searches with low target–distractor similarity, the distractors have much lower priority than the target, and so the priority boost from the cue might be insufficient to create strong competition with a target. But, in difficult searches, the distractors and the target have roughly similar priority, so the boost from a salient cue could create very strong competition. Thus, one thing this priority accumulation account has in common with attentional dwelling is that they both attribute greater cue validity effects with greater search difficulty to the costs encountered during difficult visual search, and not to changes in the probability of capture.

The priority accumulation account (Lamy et al., 2018), if true, is important because it would require a radical reinterpretation of nearly all previous studies showing capture effects (e.g., Folk et al., 1992; Folk & Remington, 1998; Gaspelin et al., 2016; Lien et al., 2008). Effects previously assumed to index attention capture would not actually indicate attention capture. The priority accumulation account does not challenge the present main conclusion that color singletons cannot capture attention when not task relevant. Nor does it challenge our claim that search difficulty is a critical factor in capture experiments. But it does challenge our conclusion that color singletons can capture attention under singleton-detection mode and our conclusion that onsets capture attention generally. Attentional dwelling following capture and priority accumulation are both plausible mechanisms and both fit much of the data; in fact, there is no obvious reason why they could not both contribute to capture costs. Determining which predominates will be an important goal for future research.

Concluding remarks

The present data support the following conclusions. First, the dramatic linear increase in capture effects with increasing search difficulty occurs not only for abrupt onsets but also for color singletons (at least, under singleton-detection mode, when color singletons become task relevant), suggesting the generality of the attentional dwelling account. Capture effects depend not only on the probability of capture by a cue, but also on the costs of capture incurred during visual search. We also supported a specific prediction of that account, which is that RT should increase with the rejection difficulty of the cued distractor (holding constant the difficulty of other display items). These findings further reinforce the importance of using difficult visual searches to maximize sensitivity to attentional capture. Relatedly, understanding this point helps one to make sense of the capture literature, which would otherwise consist of a perplexing mix of large effects and null effects (see Gaspelin et al., 2016).

Second, whereas abrupt onsets (and other behaviorally urgent stimuli) can capture attention based purely on salience, we find that static color singletons cannot. Using identical search displays, we found a 148-ms capture effect following an abrupt onset precue, but only a 1-ms capture effect following a color-singleton precue. Whereas many previous color singleton studies have used relatively easy searches that might have been insensitive to capture, we demonstrated that capture effects by color singletons fail to emerge even under a difficult visual search. Furthermore, we provided evidence that many previous reports of capture by color singletons are due to the use of homogeneous distractors, promoting singleton-detection mode and causing capture based on relevance rather than salience.