The human visual system is commonly faced with situations in which it needs to monitor the surrounding environment for relevant stimuli while ignoring stimuli that are known to be irrelevant. In situations where there are several potentially relevant (target) stimuli, there often are several ways in which target information can be represented by the system. Targets could be represented at the level of their individual features, requiring a distinct template for each target. For example, in a search for targets that could be red or green against a background of white nontargets, the attentional system could simultaneously maintain two separate target templates: one for red and one for green (Irons, Folk, & Remington, 2012). Alternatively, targets could be represented at the level of their superordinate similarities, requiring only one template to represent all targets in a given feature dimension (e.g., colour). Continuing the previous example of a search for targets that could be red or green among white nontargets, rather than distinct sets for red and green, the system could instead maintain a single template for coloured items without specifying individual colours (Folk & Anderson, 2010) or could simply maintain a template for any discrepant item that appears, regardless of its feature dimension (Bacon & Egeth, 1994). Because it would be advantageous for attention to be flexible in selection criteria, it seems reasonable to suppose that we could adopt any of these attentional modes depending on the demands of the context. However, whereas there is ample evidence for attentional capture by feature values or discrepant feature singletons, the evidence for attention being influenced at the level of the feature dimension comes from visual search (Found & Müller, 1996; Müller, Heller, & Ziegler, 1995), and its role in attentional capture is open to alternative interpretations. The present experiments explore the flexibility and specificity of attentional control, specifically examining the degree to which attentional capture can operate on target templates representing stimulus dimensions (colour, transients) compared with specific instances of a stimulus class (red, green) or to a template for discrepant items more generally.

Attentional capture

Early work on the guidance of visual attention suggested that exogenous attention (attention guided involuntarily by external stimuli) was deployed only to stimuli that held certain properties, such as new objects appearing in a display (onset stimuli; Jonides & Yantis, 1988), or items with high feature contrast to their surroundings (Theeuwes, 1991, 1992). Importantly, these accounts argued that the properties that “capture” attention (Yantis & Jonides, 1984) do so in a bottom-up, stimulus-driven manner that is hardwired into the attentional system. The key prediction of these bottom-up accounts is that these stimuli capture attention irrespective of the momentary goals and intentions of the observer.

Folk, Remington, and Johnston (1992) challenged these accounts by demonstrating that involuntary attentional capture still depends on the current goals of the observer. In their experiments, one group of participants reported the identity of an onset target, while others reported the identity of a target defined by having a particular colour. Just prior to presenting these targets, they presented an irrelevant cue display that either contained an onset or a colour cue, which participants were instructed to ignore. They compared the response times (RTs) to the target when it appeared at the same location as the cue (valid trials) with when the target appeared at a different location to the cue (invalid trials), reasoning that if the cue captured attention to its location participants should be faster to respond on valid than on invalid trials. This holds because, on invalid trials, attention would need to be reallocated from the location of the cue to the location of the target before the target could be processed, and this reallocation is a time-consuming process. The difference between reaction times on valid and invalid trials is termed a cueing effect and is the standard measure of attentional capture in this paradigm (Posner, 1980). The results of Folk, Remington, and Johnston (1992) showed that onset cues produced significant cueing effects during search for an onset target, but not during search for a colour target. In contrast, colour cues produced cueing effects only in search for a colour target. Thus, attentional capture is not driven solely by bottom-up processes but rather is sensitive to the top-down requirements of the task being performed.

Folk and Remington (1998) extended these findings, showing that attention also can be set to respond to specific feature values. They found that in search for a red target, red cues will capture attention but green cues will not, and vice versa. The experiments of Folk and colleagues strongly suggest that, when searching for a target with a specific colour, observers (1) adopt a top-down attentional control setting for the specific colour (a.k.a. an attentional set), (2) irrelevant objects that share defining features of the target can automatically attract attention, and (3) salient objects do not automatically capture attention if they do not match the attentional set.

Beyond attentional sets for particular features, much work has been devoted to examining the flexibility and specificity of attentional control settings. For example, Bacon and Egeth (1994) showed that observers can adopt attentional sets of differing specificity in response to different stimulus conditions and task requirements. They found that when participants search for a green circle target presented among green diamond nontargets, an irrelevant red diamond in the display will slow responses, despite participants having been informed that the red stimulus will never be the target (see also: Theeuwes, 1992). However, when on some trials the display contained more than one target or other discrepant shapes this distraction by irrelevant colours disappeared, even on trials identical to those that had previously shown interference. Bacon and Egeth (1994) argued this was evidence that in search for a single discrepant shape, participants were adopting a type of goal-directed attentional set to respond to any discrepant items in the display. They termed this singleton detection mode, based on Pashler’s (1988) labelling of these discrepant items as “singletons.” Bacon and Egeth (1994) argued that under singleton detection mode participants are distracted by any singleton, including irrelevant colour singletons (Theeuwes, 1992). However, when the target could not be detected reliably on the basis of its singleton status, participants employ what Bacon and Egeth (1994) called feature search mode, in which attention is set for the specific feature (e.g., shape) of the target, rendering an irrelevant colour singleton ineffective.

More recently, further evidence of the flexibility of attentional control has been provided by results demonstrating that attention can be guided by the relative properties of a target to its surrounding context (Becker, 2010; Becker, Folk, & Remington, 2010). That is, when faced with a search for a target that is reliably larger, redder, brighter, etc. than its surroundings, attention is captured by stimuli that possess the target feature relation, even if they do not possess the specific target feature. For example, Becker, Folk, and Remington (2013) showed that in search for an orange target that was redder than its yellowish-orange distractor context, attention was captured by cues that were redder than their surrounding context (red cue among orange context items; yellowish-orange cue among yellow context items) and not by cues that had the opposite relation (yellower than their surrounding context; orange cue among red context items; yellow cue among yellowish-orange context items). Importantly, specific target features did not seem to influence attention in this experiment as cues that shared the target feature did not capture attention if they possessed the wrong feature relation (orange cues among red context items), whereas items with the correct relation captured attention even when that context was provided by the specific target feature (red cues among orange context items). This provides strong evidence for relational search as a distinct mode of attentional guidance alongside feature search and singleton detection mode.

Sets for multiple properties

The literature reviewed thus far provides evidence that the attentional control system is quite flexible in its ability to employ a range of search modes under different conditions. However, it is not difficult to think of search tasks in which each of these search modes would be severely limited. For instance, in a search for targets whose defining feature changes unpredictably from trial to trial there is no single feature or relation that can be set for, suggesting perhaps singleton detection mode would be the ideal strategy. But as previously discussed, singleton detection mode leaves the system vulnerable to distraction by any discrepant item in a search and thus may not always guide attention to a target in the most efficient manner. This raises the question: What attentional set is employed in a search for multiple target properties?

To examine this question, Folk and Remington (2008) had participants perform an adapted spatial cueing paradigm, as described above; however, in this experiment targets randomly varied between red and green on different trials. They observed that when targets could be either red or green, both red and green cues captured attention, and this was true regardless of the target on any particular trial. However, it is not clear what strategy participants were adopting to produce this result. One possibility is that in search for multiple target features the attentional system adopts singleton detection mode, producing capture by red and green cues because both are singletons in their respective cue displays. Alternatively, the attentional system may employ simultaneous attentional control settings for red and green features specifically, in the same way participants had adopted an attentional set for a single colour in Folk and Remington (1998).

Recent studies (Eimer & Kiss, 2010; Folk & Anderson, 2010) have replicated Folk and Remington (2008) with the addition of cues that do not share a target colour. These studies showed that with no incentive to adopt a feature-specific setting, search for two target colours leads to capture by all cues, including cues of a nontarget colour. This indicates that in search for two colours (with no competition from irrelevant distractor colours), attentional control settings are not tuned to respond exclusively to the target colours, but rather are tuned to respond to singletons more broadly. There is evidence that in fact, attention can be set for two distinct colours given the proper incentive. In search for two target colours, if the target display contains coloured distractors in addition to the colour target, only cues of the target colours capture attention, whereas cues of other colours do not (Irons, Folk, & Remington, 2012; see also Adamo, Wozny, Pratt, & Ferber, 2010, and Kiss, Grubert, & Eimer, 2012, for evidence of multiple simultaneous attentional control settings across stimulus dimensions). Presumably this is because the distractor colours make it impossible to detect targets on the basis of singleton status alone.

Interestingly, Folk and Anderson (2010) offered an alternative to the singleton detection mode explanation of the multiple target findings, attributing the capture by nontarget colour cues to a setting for the entire colour dimension. According to Folk and Anderson’s (2010) colour set account, attention can be set for singletons within a specific feature dimension, such as colour, excluding singletons along other dimensions, such as orientation or motion. Such a “colour singleton set” is predicted by the dimension weighting account (Found & Müller, 1996; Müller, Heller, & Ziegler, 1995), which proposes that different features within a dimension are represented on a dimension-specific saliency map, such as a colour saliency map that can guide attention to the most salient colour when the specific colour of the target is unknown (e.g., because it varies). Previous work on the dimension weighting account has shown dimensional influences on, for example, intertrial priming and pop-out effects in visual search (Found & Müller, 1996; Müller, Heller, & Ziegler, 1995; Müller, Reimann, & Krummenacher, 2003; Treisman, 1988). These results show that dimension-level information can exert an effect on processing. Such priming effects, however, appear to occur automatically, typically lacking the flexibility characteristic of attentional control settings in paradigms such as spatial cueing, and may arise from perceptual and decision-level processes. Thus, it is important to determine whether dimensional effects generalize to the kind of contingencies associated with the capture of attention, while carefully controlling for other possible sets.

The “colour set” account makes the straightforward prediction that when set for colour, a cue of any colour will capture attention if it is the only colour singleton present, but singleton cues of another dimension will not. However, Folk and Anderson (2010) could not demonstrate that the singleton capture they observed was limited to colour stimuli, because they did not run the definitive experiment; they did not test any cues from dimensions other than colour (e.g., size, shape, motion, orientation, etc.). As a consequence, their results cannot distinguish between singleton detection in general and singleton detection limited to colour stimuli (herein referred to as a colour singleton set). The idea of a dimension-specific singleton set at present remains unconfirmed.

The current study

The broad goal of the current experiments was to investigate whether the attentional system can adopt attentional settings more general than a set for specific feature values (e.g., red) but more specific than responding to any salient singleton, regardless of its dimension (singleton detection mode). More narrowly formulated, the goal was to test the existence of an attentional set limited to singletons in the colour dimension. We employed the same paradigm as previous studies examining sets for multiple properties, in which participants are asked to report the identity of a target that on each trial is randomly either red or green. However, in addition to the red, green, and blue cues employed by past authors (Eimer & Kiss, 2010; Folk & Anderson, 2010), we also employed a white rotating motion cue to test for capture by singleton stimuli of a dimension other than colour. Motion cues have previously been found to produce robust cueing effects when motion is relevant to the current attentional set (Folk, Remington, & Wright, 1994; see also Girelli & Luck, 1997; Hillstrom & Yantis, 1994; Remington, Folk, & McLean, 2001; Yantis & Egeth, 1999). If the capture observed by Eimer and Kiss (2010) and Folk and Anderson (2010) is due to the standard singleton detection mode that operates across dimensions (Bacon & Egeth, 1994; Lamy & Egeth, 2003), then all cues, including motion cues, should produce significant cueing effects. However, if the attentional set employed by participants is limited to singletons in the colour dimension, we would expect to observe significant cueing effects for each of the colour cues, including the target-unrelated blue cue, but not for the motion cue.

Experiment 1

The purpose of Experiment 1 was to determine whether the results obtained by Eimer and Kiss (2010) and Folk and Anderson (2010) are explained by a top-down attentional set for feature singletons in general (Bacon & Egeth, 1994) or whether they represent a colour singleton set (Folk & Anderson, 2010). The design was almost identical to that used by Folk and Anderson (2010), except that in addition to the colour cues used in previous studies we included a rotating motion cue. This allowed us to examine whether singletons from dimensions other than colour would capture attention during a search for two colours that varied unpredictably from trial to trial.

To ensure that the results were due to covert attention shifts and not overt eye movements, participants were instructed to maintain fixation on a central fixation cross and fixations were monitored with an eye tracker.

Method

Participants

Twenty-two participants (11 females, mean age = 22.36 years, standard deviation [SD] = 1.59 years) took part in Experiment 1. All participants had normal or corrected to normal vision and were compensated at a rate of $10 per hour for their participation. This study was approved by the University of Queensland Human Research Ethics Committee.

Apparatus

Stimuli were displayed on a 19-inch CRT colour monitor with a resolution of 1152 × 864 pixels and a refresh rate of 85 Hz, controlled by a computer running Windows XP. A video-based, infrared eye-tracking system was used (Eyelink 1000, SR Research, Ontario, Canada) with a spatial resolution of 0.01° of visual angle and a sampling rate of 500 Hz. Participants had their head supported by the eye tracker’s chin rest and forehead support and viewed the screen from a distance of 60 cm. For registration of manual responses, a standard USB keyboard was used. Event scheduling and response time (RT) measurement were controlled by Matlab, using the Psychophysics Toolbox (Brainard, 1997; Kleiner, Brainard, & Pelli, 2007).

Stimuli

Stimuli for this experiment mirrored those of Folk and Anderson (2010) as closely as possible. Throughout the experiment the screen background was set to black (RGB: 0, 0, 0; xyY: 0.355; 0.415; 1.01). The task consisted of a series of displays (Fig. 1), beginning with a fixation display, composed of a central grey fixation cross (0.3° × 0.3°; RGB: 160, 160, 160; xyY: 0.279; 0.332; 19.83) and four boxes (2.0° × 2.0°), of which only the thin grey outlines were visible (0.05° width). The boxes were placed at the 3, 6, 9, and 12 o’clock positions, 5.7° from the centre of the display (measured to the centre of the boxes).

Fig. 1
figure 1

Stimuli from Experiment 1. Singleton cues were red, green, blue, or a rotating white motion cue (arrows not presented in experiment), randomly selected on each trial. Targets were red or green, randomly selected on each trial. Participants were required to respond to whether the coloured target item was an ‘X’ or an ‘=’

The cue display consisted of the fixation display with the addition of four sets of four dots (0.4° × 0.4°), each set located around one of the boxes in a diamond configuration (distance to box: 0.3°). Three of the four-dot cues were always white (RGB: 255, 255, 255; xyY: 0.278; 0.334; 39.60). On colour cue trials, one four-dot cue was red (RGB: 255, 0, 0; xyY: 0.584; 0.366; 9.71), green (RGB: 0, 255, 0; xyY: 0.291; 0.614; 35.9), or blue (RGB: 0, 0, 255; xyY: 0.148; 0.080; 5.43). On motion cue trials, the four dots of the cue were white; however, they rotated 4 pixels per screen refresh, for a total of 1.15° around the cue location box (Fig. 1), either clockwise or counter-clockwise (randomly determined). The rotational motion of the four-dot cue was clearly visible, as reflected in the fact that all participants reported seeing the dots as rotating around the box.

The target display consisted of the fixation display with either an “=” or an “X” presented centrally within each of the boxes. These symbols subtended approximately 0.7° in width and height. Target displays were controlled such that two “=” and two “X” symbols were present on each target frame. Three of the symbols were always white, while one symbol (the target) was either red or green.

Design

Across the experiment, all possible cue/target combinations were presented in random order, randomly distributed between stimulus positions (3, 6, 9, and 12 o’clock). Cue location was uncorrelated with target location, with the result that the cue and target appeared at the same location on 25 % of trials. This also ensured that the cue was uninformative as to the location of the subsequent target, so that any evidence for capture by the cue cannot be attributed to the deliberate use of a predictive cue. The symbol of the target, “=” or “X” also was randomly distributed across trials. A complete crossing of all stimuli (red, green target; right, left response; red, green, blue, motion cue) and location combinations (four cue and target locations) yielded a total of 256 trials per block. Each participant completed two identical blocks, resulting in a total of 512 trials per participant.

Procedure

Participants were tested individually in a dark room. Written and oral instructions were given prior to commencing the task. Participants were informed that their target was the red or green symbol in the display and were instructed to ignore the cues that were presented around the boxes. Participants also were instructed to respond as quickly and accurately as possible to the target symbol by using the index finger of their left hand to press the “0” key on the keyboard’s numeric keypad when the target was an “=” and the index finger of their right hand to press the “.” key on the numeric keypad when the target was an “X.” Participants were instructed to maintain fixation on the grey cross in the centre of the display at all times.

After hearing and reading the instructions participants were calibrated using the eye tracker’s standard calibration software, after which the experimental trials began. The beginning of each trial was indicated by the fixation cross disappearing for 100 ms. Directly following the reappearance of the fixation cross, a fixation control was implemented to ensure that participants maintained central fixation. The fixation control lasted up to 2,000 ms, and a trial would only begin once the participant’s gaze had been within 1.5° of the central fixation cross for 500 ms. If a participant’s gaze did not rest on the fixation cross for 500 ms within this 2,000-ms period, the participant was calibrated anew and the fixation control would begin again. Once participants had been fixating on the cross for 500 ms, the cue display was presented for 120 ms, followed by the fixation display for 50 ms (the interstimulus interval), and then the target display for 120 ms. These presentation durations are based on those used by Folk, Remington, and Wright (1994) in their experiments using motion cues and differ from those used by Folk and Anderson (2010), who used 50 ms, 100 ms, and 50 ms for cue, interstimulus interval and target displays, respectively. This change was necessary, as pilot testing revealed that a motion cue presented for only 50 ms either did not seem to move at all or had to rotate so fast that it blurred substantially. The target display was followed by the fixation display, which remained visible until participants made their response. If participants did not respond within 1,500 ms, an error was recorded and the experiment moved on to the next trial. Incorrect or absent responses elicited a 500-ms, 1,000-Hz tone and were followed by a “buffer” trial, with cue and target settings drawn randomly from the set of all possible trials. As in Folk and Anderson (2010), response times for error and buffer trials were not included in the analysis. Each trial was preceded by a 1,000-ms intertrial interval, in which the fixation display remained on the screen.

Participants were encouraged to take a short break every 128 trials. On average, it took 45 minutes to complete the experiment.

Results

Mean RTs and error rates are presented in Fig. 2. Across all experiments, trials were excluded from analysis if participants moved their eyes more than 1.5° from fixation during the trial. In Experiment 1, this led to a loss of 2.95 % of all data.

Fig. 2
figure 2

Reaction time and error data for Experiment 1. In search for targets that were randomly either red or green, cueing effects (invalid minus valid RT) produced by colour cues, including the target-unrelated blue cue, were significantly larger than cueing effects produced by motion cues. Error bars are within-subjects confidence intervals (Loftus & Masson, 1994)

Response times

Response time data were analysed in a 4 (cue condition: red, green, blue, and motion) × 2 (validity: valid and invalid) within-participants ANOVA. A significant main effect of validity, F(1,21) = 80.56, p < 0.001, ηp 2 = 0.79, showed that participants were significantly faster on valid trials (M = 542 ms) than invalid trials (M = 584 ms). The main effect of cue condition was not significant; however, the interaction between cue condition and validity was significant (Fig. 2), F(3,63) = 10.00, p < 0.001, ηp 2 = 0.32. Planned follow-up tests showed significant cueing effects for all cues: RTs were significantly faster in the valid than the invalid condition for red cues, p < 0.001, green cues, p < 0.001, blue cues, p < 0.001, and motion cues, p = 0.005 (see Table 1 for cueing effect magnitudes).

Table 1 Cueing effect magnitudes observed in Experiment 1

To further explore the source of the interaction between cue and validity, difference scores were calculated by subtracting the RT of the valid condition from the RT of the invalid condition for each cue. These difference scores reflect the magnitude of the cueing effect for each cue (Table 1). Paired samples t tests with Bonferroni corrected alpha levels of 0.0083 (0.05/6 tests) revealed the difference between valid and invalid trials for the motion cue to be significantly smaller than for each of the colour cues (motion vs. red, t'(21) = 4.99, p < 0.001; motion vs. green, t'(21) = 4.87, p < 0.001; motion vs. blue, t'(21) = 3.83, p = 0.001); however, none of the colour cues differed from each other (all ps > 0.17).

Errors

The overall error rate for Experiment 1 was 3.87 % (see Fig. 2 for error rates at each level of validity and cue condition). A 4 (cue condition) × 2 (validity) within-participants ANOVA revealed a significant main effect of validity, F(1,21) = 9.76, p = 0.005, ηp 2 = 0.32, such that errors were lower on valid trials (M = 2.88 %) than on invalid trials (M = 4.86 %). No significant main effect of cue condition emerged. The interaction between cue and validity was significant, F(3,63) = 4.02, p = .011, ηp 2 = 0.16. Planned follow-up tests showed that participants had significantly fewer errors on valid trials compared with invalid trials for red cues, p = 0.012, green cues, p = 0.007, and blue cues, p = 0.012. There was no significant difference between the number of errors on valid versus invalid trials for motion cues (t < 1). This pattern of results mimics that of the RT data, precluding the possibility of a speed accuracy trade-off.

Discussion

The results of Experiment 1 appear somewhat mixed. On the one hand, all of the cues showed significant validity effects, indicating that they captured attention, including the motion cue. This might suggest participants were employing singleton detection mode. On the other hand, the capture exhibited by the motion cue was significantly weaker than that exhibited by all other cues. Critically, the results do show evidence consistent with a colour set in that the motion capture effect was significantly weaker than that of the blue cue (20 ms for motion vs. 44 ms for the blue cue), which also did not share features with any of the targets. Capture by the motion cue appeared to have been qualitatively different to that of the colour cues and not consistent with participants having employed singleton detection mode (Bacon & Egeth, 1994).

Previous research has demonstrated the 170-ms stimulus onset asynchrony employed here to be sufficient for a motion cue to produce large and robust validity effects when motion is task-relevant (Folk, Remington, & Wright, 1994, Experiments 3-5). Nonetheless, we ran a control experiment using rotating motion targets to confirm that our motion cue was able to produce robust capture effects when motion was made task-relevant. This control experiment found that when motion was the target-defining feature our motion cues produced significant cueing effects of roughly 113 ms.Footnote 1 Thus, it is unlikely that the motion cue employed in Experiment 1 was somehow ineffective at capturing attention or was capturing attention consistently but only producing cueing effects of approximately 20 ms. Rather, we may infer that the small validity effect observed for motion was due to averaging a portion of trials in which the motion cue captured attention with another portion in which it did not.

Why might motion cues capture attention on only a subset of trials? One possibility is that participants transitioned from a colour singleton set to singleton detection mode, or vice versa, over the course of the experiment. This would have the effect of averaging together a set of trials in which motion cues did capture attention with another set in which they did not, producing a smaller validity effect overall. To examine this possibility we split the experiment into two halves and repeated the above analyses. The resulting three-way within-participants ANOVA revealed a significant main effect of first/second half of the experiment, F(1, 21) = 28.861, p < 0.001, ηp 2 = 0.58, such that participants were significantly slower in the first half of the experiment (M = 581 ms) than the second half of the experiment (M = 546 ms). There were no significant interactions involving first/second half (all ps > 0.44). Thus, there does not appear to have been any change in capture by the motion cue throughout the experiment.

Another possibility is that, because the motion cue in Experiment 1 was present on only 25 % of trials, this cue may have occasionally captured attention due to its low probability of occurrence, despite participants employing a set for colour singletons. Recent findings suggest that this is a plausible explanation (Folk & Remington, 2007; Geyer, Müller, & Krummenacher, 2008; Müller, Geyer, Zehetleitner, & Krummenacher, 2009; Neo & Chua, 2006). For example, Folk and Remington (2007) had participants search for a colour target and presented a colour or onset cue prior to each target. They varied the proportion of trials containing these cues between subjects and observed that an onset cue presented on only 20 % of trials captured attention, whereas an onset cue presented on 100 % of trials did not. This result suggests that onset cues will capture attention automatically (consistent with past suggestions; e.g., Yantis & Jonides, 1984), provided they are rare enough to not induce active inhibition by the attentional system. Past evidence suggests that onsets and motion are treated similarly by the attentional system (Folk, Remington, & Wright, 1994). Thus, an infrequent motion cue may capture attention on a portion of trials in which it is presented, similar to the capture observed for infrequent onset cues described above (Folk & Remington, 2007). Experiment 2 was designed to test this possibility.

Experiment 2

The purpose of Experiment 2 was to determine if the small capture effect exhibited by motion cues in Experiment 1 was evidence of participants employing singleton detection mode (Bacon & Egeth, 1994) or whether motion cues captured attention on a subset of trials due to their infrequent occurrence. To test this, we increased the number of trials containing motion cues to 50 %, with red, green, and blue cues being selected randomly on the remaining 50 % of trials. All other aspects of the task were the same as for Experiment 1.

If motion cues in Experiment 1 captured attention because participants were adopting singleton detection mode, they should continue to produce a cueing effect of roughly the same magnitude when the frequency of motion cues is increased. However, if participants were adopting a colour singleton set and motion captured attention due to its infrequent presentation (Folk & Remington, 2007; Geyer, Müller, & Krummenacher, 2008; Müller, Geyer, Zehetleitner, & Krummenacher, 2009; Neo & Chua, 2006), then the cueing effect produced by motion cues should be reduced or eliminated in Experiment 2.

Method

Participants

Twenty-three new participants (15 females, mean age = 20.70 years, SD = 2.64 years) took part in Experiment 2. All participants had normal or corrected to normal vision and were compensated at a rate of $10 per hour for their participation.

Apparatus

The apparatus used in Experiment 2 were identical to those used in Experiment 1.

Stimuli

The stimuli used in Experiment 2 were identical to those used in Experiment 1.

Design

The design of Experiment 2 was identical to that of Experiment 1, except that motion cues were present on 50 % of trials and colour cues were randomly selected as red, green, or blue on the remaining 50 % of trials.

Procedure

The procedure in Experiment 2 was identical to that of Experiment 1.

Results

Mean RTs and errors for Experiment 2 are presented in Fig. 3. In Experiment 2, excluding trials due to lack of fixation led to a loss of 2.79 % of all data.

Fig. 3
figure 3

Reaction time and error data for Experiment 2. When the frequency of motion cues was increased to 50 % of trials in search for targets that were randomly either red or green, motion cues no longer produced cueing effects. Error bars are within-subjects confidence intervals (Loftus & Masson, 1994)

Response times

Response time data for Experiment 2 were analysed in a 4 (cue condition: red, green, blue, and motion) × 2 (validity: valid and invalid) within-participants ANOVA. A significant main effect of validity, F(1,22) = 153.37, p < 0.001, ηp 2 = 0.88 showed that participants were significantly faster on valid trials (M = 570 ms) than invalid trials (M = 620 ms). The main effect of cue condition also was significant, F(3,66) = 5.53, p = 0.002, ηp 2 = 0.20, as was the interaction between validity and cue condition (Fig. 3), F(3,66) = 18.38, p < 0.001, ηp 2 = 0.46. Planned follow-up tests showed significantly faster RT in the valid than the invalid condition for red cues, p < 0.001, green cues, p < 0.001, and blue cues, p < 0.001, but, critically, not for motion cues, p = 0.25 (see Table 2 for cueing effect magnitudes). Once again, paired samples t tests with Bonferroni corrected alpha levels of 0.0083 (0.05/6 tests) showed the cueing effects for all colour cues to be significantly larger than for the motion cue (all ps < 0.001); however, the cueing effects for the colour cues were not significantly different from one another (all ps > 0.08).

Table 2 Cueing effect magnitudes observed in Experiment 2

Independent samples t tests with Bonferroni corrected alpha levels of 0.0125 (0.05/4 tests) showed that the magnitudes of the cueing effects produced by cues in Experiment 2 were not significantly different than those of Experiment 1, red: p = 0.172, green: p = 0.044, blue: p = 0.232, motion: p = 0.063. However, the differences in capture effect magnitudes for each cue across experiments show that the absence of capture by the motion cue in Experiment 2 was not due to a general reduction in attentional capture in this experiment, as capture for each of the colour cues was numerically larger in Experiment 2 than in Experiment 1 (Tables 1 and 2).

In order to be consistent with the other experiments, we repeated the above ANOVA with the results separated for the first and second halves of the experiment. The resulting three-level within-participants ANOVA revealed a significant main effect of first/second half of the experiment, F(1,22) = 16.35, p < 0.001, ηp 2 = 0.43, showing that participants were slower in the first half of the experiment (M = 620 ms) than in the second half (M = 571 ms). No other effects involving first/second half of the experiment were significant (all ps > 0.136), suggesting participants’ patterns of attentional capture did not change throughout the experiment.

Errors

Overall error rates for Experiment 2 were 3.67 % (see Fig. 3 for error rates at each level of validity and cue condition). A 4 (cue condition) × 2 (validity) within-participants ANOVA on the error data revealed no significant main effects or interactions (all ps > 0.06).

Discussion

Experiment 2 confirmed that search for two colours does not induce singleton detection mode (Bacon & Egeth, 1994) and that motion cues in Experiment 1 captured attention due to their infrequent presentation. When motion cues were presented on 50 % of trials, they ceased to capture attention, whereas each of the colour cues, including the target-unrelated blue cue, continued to capture attention. Thus, it seems participants are able to adopt an attentional set at the level of the feature dimension.

There is, however, an alternative explanation for the current patterns of results that does not require us to posit the existence of a colour singleton set. As the motion cue in Experiment 1 captured attention by virtue of its infrequent presentation, the blue cue in these experiments also may have captured attention for the same reason. In the previous two experiments, the blue cue has been present on only 25 % and 16.67 % of trials, respectively. Likewise, in previous studies (Eimer & Kiss, 2010; Folk & Anderson, 2010), target non-matching colour cues were present in only one-third of trials. Thus, it may still be that participants are maintaining an attentional set only for the two specific target colours (Irons, Folk, & Remington, 2012) and capture by cues possessing other features is exclusively due to the rarity of those cues. Experiment 3 was conducted to explore this possibility.

Experiment 3

To conclude that participants are adopting a colour singleton set, we must establish that the capture observed by the blue, target non-matching, cue was not observed due to its infrequent presentation. This is an unlikely result, as past work has suggested that rare colour singletons do not capture attention (Horstmann & Ansorge, 2006; Yantis & Egeth, 1999). However, past work has not employed a search for multiple targets, so Experiment 3 was conducted to rule out this possibility.

In this experiment, we made the blue cue more frequent, presenting it on 50 % of trials. This frequency was sufficient to eliminate the capture produced by motion cues in Experiment 2. As such, if blue cues in the previous experiments captured attention due to their infrequent presentation, we would expect them to cease capturing attention when presented more frequently. Conversely, if participants were indeed adopting a colour singleton set in the previous experiments, we would expect the blue cue to continue to capture attention no matter its frequency.

Method

Participants

Sixteen new participants (14 females, mean age = 20.19 years, SD = 6.81 years) took part in Experiment 3. All participants had normal or corrected to normal vision and were compensated at a rate of $10 per hour or given course credit for their participation.

Apparatus

The apparatus used in Experiment 3 were identical to those used in the previous experiments.

Stimuli

The stimuli used in Experiment 3 were identical to those used in the previous experiments.

Design

The design of Experiment 3 was identical to that of Experiment 2, except that blue cues were present on 50 % of trials, and one of the other cues (red, green, or motion) was randomly selected on each of the remaining 50 % of trials.

Procedure

The procedure in Experiment 3 was identical to that of the previous experiments.

Results

Mean RTs and errors for Experiment 3 are presented in Fig. 4a. In Experiment 3, excluding trials due to lack of fixation led to a loss of 2.58 % of all data.

Fig. 4
figure 4

Reaction time and error data for Experiment 3. a When the frequency of target-unrelated blue cues was increased to 50 % of trials in search for targets that were randomly either red or green, blue cues produced reduced capture effects compared with those of the red and green cues. b Presenting data separately for the first and second halves of this experiment shows that in the first half of the experiment blue cues produced only small cueing effects, however by the second half of the experiment the blue cue produced cueing effects indistinguishable from those of the target-related red and green cues. Error bars are within-subjects confidence intervals (Loftus & Masson, 1994)

Response times

RT data for Experiment 3 were analysed in a 4 (cue condition: red, green, blue, and motion) × 2 (validity: valid and invalid) within-participants ANOVA. A significant main effect of validity, F(1,15) = 94.46, p < 0.001, ηp 2 = 0.86, showed that participants were significantly faster on valid trials (M = 530 ms) than invalid trials (M = 584 ms). The main effect of cue condition was not significant; however, there was a significant interaction between validity and cue condition (Fig. 4a), F(3,45) = 14.34, p < 0.001, ηp 2 = 0.49. Planned follow-up tests showed that the RT difference between valid and invalid cues was significant for all conditions (colour cues, all ps < 0.001; motion cue, p = 0.014; see Table 3 for cueing effect magnitudes). Paired-samples t tests with Bonferroni corrected alpha levels of 0.0083 (0.05/6 tests) showed that cueing effects were significantly larger for red and green cues than for blue and motion cues, all ps < 0.005. The cueing effects for the red and green, and blue and motion cues were not significantly different from each other, p = 0.291 and p = 0.058, respectively.

Table 3 Cueing effect magnitudes observed in Experiment 3

This result is interesting, as in the previous experiments cueing effects produced by blue cues have been no different from those produced by the target coloured cues. The 50 % frequency employed here for the blue cue was sufficient to eliminate capture by the motion cue completely in Experiment 2. However, the frequent blue cue continued to produce a significant cueing effect (M = 28 ms). One possible explanation is that participants switched between a colour singleton set and a set for the target features (red and green) over time due to the frequent presentation of coloured items that did not match the target colours. To examine this possibility, we divided the data into the first and second halves of the experiment and repeated the analysis.

The resulting three-level within-participants ANOVA revealed a significant main effect of first/second half of the experiment, F(1,15) = 28.77, p < 0.001, ηp 2 = 0.66, with responses being slower on average in the first half of the experiment (M = 573 ms) than the second half of the experiment (M = 541 ms). The main effect of cue remained nonsignificant. The main effect of validity was significant, F(1,15) = 88.57, p < 0.001, ηp 2 = 0.86, as was the interaction between validity and cue, F(3,15) = 12.52, p < 0.001, ηp 2 = 0.46. The interaction between first/second half and validity was nonsignificant, F(1,15) = 1.173, p = 0.296, ηp 2 = 0.07; however, there was a significant interaction between first/second half and cue, F(3,45) = 3.10, p = 0.036, ηp 2 = 0.17, and a significant three-way interaction between first/second half, cue, and validity (Fig. 4b), F(3,45) = 3.18, p = 0.033, ηp 2 = 0.18, suggesting that there was a change in the general pattern of cueing effects from the first to the second half of the experiment. Follow-up tests with Bonferroni corrected alpha levels of 0.0125 (0.05/4 tests) revealed a significant difference in cueing effects produced by the blue cue between the first and second half of the experiment, t'(15) = 4.12, p = 0.001, with capture by the blue cue being larger in the second half of the experiment than the first (Table 3). No other differences in cueing effects were observed from the first to the second half of the experiment (all ps > 0.140). Furthermore, in the first half of the experiment cueing effects produced by the blue cue were significantly smaller than those produced by the red and green cues (ps = 0.001; Bonferroni corrected alpha level of 0.0083 for 0.05/6 tests) but were no different than those produced by the motion cue (p = 0.805). However, in the second half of the experiment cueing effects produced by the blue cue were indistinguishable from those produced by the red and green cues (ps > 0.140) but were significantly larger than those produced by the motion cue (p < 0.001). This suggests participants started the experiment in feature search mode and adopted a broader set for all colour singletons in the second half of the experiment.

Finally, the magnitude of the small cueing effect produced by motion cues in this experiment was statistically indistinguishable from that of Experiment 1, t(36) = 0.691, p = 0.167 but was significantly larger than that of Experiment 2, t(37) = 2.30, p = 0.004. Thus, we replicate the attentional capture by infrequent motion that was demonstrated in Experiment 1.

Errors

Overall error rates for Experiment 3 were 5.20 % (see Fig. 4a for error rates at each level of validity and cue condition). A 4 (cue condition) × 2 (validity) within-participants ANOVA on the error data revealed a significant main effect of cue condition, F(3,45) = 6.71, p = 0.001, ηp 2 = 0.31. Pairwise comparisons with Bonferroni corrected alpha levels of 0.0083 (0.05/6 tests) revealed this was due to the red cue having significantly fewer errors than the green and blue cues (ps < 0.007). There also was a significant main effect of validity, F(1,15) = 13.09, p = 0.003, ηp 2 = 0.47, with valid cues having fewer errors than invalid cues (M = 3.59 % and M = 5.61 % errors, respectively). This precludes the possibility of a speed-accuracy trade-off. There was no significant interaction between cue condition and validity.

Discussion

Experiment 3 clearly demonstrates the existence of an attentional set limited to colour singletons. Although the blue cue was initially excluded from participants’ attentional set when it was presented on half of all trials, it was incorporated into the attentional set by the second half of the experiment. This increase in capture over time is strong evidence that the capture by blue stimuli in the previous experiments was not due to their being rare. Indeed, by the time participants were halfway through Experiment 3 their pattern of attentional capture was identical to that of Experiment 1 (Fig. 4b, second half). From this we conclude that the results of this ensemble of experiments can be taken as evidence for the existence of a colour singleton set. Why did the cueing effect for blue increase over trials? We suggest that in search for two target colours a colour singleton set is easier or more energy-efficient to employ than multiple sets for the specific target colours. This issue is discussed further in the General Discussion.

Interestingly, although blue cues produced a small cueing effect overall, similar to that produced by motion cues here and in Experiment 1, the same split half analysis performed on the data of Experiments 1 and 3 produced markedly different results. In Experiments 1 and 3, analysing the first and second halves of the experiments separately showed no difference in capture by the motion cue across the two halves of the experiment, suggesting that rather than a transition to or from singleton detection mode this cue was capturing attention despite its exclusion from the attentional set; most likely due to its infrequent presentation (Experiment 2). The same analysis showed very little attentional capture by the blue cue in the first half of Experiment 3, but by the second half blue cues were capturing attention as strongly as the target coloured cues. This suggests the small capture effect produced by blue cues in this experiment was qualitatively different to that produced by the motion cues, providing further evidence that the motion cues in Experiments 1 and 3 were not capturing attention due to participants employing singleton detection mode. Rather, this lends support to our proposal that attentional capture by motion cues in Experiment 1 was due to their infrequent presentation, as when they were made more frequent in Experiment 2 they ceased to capture attention, and when made infrequent again in Experiment 3 they once again produced a small cueing effect.

General discussion

The purpose of the current experiments was to determine whether it is possible for the attentional system to adopt a set for singletons within the colour dimension. To demonstrate this, we needed to show that participants’ attention would be caught by colour singletons that did not possess a target colour and that capture under these conditions was limited to the colour dimension.

Experiment 1 demonstrated that when all cues were equally frequent they all captured attention, but not to an equal extent. That is, while all the colour cues produced cueing effects of equal magnitude, the motion cue produced a cueing effect that was significant, but markedly smaller than those of the colour cues. This suggests that the motion cue was not necessarily included in the attentional set but may have captured attention on a portion of trials for other reasons, such as its infrequent presentation. To test this possibility, we increased the frequency of motion cues to 50 % of trials in Experiment 2, with red, green, and blue cues randomly selected on the remaining trials. Under these conditions, the attentional capture by motion cues was eliminated, supporting the conclusion that participants were indeed adopting a colour singleton set, with capture by motion cues in Experiment 1 being due to their rare presentation.

Experiment 3 ruled out the possibility that blue cues also were capturing attention due to their rarity. To do this, we increased the incidence of blue cues to 50 % of trials, randomly selecting red, green, or motion cues on the remaining trials. Under these conditions, capture by the blue cue was initially reduced but quickly increased to be equal in magnitude to that produced by the target coloured cues. Potential reasons for this are discussed below.

Attentional capture by colour singletons

Together, these experiments support the hypothesis that participants are able to adopt an attentional set for colour singletons that excludes singletons of other feature dimensions. This raises questions regarding what information the attentional system uses to choose one control setting over another and what costs and benefits are associated with employing the different attentional sets. In the current experiments, specific feature sets for red and green would in principle be just as effective at locating the targets as a colour singleton set. Thus, the fact that a colour singleton set is employed in these experiments suggests that this set is in some way preferable to the use of two simultaneous feature sets.

One possible reason a colour singleton set may be preferred is that it may be easier to maintain in working memory compared with simultaneously maintaining a set for red and a set for green. It has recently been argued that working memory can store a maximum of one attentional control setting at a time (Houtkamp & Roelfsema, 2009; Olivers, Peters, Houtkamp, & Roelfsema, 2011). While recent findings of simultaneous attentional sets for two features (Adamo, Wozny, Pratt, & Ferber, 2010; Irons, Folk, & Remington, 2012; Irons & Remington, 2013; Kiss, Grubert, & Eimer, 2012; Roper & Vecera, 2012) seem to disagree with this suggestion, our results do align with a softened version of this proposal, that it may be difficult to maintain two concurrent attentional sets in working memory. Future studies designed to estimate working memory capacity while employing designs similar to those used here and in Irons, Folk, & Remington (2012) would be informative in this regard.

Another way in which a colour singleton set may be preferable to two simultaneous feature sets is if a colour singleton set is easier or more computationally efficient to employ within the context of each individual trial (as opposed to the across-trials context of maintaining multiple sets in working memory). For example, it may be simpler for the attentional system to compare incoming stimuli to a single target template (asking “Is there any colour?”) than to two separate templates (asking “Is there any red?” and “Is there any green?”). That is, adopting an attentional set for the target dimension can be thought of as adopting a set for discriminations made easy by the nature of the required perceptual processing. This may be preferred, because it requires less energy or because performing a single comparison may be faster than performing multiple comparisons. We get a hint that this may be the case by comparing reaction times from the current experiments to those reported in Irons, Folk, and Remington (2012). In our experiments, in which participants adopted a colour singleton set, average reaction times were 570 ms; however, in Irons, Folk, and Remington (2012) Experiments 2-5, where attentional sets for two specific colours were induced, reaction times were typically much higher, averaging more than 700 ms. This is consistent with the suggestion of faster responses from a colour singleton set than from a set for two specific colours (but see Irons, Folk, & Remington, 2012, Experiment 5). Due to the various differences between the two sets of experiments, it will, of course, be necessary to assess this relationship between attentional set and response speed systematically in future research. It is worth noting that although it may seem that we could address this question with the results of Experiment 3, where participants seemed to transition from a set for two features to a colour singleton set, unfortunately we cannot. This is because any change in attentional strategy that may have occurred throughout Experiment 3 is confounded with decreases in reaction time due to practice at the task.

Of course, we are not suggesting that the above considerations are the only considerations likely to be involved in the selection of an attentional set. The likelihood that the attentional template will select items that are not targets, or do not share target properties, is another property that influences the decision of what attentional set to apply. When a colour singleton set is likely to select a distractor item instead of a target, as in Irons, Folk, & Remington (2012) where the target display always contained an irrelevant colour distractor as well as one of two colour targets, the system may decide that adopting two independent feature sets is more likely to be a successful strategy for performance of the task and thus is worth any potential cost. There are also other factors that play into the selection of an attentional strategy, such as recent history (Leber & Egeth, 2006; Maljkovic & Nakayama, 1994), target distractor context (Becker, Folk, & Remington, 2010), reward likelihood (Anderson, Laurent, & Yantis, 2011), statistical properties of the task (Cosman & Vecera, 2014), and probably others.

It is important to note an alternative perspective on the current results. Some authors (Zehetleitner, Goschy, & Müller, 2012) have argued against the existence of distinct search modes. Rather, they put forward an account of top-down control that is on a continuum for all features and feature dimensions simultaneously (see below for a description of one such account), with individual features or dimensions having their attentional priority increased or decreased to produce any particular pattern of results. The current results are largely consistent with both perspectives (but see Frequency Effects below for discussion of results that do not fit with the dynamic adjustment of feature maps). We are agnostic as to the specific cognitive architecture that produced the current results, or that produces any particular “attentional set.” We seek only to demonstrate that a dimension level set for colour singletons is possible, not to make any strong theoretical claims about how this pattern of results comes about. Nevertheless, the suggestion that there may be no distinct search modes is an interesting one that deserves further research.

Dimension weighting account

Effects at the level of the stimulus dimension have been demonstrated previously in the visual search paradigm (Found & Müller, 1996; Müller, Heller, & Ziegler, 1995; Müller, Reimann, & Krummenacher, 2003). For example Müller, Reimann, and Krummenacher (2003) had participants report the presence or absence of a pop-out target that on each trial could be one of two colours (red or blue) or one of two oblique orientations, presented with a number of vertical green distractors. Prior to the presentation of the search display, they presented participants with a word cue alerting them, in different experiments, to either the likely target dimension (e.g., “colour”) or the likely target feature (e.g., “red”). They found that when the target dimension was correctly pre-cued, participants were faster to report the presence of a target than when the incorrect target dimension was cued. Furthermore, they found that cueing the specific target feature led to a large benefit for targets of that feature, but also a smaller yet consistent benefit for uncued targets that shared the cued dimension (e.g., a blue target following a red cue) compared with uncued targets of a different dimension. This also was the case for targets with a feature that was never cued (e.g., yellow targets when only red or blue colour cues were possible).

These and similar findings have led Müller and colleagues to propose the Dimension Weighting Account of visual attention (Found & Müller, 1996; Müller & Krummenacher, 2006; Müller, Reimann, & Krummenacher, 2003). This account proposes that attentional guidance occurs through adjusting the weights on preattentive maps of feature-detectors, primarily at the level of the feature dimension, although they allow that some weighting of specific features is possible. These maps are then summed to produce an overall saliency map, the highest peak of which determines the location of initial attentional deployment (for a related model see the Guided Search model of visual attention; Wolfe, 1994; Wolfe, Cave, & Franzel, 1989).

Although the dimension weighting account is consistent with our results, it is not clear whether the present results can be interpreted in support of it. First, dimension-specific effects have been mainly found with respect to intertrial priming effects or cueing in visual search, and in these instances, dimension-based effects did not always occur at the level of visual selection but (also) at later levels of target identification and response-selection (Becker, 2010; Kumada, 2001; Mortier, Theeuwes, & Starreveld 2005; Müller & Krummenacher, 2006). Moreover, a recent fMRI study showed that these dimension-specific effects show activation in different brain areas than those that are active during feature-specific selection (Becker, Grubert & Dux, 2014). These results seem to argue against the view that the dimension weighting account is solely or primarily an account of visual selective attention, and hence the account has been modified to include later, postselective effects (Rangelov, Müller, & Zeheitleitner, 2011; Zehetleitner, Müller, & Rangelov, 2012). By contrast, the spatial cueing task used in the present study is not susceptible to postselective effects, so that all effects have to be attributed to processes of early visual selection.

Second, our finding that attention can be biased to an entire stimulus dimension also could be explained outside the dimension weighting account. For instance, it has been found that the competition between different stimuli is stronger when they belong to the same stimulus dimension (Treisman & Sato, 1990; Wolfe et al., 1990). If searching for two different features within a stimulus dimension indeed creates more interference (e.g., than biasing attention to features of different stimulus dimensions), attention may have been biased to all features within the colour dimension to avoid this interference. The view that different features within a given stimulus dimension compete more strongly for selection seems to follow from multiple accounts, including the similarity account (Duncan & Humphreys, 1989), relational accounts that include target-context relations (Becker, 2010; Becker, Folk & Remington, 2013), and Feature Integration Theory (Treisman & Gelade, 1980; Treisman & Sato, 1990).

Of note, the present results cannot distinguish between a dimension weighting account or other singleton search mode models that may rely on an architecture of stimulus dimensions, stimulus similarity, or other factors. Still, the results provide clear evidence of dimension based attentional capture in a paradigm that can demonstrate spatial attentional capture while being largely immune to postselective effects.

Frequency effects

Beyond the key finding of the existence of a colour singleton set, one interesting aspect of this study is the frequency effect observed in the attentional capture produced by motion cues. When motion cues were infrequent they produced weak but reliable attentional capture (Experiments 1 and 3); however, this was abolished when the motion cues were made more frequent (Experiment 2). Past studies on attentional capture by rare stimuli have produced mixed results, with some studies showing that rare, irrelevant colour stimuli do not capture attention (Horstmann & Ansorge, 2006; Yantis & Egeth, 1999). Other studies have shown that rare onset stimuli do capture attention (Folk & Remington, 2007; Neo & Chua, 2006). Thus, it seems that transient stimuli (motion, onsets, etc.) are able to capture attention despite their exclusion from an attentional set, provided that they are sufficiently rare. This conclusion is supported by fMRI studies showing that trial by trial fluctuations in distraction by an irrelevant colour during visual search correlated with fluctuations in BOLD activity in attentional control regions (Leber, 2010); however, distraction by irrelevant motion was correlated with activity in motion processing regions of visual cortex and not with activity in attentional control regions (Lechak & Leber, 2012). It remains to be seen whether capture by infrequent stimuli will be observed for any sufficiently salient stimulus or whether this effect is specifically limited to transients.

Interestingly, no change in the magnitude of attentional capture was observed for motion cues between the first and second half of any of the current experiments. The absence of a change in capture by motion cues across these experiments supports our conclusion that motion cues captured attention despite their exclusion from the attentional set. That is, the small capture effect produced by motion cues in Experiments 1 and 3 was not the result of participants transitioning to or from singleton detection mode over time; rather, it was consistently produced throughout the experiments. We cannot rule out that participants were switching to singleton detection mode on occasional single trials; however, past research suggests this is unlikely (Lechak & Leber, 2012).

An alternative explanation of the frequency effect observed in attentional capture by motion cues is provided by Zehetleitner, Goschy, and Müller (2012). These authors argue that rather than distinct search modes determining what features will capture attention, resistance to distraction by irrelevant items is dependent upon experience with those specific distractors and occurs through down-weighting of the distracting feature or dimension, with dimensions taking precedence. This could potentially explain why there was less attentional capture by motion cues with their increased frequency in Experiment 2, as participants accrued more experience with these cues more quickly when they were frequent than when they were infrequent. However, if experience with irrelevant features leads to less attentional capture by those features, then these items would be expected to produce less attentional capture over time, as observed by Zehetleitner, Goschy, and Müller (2012). This was not what we observed. In fact, we observed no change in the magnitude of capture by motion cues between the first and second halves of any of the current experiments, suggesting that in those experiments where motion produced attentional capture (Experiments 1 & 3), it did so consistently.

Alternatively, it could be argued that it is recent experience with a distractor that leads to its suppression and that this suppression wanes after a short period of time unless refreshed by further distractor presentations. This could produce the results of Experiment 1, where some motion cues capture attention and others do not, possibly due to their appearing soon after another motion cue while motion suppression is still high. It also would produce the results of Experiment 2, as motion cues were so frequent that motion suppression would remain high throughout the experiment. However, a reanalysis of motion cue trials from Experiments 1 and 3 demonstrates this was not the case. After classifying each motion cue trial by whether there had been 1, 2, 3, 4, or 5-or-more trials since the previous motion cue was presented, a within-participants ANOVA on the combined motion cue data from Experiments 1 and 3, with the factors of recency (5 levels), and cue validity (2 levels), found no main effect of recency, F(4,116) = 1.56, p = 0.191, ηp 2 = 0.05, and no interaction, F(4,116) = 0.212, p = 0.932, ηp 2 = 0.01, indicating no effect of recent motion cue history on attentional capture by subsequent motion cues. Furthermore, on trials when a motion cue was immediately preceded by another motion cue cueing effects were numerically larger than those produced by the experiment overall (Experiment 1: 22 ms compared with 20 ms overall; Experiment 3: 41 ms compared with 28 ms overall). This is inconsistent with the idea that the presence of an irrelevant feature leads to suppression of that feature dimension. Thus, while our interexperiment motion frequency results seem consistent with the suggestion of dynamic and experience dependent distractor suppression (Zehetleitner, Goschy, & Müller, 2012), when examined within experiments our results suggest a continuous and stable level of motion distractor suppression that varies with distractor frequency but does not increase after experience with a distractor.

Another interesting result of the current study is the change in capture by blue cues across the first and second halves of Experiment 3 (Fig. 4b). It is possible that this result is due to the attentional system initially interpreting the frequent presence of a target-unrelated colour as reason to adopt two specific feature sets (Irons, Folk, & Remington, 2012). Over time, the cost of maintaining two attentional templates combined with the fact that the blue cue never competed with the target may have led the system to “relax” its control and opt for a more energy efficient colour singleton set. However, this is speculation at this stage. This result may suggest that manipulation of the frequency of relevant versus irrelevant cues in search for multiple targets could be a valuable tool for examining the factors that influence attentional template selection in future research.

It also is important to consider why there was no suggestion of variation in attentional capture over time in the experiments of Irons, Folk, & Remington (2012). The answer here is straightforward. Although Irons, Folk, and Remington (2012; Experiments 2-5) only had irrelevantly coloured cues on 33 % of trials, they always had an irrelevantly coloured distractor in the target display. That is, on 100 % of trials there was a coloured nontarget item in the target display that could be erroneously selected by a colour singleton set. In contrast to our experiments where either a colour singleton set or a set for the specific target features would be appropriate for locating the targets, the presence of coloured distractors in the target display of Irons, Folk, & Remington (2012) meant that only a set for the specific target features would guarantee target selection in their task. This likely precluded any possibility of a change in attentional strategy throughout their experiments, as a change in strategy would have led to a decrement in performance.

Conclusions

We have provided evidence that attentional capture is able to operate not just at the level of particular stimulus features, but also at the level of target feature dimensions. The current findings speak to a view of attentional control settings as flexible representations of target properties that can exist at multiple levels of abstraction. Furthermore, we have provided evidence that motion stimuli are able to capture attention despite their exclusion from the current attentional set, provided these stimuli are sufficiently rare. Thus, for some stimuli, attentional capture seems not to be determined solely by the current attentional control settings, but also by an interaction between bottom-up input and expectations based on recent history.