The conditions under which attention is captured by external stimuli have been the subject of much debate for the past three decades. While a number of theories have regarded attentional capture as being a purely stimulus-driven process (Jonides & Yantis, 1988; Theeuwes, 1991, 1992; Yantis & Jonides, 1984, 1990), others have argued in favor of top-down control over capture. One of the most prominent of the top-down models is the contingent attentional-capture theory of Folk, Remington, and Johnston (1992). According to this theory, attentional capture is contingent upon the perceiver’s goals, and objects will only capture attention if they share the task-relevant properties of sought-after target items. The authors hypothesized that this process is facilitated by attentional control settings (ACSs) tuned to the relevant target property. For example, if you are searching for a red pen on a cluttered desk, your ACS may be tuned toward “red.” In this case, your attention will be captured by red items on your desk, but not by green or blue items.

Considerable evidence has been found in support of the ability of task goals to modulate capture by salient stimuli, as proposed in contingent capture (e.g., Ansorge, Horstmann, & Carbone, 2005; Atchley, Kramer, & Hillstrom, 2000; Burnham, 2007; Chen & Mordkoff, 2007; Folk et al., 1992; Folk, Remington, & Wright, 1994; Horstmann & Ansorge, 2006; Lien, Ruthruff, & Cornett, 2010; for alternative accounts, see Awh, Belopolsky, & Theeuwes, 2012; Belopolsky, Schreij, & Theeuwes, 2010). However, most of these studies have used tasks with fairly simple goals, such as searching for a single target feature or property, and there has been little investigation of the potential complexity and flexibility of ACSs. In one study, Adamo, Pun, Pratt, and Ferber (2008) explored the boundaries of ACSs by asking whether it is possible to maintain multiple different ACSs for separate regions of space. They used a cueing task in which the target was one of two colors (green or blue) and appeared inside one of two placeholder boxes (positioned to the left or right of fixation). Participants made a go/no-go judgment based on the color and location conjunction of the target. For example, some participants were asked to respond if a green target appeared on the left side or a blue target appeared on the right side, but to withhold a response if blue appeared on the left or green on the right. Thus, in order to attend only to go targets, participants could adopt one control setting for green in the left hemifield and another for blue in the right hemifield. Prior to the target, a nonpredictive blue or green cue was presented around one of the placeholder boxes. The authors reasoned that if participants were capable of using multiple ACSs, cues would only capture attention if they matched the target conjunctions. That is, green cues would only capture attention if they appeared on the left, and blue cues only if they appeared on the right.

The results were indeed consistent with this hypothesis: Cues matching the target conjunctions led to faster responses when the target appeared at the same location (e.g., green cue on left, green target on left) and slower responses when the target appeared at the opposite location (e.g., green cue on left, blue target on right), as compared with a no-cue control. For cues that did not match the target conjunctions (green cue on right or blue cue on left), the response times did not differ from control, suggesting that these cues did not capture attention. The same effect has also recently been demonstrated for two features of different properties, indicating that participants can simultaneously set for a shape (e.g., a triangle) in one location and a color (e.g., green) in another location (Adamo, Wozny, Pratt, & Ferber, 2010).

However, a follow-up study by Adamo, Pun, and Ferber (2010) measuring evoked response potentials suggested that their previous findings may not reflect the early control of attention postulated by contingent capture. In this study, participants completed the same go/no-go task while the researchers measured the N2pc component of the evoked response contralateral to the cue hemifield. The amplitude of the N2pc contralateral to the cue is widely thought to be proportional to the allocation of attention in the cued location. Previous studies had found that only cues matching the target properties generate a significant N2pc (Eimer & Kiss, 2008; Kiss, Jolicœur, Dell’Acqua, & Eimer, 2008; Leblanc, Prime, & Jolicœur, 2008; Lien, Ruthruff, Goodin, & Remington, 2008), evidence that ACSs modulate early attentional capture. Adamo, Pun, and Ferber, however, found a significant N2pc for both cues that matched the specific target conjunctions and those that did not, indicating that all cues captured attention equally. The match between the color of the cue and the color of the target appeared to only affect the P3 generated by the target. As the P3 is associated with late selection and encoding into working memory, this suggests that the effect of cue congruence on target performance may be postattentional. In support of this, Parrott, Levinthal, and Franconeri (2010) showed that the behavioral results of Adamo et al. (2008) can be obtained even when there is no spatial separation between the stimuli, indicating that the effects occur after spatial selection.

Although these findings suggest that multiple conjunction ACSs did not act on early attention, the possibility cannot be ruled out entirely. An alternative explanation is that the task used by Adamo et al. did not provide sufficient incentive to employ multiple conjunctive ACSs at an attentional level. As the target was the sole colored object in the target frame, participants could detect it by adopting a set for color singletons (Bacon & Egeth, 1994), rather than a setting for individual features. The go/no-go component of the task sought to discourage singleton detection mode, as using this strategy would result in attentional capture by both go and no-go stimuli, and there was no reason to attend to no-go targets. However, there was also no reason not to attend to no-go targets—as no response was made on these trials, there was no real cost to shifting attention. The experiment thus could have been treated as a discrimination rather than a detection task—certain combinations of color and location were associated with a buttonpress, and other combinations were associated with the absence of a buttonpress. If this strategy were used, it would be beneficial to attend to the target on every trial.

The use of singleton detection mode and a late decision process could explain the findings of Adamo et al. (2008; Adamo, Pun, & Ferber, 2010). Under singleton detection mode, all cues would capture attention regardless of their color, producing a benefit on valid trials and a cost on invalid trials, consistent with the ERP evidence of Adamo, Pun, and Ferber. Also consistent with this account, the color of the cue may influence processes occurring after the cue and target have been attended, such as encoding into working memory or response selection. For example, response selection may proceed more rapidly when the cue and target are compatible (the same color) than when they are incompatible (different colors). When the costs and benefits of these two separate processes are combined, the results would resemble the behavioral findings of Adamo et al. (2008). That is, responses would be very fast when both the location and color of the cue matched the target, and very slow when both the location and color were different (i.e., when the cue matched the other target conjunction). When either the location or the color was different (i.e., when the cue did not match either target conjunction, such as a blue cue on the left followed by a green target on the left), response times would be intermediate, and therefore it would appear that cues that did not match the target conjunctions had no effect on response times.
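The additive account described above can be made concrete with a small sketch. The millisecond values are hypothetical, and the assumption that the spatial and color effects are equal in size and strictly additive is an illustrative simplification, not an estimate from Adamo et al. (2008).

```python
# Illustrative additive model; every numeric value here is hypothetical.
BASELINE_RT = 500     # no-cue control response time in ms (assumed)
SPATIAL_EFFECT = 20   # benefit if cue and target share a location, cost otherwise (assumed)
COLOR_EFFECT = 20     # benefit if cue and target share a color, cost otherwise (assumed)

# Go targets for one group of participants, as (color, side) pairs.
GO_TARGETS = [("green", "left"), ("blue", "right")]
CUES = [(color, side) for color in ("green", "blue") for side in ("left", "right")]


def predicted_rt(cue, target):
    """Sum a location-based and a color-based effect for one cue-target pair."""
    cue_color, cue_side = cue
    target_color, target_side = target
    rt = BASELINE_RT
    rt += -SPATIAL_EFFECT if cue_side == target_side else SPATIAL_EFFECT
    rt += -COLOR_EFFECT if cue_color == target_color else COLOR_EFFECT
    return rt


for cue in CUES:
    kind = "matching" if cue in GO_TARGETS else "nonmatching"
    for target in GO_TARGETS:
        print(kind, cue, "->", target, predicted_rt(cue, target))
```

Under these assumptions, cues that match a target conjunction yield either the fastest or the slowest predicted response, whereas nonmatching cues always combine one benefit with one cost and land at the control value.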

In summary, it is unclear from previous studies whether participants cannot implement multiple ACSs at an early attentional level, or whether they simply did not, perhaps due to lack of incentive. A post hoc strategy—attending to all targets via singleton search mode and deciding after the fact whether or not to respond—may be just as efficient as multiple control settings, and require less effort to maintain. In the present experiments, we sought to make the post hoc strategy inefficient, and thus encourage the adoption of multiple ACSs.

Experiment 1

To examine the role of incentives in maintaining a set for color–location conjunctions, in Experiment 1 we had participants monitor two concurrent rapid serial visual presentation (RSVP) streams, one to the left and one to the right of fixation. Each stream consisted of a series of letters presented in rapid succession in a single spatial location. As in Adamo et al. (2008), the targets were defined by both color and location. Half of the participants were asked to identify a green letter appearing in the left stream or a red letter appearing in the right stream, and the other half had the reversed color–location mappings. To ensure that singleton detection mode could not be used, the RSVP streams were made up of differently colored “filler” letters (brown, purple, etc.). In addition, an RSVP methodology prevented participants from using onset as a cue to target presentation, as all frames, whether they contained a target or not, were signaled by an onset (Burnham, 2007).

The use of RSVP streams enabled us to explore the effects of multiple conjunctive sets on nonspatial attention allocation. In RSVP tasks, attending to irrelevant items in the stream can severely impair the identification of subsequent targets (e.g., Barnard, Scott, Taylor, May, & Knightley, 2004), and as such, it is beneficial to restrict attention allocation to target-relevant features only. Folk, Leber, and Egeth (2008) showed that the capture of nonspatial attention in an RSVP task is governed by ACSs. They presented a stream composed of letters in differing font colors and asked participants to report the letter in a specific color (e.g., red). Prior to the target, a colored box (the distractor) would sometimes appear around one of the letters. Distractors matching the target color impaired the identification of subsequent targets, consistent with nonspatial attention being captured by distractors that shared a feature with the target. No interference occurred if the distractor did not match the target color. This effect has also recently been replicated with search for two colors, by showing that distractors matching either of the two target colors produced interference, while nonmatching distractors did not (Moore & Weissman, 2010).

In the present study, we extended these methods to examine whether participants can maintain ACSs for two color–location conjunctions. On some trials, a green or a red distractor letter was presented in the incorrect stream at varying intervals from the target. For instance, if the participants were searching for green targets on the left and red targets on the right, the target-colored distractors were green letters on the right or red letters on the left. Participants were asked to ignore these distractors and to focus on selecting only those letters with the correct color–location conjunctions. We reasoned that if different ACSs can be maintained for separate regions of space, distractors would not capture attention. Namely, distractors would be treated no differently than fillers and would have no additional effect on target detection. On the other hand, if participants cannot confine their settings for green and red to different locations, target-colored distractors should capture nonspatial attention and interfere with the identification of subsequent targets, consistent with Folk et al. (2008) and Moore and Weissman (2010). Thus, in order to avoid frequently missing targets, participants should be motivated to adopt the appropriate conjunction ACS.

Method

Participants

A group of 19 people (11 female, eight male) with a mean age of 23.21 years (range: 17–51) participated in return for a small payment. All participants reported normal or corrected-to-normal vision and normal color vision.

Stimuli

All text characters were presented in Arial bold font on a CRT screen with a resolution of 1,280 × 1,024 pixels. The fixation display consisted of a white plus sign (size 18 font) in the center of the screen and two white asterisks (size 34 font) located 5° to the left and right of fixation, which served as placeholders for the RSVP streams. Each RSVP stream was made up of the 13 capital letters—A, B, D, E, G, H, M, R, S, T, U, X, and Z (size 34 font)—presented in random order. One target appeared on each trial in either the left or the right stream and could be any of the 13 available characters. For each participant there were two possible targets, determined by the combination of a color and a location. For half of the participants, the targets were green (RGB 0, 255, 0; CIE x = .321, y = .598) letters in the left stream and red (RGB 255, 0, 0; CIE x = .648, y = .331) letters in the right stream. These color and location combinations were reversed for the remaining participants. Some trials also contained a single distractor in one of the two streams. In the target-colored distractor condition, the distractors were letters colored either red or green that did not match the color/location conjunction of the targets. For example, if the task goal was to detect a green target on the left or a red target on the right, a target-colored distractor could be either a red letter on the left or a green letter on the right. To ensure that any effect of these distractors was not due to their relative salience in relation to the filler letters, the target-colored distractor condition was compared with a neutral-colored distractor condition, in which the distractor was colored blue (RGB 0, 0, 255; CIE x = .156, y = .066).
The remaining letters in the streams were fillers and were colored aqua (RGB 0, 255, 255; CIE x = .249, y = .367), purple (RGB 255, 0, 255; CIE x = .364, y = .178), brown (RGB 128, 85, 100; CIE x = .405, y = .322), orange (RGB 255, 170, 0; CIE x = .512, y = .442), dark purple (RGB 100, 0, 100; CIE x = .364, y = .178), or teal (RGB 0, 100, 100; CIE x = .249, y = .367). Colors were assigned to the fillers randomly, with the restrictions that within each stream, no more than three letters of each color could appear, and the same color could not appear twice in a row.
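The two restrictions on filler colors can be illustrated with a short sketch. The sequential sampling scheme, the function name, and the treatment of filler slots as one uninterrupted run are assumptions made for illustration; the text states only that colors were assigned randomly under these constraints.

```python
import random
from collections import Counter

FILLER_COLORS = ["aqua", "purple", "brown", "orange", "dark purple", "teal"]
MAX_PER_COLOR = 3    # no more than three letters of each color per stream
MAX_FILLERS = 13     # a stream holds 13 letters, so at most 13 fillers


def assign_filler_colors(n_fillers, rng=random):
    """Draw colors one slot at a time, excluding any color that is used up
    or that would repeat the immediately preceding color (hypothetical scheme)."""
    if not 0 <= n_fillers <= MAX_FILLERS:
        raise ValueError("n_fillers must be between 0 and 13")
    counts = Counter()
    sequence = []
    for _ in range(n_fillers):
        allowed = [
            color for color in FILLER_COLORS
            if counts[color] < MAX_PER_COLOR
            and (not sequence or color != sequence[-1])
        ]
        color = rng.choice(allowed)
        counts[color] += 1
        sequence.append(color)
    return sequence


print(assign_filler_colors(12, random.Random(1)))
```

With six colors and a cap of three per color, at least two colors remain under the cap at every step for streams of this length, so the sampler never runs out of options. Sequential sampling of this kind does not draw uniformly from all valid sequences.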

Procedure

The participants were seated approximately 57 cm from the testing computer with their chin in a chinrest. They were asked to detect and identify only letters that matched the task goals (e.g., search for a green letter on the left or a red letter on the right). They were also informed that red and green letters would occasionally appear on the wrong side, and that these could interfere with their ability to detect the target. To further illustrate the impact of the distractors, feedback was given at the end of each block displaying the participant’s accuracy with red or green distractors (target-colored distractor condition) and without (neutral-colored distractor and control conditions). Participants were asked to try to ignore the distractors as much as possible and to try to perform just as well in the target-colored distractor condition as in the other conditions by “tuning in” to red and green letters appearing only on the correct side.

Each trial began with the fixation display for 500 ms (see Fig. 1 for the trial sequence). The asterisks were then replaced by the first letter in each of the two RSVP streams. Each letter was present for 150 ms and was then replaced immediately by the next letter in the stream. This continued until all 13 letters were presented, resulting in a total duration of 1,950 ms. One target appeared in either the left or the right stream and could be at any position in the stream from the 6th to the 10th letter, inclusive. A single target-colored or neutral-colored distractor was presented on 86 % of the trials. Distractors were defined as either same-side or different-side with respect to the target. For target-colored distractors, the color of the distractor depended on the color of the target and the side on which the distractor appeared. If the target was a green letter, a same-side target-colored distractor would be a red letter in the same stream, and a different-side target-colored distractor would be a green letter in the opposite stream. Distractors could appear at one of three different lags in relation to the target: –1, 1, or 2. Lag –1 occurred when the distractor appeared immediately after the target. Lags 1 and 2 refer, respectively, to a distractor appearing one or two positions prior to the target.
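Because positive lags denote distractors that precede the target and lag –1 denotes a distractor that follows it, the mapping from lag to stream position is easy to invert by mistake. The sketch below spells it out; the function and constant names are hypothetical.

```python
LETTER_DURATION_MS = 150          # each letter is replaced immediately by the next
STREAM_LENGTH = 13                # letters per stream
TARGET_POSITIONS = range(6, 11)   # 6th to 10th letter, inclusive (1-indexed)
LAGS = (-1, 1, 2)


def distractor_position(target_position, lag):
    """Positive lags place the distractor before the target; lag -1 places it after."""
    return target_position - lag


def distractor_to_target_soa_ms(lag):
    """Onset asynchrony from distractor to target; negative if the target comes first."""
    return lag * LETTER_DURATION_MS


for lag in LAGS:
    positions = [distractor_position(t, lag) for t in TARGET_POSITIONS]
    print(lag, positions, distractor_to_target_soa_ms(lag))
```

Every target position and lag combination keeps the distractor inside the 13-letter stream, with the earliest distractor at position 4 and the latest at position 11.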

Fig. 1. Depiction of a trial sequence with a same-side distractor presented at lag 1 in Experiment 1. Distractor lags are described with respect to the position of the target. For participants searching for green on the left and red on the right, the outlined target letter would be red, and the black distractor letter would be green. The gray letters were heterogeneously colored filler letters.

Once all of the letters in the streams had been presented, the fixation cross disappeared and the phrase “Target letter?” was presented on the screen. Participants then identified the target by typing a letter on the keyboard. The response was visible on the screen, and participants were able to delete and change their response if they made a mistake, or to press enter to lodge their response and terminate the trial. The participants were under no time pressure to respond, and accuracy was emphasized.

Target color/location and position in the stream were randomized, to ensure that participants could not predict when or where a target would appear or, correspondingly, what color it would be. The distractor and lag conditions were also mixed within blocks. A total of 40 trials were presented for each of the 12 Distractor Type (target-colored or neutral-colored) × Distractor Location (same or different side) × Lag (–1, 1, or 2) conditions. An additional 80 control trials containing no distractor were also presented randomly throughout the task. This produced a total of 560 trials, presented in four blocks of 140 trials, with breaks in between and preceded by 20 practice trials.
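The trial counts follow directly from the factorial design, as the sketch below shows. The variable names and the tuple used to represent control trials are illustrative choices, not part of the original procedure.

```python
from itertools import product

DISTRACTOR_TYPES = ("target-colored", "neutral-colored")
DISTRACTOR_LOCATIONS = ("same side", "different side")
LAGS = (-1, 1, 2)
TRIALS_PER_CONDITION = 40
N_CONTROL_TRIALS = 80
N_BLOCKS = 4

# Full crossing of the three factors gives the 12 distractor conditions.
CONDITIONS = list(product(DISTRACTOR_TYPES, DISTRACTOR_LOCATIONS, LAGS))

distractor_trials = [cond for cond in CONDITIONS for _ in range(TRIALS_PER_CONDITION)]
control_trials = [("control", None, None)] * N_CONTROL_TRIALS  # no distractor present
trials = distractor_trials + control_trials

trials_per_block = len(trials) // N_BLOCKS
percent_with_distractor = 100 * len(distractor_trials) / len(trials)

print(len(CONDITIONS), len(trials), trials_per_block, round(percent_with_distractor))
```

The 480 distractor trials out of 560 account for the roughly 86 % distractor rate given in the procedure.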

Results and discussion

The data from one participant, who scored 8.75 % correct (3.42 SDs below the mean) in the control condition, were removed from further analyses. Overall, the remaining 18 participants correctly identified the target letter on 70.99 % of trials. To directly explore the effect of the distractors on performance, the accuracy on trials with a target- or neutral-colored distractor was compared with accuracy in the control condition. Accuracy was significantly worse than control in the target-colored distractor condition [t(17) = 10.55, p < .001], but no difference emerged between the neutral-colored distractor condition and control (p = .14). Each Distractor Type × Distractor Location × Lag cell mean was then independently compared with the control condition, using t tests with a Bonferroni adjustment (α = .004; see Fig. 2 for the cell means). For target-colored distractors appearing on the same side as the target, performance was significantly worse than control at all three lags: lag –1 [t(17) = 3.93, p = .001], lag 1 [t(17) = 6.18, p < .001], and lag 2 [t(17) = 7.10, p < .001]. For target-colored distractors appearing on a different side than the target, performance was impaired at lag 1 [t(17) = 8.71, p < .001] and lag 2 [t(17) = 4.67, p < .001], but did not reach significance at lag –1 [t(17) = 2.69, p = .02]. Neutral-colored distractors did not significantly affect performance at either location at lag –1 or 2, but the effect approached significance at lag 1 for both same-side [t(17) = 2.60, p = .02] and different-side [t(17) = 2.16, p = .05] distractors, all other ps > .05.
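The adjusted threshold of .004 is what results from dividing a familywise alpha across the 12 cell comparisons. The sketch below assumes a familywise alpha of .05, which the text does not state explicitly, and applies the threshold to the exact p values reported above.

```python
FAMILYWISE_ALPHA = 0.05   # assumed; the text reports only the adjusted value
N_COMPARISONS = 12        # one comparison with control per design cell

adjusted_alpha = FAMILYWISE_ALPHA / N_COMPARISONS

# Exact p values as reported in the text (values given as "p < .001" are omitted).
reported_p = {
    "target-colored, same side, lag -1": 0.001,
    "target-colored, different side, lag -1": 0.02,
    "neutral-colored, same side, lag 1": 0.02,
    "neutral-colored, different side, lag 1": 0.05,
}


def is_significant(p_value, alpha=adjusted_alpha):
    """A comparison counts as significant only if p falls below the adjusted alpha."""
    return p_value < alpha


print(round(adjusted_alpha, 3))
for label, p_value in reported_p.items():
    print(label, is_significant(p_value))
```

Under this threshold, a p value of .02 or .05 falls short of significance, which is why those cells are described as approaching rather than reaching it.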

Fig. 2. Mean target identification accuracy as a function of distractor type, distractor location, and lag in Experiment 1. Error bars indicate standard errors of the means.

The relative effect of the different distractors was then examined in a 2 (distractor type: target-colored or neutral-colored) × 2 (distractor location: same or different side) × 3 (lag: –1, 1, or 2) within-subjects analysis of variance (ANOVA). Performance was significantly poorer with target-colored than with neutral-colored distractors, F(1, 17) = 152.58, p < .001, ηp² = .90. Simple effects with a Bonferroni correction confirmed that this was the case at all levels of distractor location and lag (all ts > 3.80, ps < .008), except for different-side distractors at lag –1 [t(17) = 2.92, p = .01]. Furthermore, performance was worse when the distractor appeared on the same side as the target, F(1, 17) = 4.76, p = .04, ηp² = .22; however, the significant interaction between distractor location and distractor type, F(1, 17) = 7.05, p = .02, ηp² = .29, indicated that this effect of distractor location was only significant for target-colored distractors [t(17) = 2.57, p = .02]. We also found a significant main effect of lag, F(2, 34) = 34.00, p < .001, ηp² = .67, indicating that performance was better at lag –1 than at lag 1 or 2 [t(17) = 8.41, p < .02], although, again, a significant interaction with distractor type, F(2, 34) = 17.82, p < .001, ηp² = .51, showed that the effect of lag was greater for target-colored than for neutral-colored distractors [t(17) = 5.64, p < .001]. No other interactions were significant (ps > .05).
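The partial eta squared values reported above can be checked against their F ratios with the standard identity: partial eta squared equals F × df_effect divided by (F × df_effect + df_error). The sketch below applies it to the five effects in this ANOVA; the function name and dictionary labels are illustrative.

```python
def partial_eta_squared(f_ratio, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_ratio * df_effect) / (f_ratio * df_effect + df_error)


# (F, df_effect, df_error) as reported for Experiment 1.
reported_effects = {
    "distractor type": (152.58, 1, 17),
    "distractor location": (4.76, 1, 17),
    "location x type": (7.05, 1, 17),
    "lag": (34.00, 2, 34),
    "lag x type": (17.82, 2, 34),
}

for label, (f_ratio, df_effect, df_error) in reported_effects.items():
    print(label, round(partial_eta_squared(f_ratio, df_effect, df_error), 2))
```

Each recomputed value rounds to the effect size given in the text (.90, .22, .29, .67, and .51).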

Finally, we addressed the possibility that the ability to successfully apply multiple conjunctive sets improves with practice, by analyzing the results for the second half of trials only. Although overall performance was better in the second than in the first half of trials, F(1, 17) = 42.84, p < .001, ηp² = .72, the pattern of results was essentially the same throughout the experiment. In the second half of trials, accuracy with target-colored distractors was still significantly worse than in both the control and neutral-colored distractor conditions at lags 1 and 2 (all ts > 3.92, ps < .002), but no significant difference was apparent for lag –1 target-colored distractors (ps > .03). Performance in the neutral-colored distractor condition was significantly worse than control only for lag 1 same-side trials [t(17) = 4.17, p < .001].

In Experiment 1, target-colored distractors interfered with target identification to a significantly greater extent than did neutral-colored distractors, consistent with previous findings of contingent attentional capture by multiple target colors (Moore & Weissman, 2010). Importantly, target-colored distractors interfered with performance despite the fact that they did not match the correct color–location conjunction of the targets. These results suggest that participants were unable to maintain distinct ACSs for the two different streams. If they had been capable of doing this, one would expect the distractors to be treated no differently from neutral-colored distractors. If, for example, participants were set to search for green in the left hemifield, a red letter presented just before the green target would be no more detrimental to performance than a neutral-colored blue letter.

Performance was particularly impaired at lags 1 and 2 (150- or 300-ms stimulus onset asynchrony) of the same-side target-colored distractor condition. This suggests that target-colored distractors elicited an attentional blink, which is known to impede the identification of targets appearing between 100 and 500 ms after the initial distractor or target (Barnard et al., 2004; Chun & Potter, 1995; Folk et al., 2008; Raymond, Shapiro, & Arnell, 1992; Shapiro & Raymond, 1994). Although the exact mechanism behind the attentional blink is still under debate, a prominent view is that the blink occurs when nonspatial attention is allocated to the first distractor or target and is temporarily unavailable to process trailing targets (Chun & Potter, 1995; Dux & Marois, 2009). This suggests that in the present study, attention was erroneously allocated to letters that matched the target colors, regardless of their spatial location. Thus, it appears that although participants may have been setting for two different colored targets, they were unable to confine their ACSs to specific spatial locations.

In general, distractors in the different-side condition were less detrimental to target identification than were those in the same stream. This is consistent with evidence that although the attentional blink still occurs when the two items are spatially separated, the effect is reduced and performance recovers more quickly (Kristjánsson & Nakayama, 2002). It is also consistent with the view that independent processing resources for the left and right hemisphere allow the target and distractor to be processed to some degree in parallel (Scalf, Banich, Kramer, Narechania, & Simon, 2007). Furthermore, different-side target-colored distractors were the same color as the target. Moore and Weissman (2010) demonstrated that the detrimental effect of the distractor is partially alleviated when the upcoming target is the same color, as targets sharing the distractor color are prioritized over targets of the opposite color.

Interestingly, performance was impaired even when the target-colored distractor appeared directly after the target (lag –1). This result is unlikely to be due to an attentional blink. Instead, it may be that when the target and distractor appear in close temporal succession, both items are sometimes selected. This may cause some confusion as to which of the two letters is the target, or which color was assigned to which letter, leading to poorer identification accuracy. This provides further evidence that target-colored distractors are more likely to be selected than filler letters.

Note that even though target-colored distractors produced significantly greater interference than did neutral-colored distractors, there was a marginal effect of neutral-colored distractors at lag 1. Neutral-colored distractors were equated with target-colored distractors for presentation frequency (appearing at most once per trial) and were presented less frequently than the other colored fillers, potentially making the neutral-colored distractors more salient than the fillers. Curiously, however, interference from neutral-colored distractors was only present in the second half of trials and not in the first half (ps > .17). If salience were responsible for the interference, one might imagine the effect to be more pronounced in early trials. Similarly, this finding makes it unlikely that neutral-colored distractors captured attention by virtue of their novelty or by surprise (Horstmann, 2002). Instead, the effect may be accounted for by participants becoming increasingly familiar with the filler colors as the task progressed, and as a result, increasingly able to ignore or suppress them in favor of target-colored objects. As neutral-colored distractors were not presented as frequently as filler objects, they may not have been ignored quite as effectively, and therefore interfered with target detection to a greater extent than did fillers alone.

Experiment 2

In Experiment 1, participants were unable to set for conjunctions of color and location, despite strong top-down incentives to do so. However, maintaining different color sets at separate locations is a complex task—each color must be detected at a given location while simultaneously ignored at a different location. Participants only had a verbal description of the task goals from which to construct their ACSs, and perhaps this was too abstract to be effectively translated into attentional goals. Previous work has suggested that guidance may be more effective if participants are shown a picture of the target, rather than a written description of the target, before commencing search (Wolfe, Butcher, Lee, & Hyle, 2003). This logic suggests that multiple conjunctive sets may be more effective if participants actually see the targets in advance.

Providing a visual “preview” of the target properties may benefit performance for a number of reasons. First, a concrete example of the targets may help to instantiate and strengthen the top-down target representation (e.g., Wolfe et al., 2003). Second, previewing targets may help to focus or engage each color set at its correct location. Folk, Ester, and Troemel (2009) found that when an RSVP target was preceded by a target-colored distractor in the same stream, interference by the distractors appearing at other locations was reduced. The authors suggested that previewing the target allowed attention to become engaged at the target location, which prevented attention from being captured by distractors at other locations. In a similar manner, previewing targets in the present study might have helped to engage the two sets at their correct locations, thereby reducing capture by target-colored distractors appearing at the incorrect location. Third, previewing targets may act to improve target selection through bottom-up priming, as target processing improves when targets are repeated across successive trials (Kristjánsson, Wang, & Nakayama, 2002; Maljkovic & Nakayama, 1994; Yashar & Lamy, 2010). Belopolsky et al. (2010) found that top-down settings were less effective in modulating capture by a singleton distractor when the target and cue properties were varied from trial to trial (but see Lien, Ruthruff, & Johnston, 2010). They concluded that intertrial priming modulates attentional set, suggesting that prior history contributes to the effectiveness of top-down modulation (see also Awh et al., 2012).

In Experiment 2, we examined whether previewing target-matching stimuli would help participants maintain ACSs for color–location conjunctions. The design was similar to that of the target-colored distractor condition of Experiment 1—participants searched for targets defined by conjunctions of color and location and attempted to ignore target-colored distractors appearing at the incorrect location. In addition, on half of the trials, irrelevant nonletter characters that matched the target conjunctions (target-matching cues) were presented early in the target streams. If the presence of the cues serves to strengthen, engage, or prime the ACSs, then target-colored distractors appearing at the incorrect location should be more easily ignored, thereby reducing their interference.

Method

Participants

A group of 15 first-year psychology students (11 female, four male) with a mean age of 18.30 years (range: 17–21) participated in return for course credit.

Stimuli and procedure

The stimuli were identical to those used in Experiment 1, with the following exceptions. The distractors were always target-colored (green or red) letters appearing in the incorrect stream. An additional six nonletter characters (@, #, $, %, &, and ?) were presented in random order at the beginning of each stream, followed by 12 multicolored letters, resulting in a total stream length of 18 characters. In the target-matching cue condition, three of the first six characters in both streams (either the first, third, and fifth or the second, fourth, and sixth characters, varied randomly) appeared in the same colors and same streams as the two targets. For example, for participants searching for green–left and red–right targets, cue trials involved three green characters on the left together with three red characters on the right. We presented the cues three times to try to maximize their impact on performance, and to decrease the likelihood that they would be missed. The colors of the remaining filler characters, and of all of the filler characters in the no-cue condition, varied randomly between aqua, purple, brown, orange, teal, and dark purple.
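The cue layout in the first six stream positions can be sketched as follows. Using the same three slots in both streams on a given trial is an assumption (the text does not say whether the slots were chosen jointly or per stream), and the function name and the "filler" placeholder are hypothetical.

```python
import random

N_PREFIX = 6  # nonletter characters at the start of each stream
# 0-indexed slots: first/third/fifth or second/fourth/sixth character.
CUE_SLOT_SETS = ((0, 2, 4), (1, 3, 5))


def cue_prefix(left_target_color, right_target_color, cue_trial, rng=random):
    """Return color labels for the six prefix slots in the left and right streams.
    Non-cue slots are marked 'filler' and would receive a random filler color."""
    left = ["filler"] * N_PREFIX
    right = ["filler"] * N_PREFIX
    if cue_trial:
        slots = rng.choice(CUE_SLOT_SETS)  # assumed: one slot set shared by both streams
        for index in slots:
            left[index] = left_target_color
            right[index] = right_target_color
    return left, right


left, right = cue_prefix("green", "red", cue_trial=True, rng=random.Random(0))
print(left)
print(right)
```

On no-cue trials the function leaves all six slots as fillers, matching the description of the no-cue condition.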

Participants were given the same instructions and feedback as in Experiment 1. In addition, they were informed that nonletter characters would appear before the targets and that some of these would be colored red and green, but that the target would always be a letter of the alphabet and would appear in the later stage of the trial. An example trial sequence is presented in Fig. 3. Targets could appear anywhere in either stream between positions 12 and 15, inclusive. Distractors could appear on either the same side as or the opposite side from the target, and either directly after the target (lag –1) or one or two positions before the target (lags 1 and 2). The gap between the last cue and the target or distractor (between lags 5 and 11) was large enough that it was unlikely that an attentional blink caused by the final cue would interfere with the target or distractor. A total of 32 trials were presented for each of the 12 Cue Presence (target-matching cues or no cues) × Distractor Location (same or different side) × Lag (–1, 1, or 2) conditions, as well as an additional 64 control trials with target-matching cues and 64 control trials with no cues. All conditions were mixed within blocks.
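The factorial structure of the design can be enumerated as a quick check. The sketch below is illustrative only: the variable names and the helper `distractor_position` are hypothetical, and the total trial count is derived from the figures given above rather than reported in the text.

```python
from itertools import product

cue_presence = ["target-matching cues", "no cues"]
distractor_location = ["same side", "different side"]
lags = [-1, 1, 2]  # negative lag: distractor follows the target

TRIALS_PER_CELL = 32
CONTROL_TRIALS_PER_CUE_LEVEL = 64

cells = list(product(cue_presence, distractor_location, lags))
distractor_trials = len(cells) * TRIALS_PER_CELL
control_trials = CONTROL_TRIALS_PER_CUE_LEVEL * len(cue_presence)
total_trials = distractor_trials + control_trials


def distractor_position(target_position, lag):
    """Stream position of the distractor (hypothetical helper).

    Lag 1 or 2 places the distractor that many items before the target;
    lag -1 places it directly after the target.
    """
    return target_position - lag


print(len(cells))         # 12
print(distractor_trials)  # 384
print(control_trials)     # 128
print(total_trials)       # 512

positions = [distractor_position(t, lag) for t in range(12, 16) for lag in lags]
print(min(positions), max(positions))  # 10 16
```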

Fig. 3

Depiction of a trial sequence with a same-side distractor presented at lag 1 in Experiment 2. For participants searching for green on the left and red on the right, the outlined target letter and cue letters would be red, and the black distractor and cue letters would be green. The gray letters were heterogeneously colored filler letters

Results and discussion

The total accuracy on the task was 56.95 %. To gauge the overall effect of the presence of distractors both with and without target-matching cues, the data were first analyzed in a 2 (cue presence: target-matching cues or no cue) × 2 (distractor presence: distractors present or control) within-subjects ANOVA. Performance was significantly worse in the target-matching cue condition than in the no-cue condition, F(1, 14) = 21.32, p < .001, ηp² = .60. Performance on trials with distractors was significantly worse than performance on control trials, F(1, 14) = 54.23, p < .001, ηp² = .80. Most importantly, the effect of distractors did not vary across cue conditions, p = .49. In the target-matching cue condition, tests of simple effects with a Bonferroni correction revealed that performance was significantly poorer than control with same-side distractors at lag 2 and with different-side distractors at lag 1 (ts > 4.52, ps < .001), and the difference between all remaining target-matching cue conditions and controls approached significance (ts > 2.46, ps < .03). In the no-cue condition, performance was significantly worse than control for same-side distractors at all lags and for different-side distractors at lag 1 (ts > 3.65, ps < .003), and the difference between the remaining conditions and control approached significance (ts > 2.87, ps < .02).
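For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The following is a minimal sketch; the function name is an assumption, and because the reported F values are themselves rounded, recomputed values can differ from the reported ones in the second decimal place.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Main effect of cue presence reported above: F(1, 14) = 21.32.
print(round(partial_eta_squared(21.32, 1, 14), 2))  # 0.6
```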

In addition, we examined whether the effect of cue presence varied across the different levels of distractor location and lag, using a 2 (cue presence) × 2 (distractor location: same or different side) × 3 (lag: –1, 1, or 2) within-subjects ANOVA (see Fig. 4). Again, performance was significantly poorer in the target-matching cue condition than in the no-cue condition, F(1, 14) = 9.56, p = .008, ηp² = .41. We found a significant main effect of lag, F(2, 28) = 13.31, p < .001, ηp² = .49, with performance being generally highest at lag –1 and lowest at lag 1, but no main effect of distractor location (p = .18). Importantly, the effects of distractor location and lag did not vary across the different levels of cue presence (ps > .64), suggesting that the addition of the cues had no impact on the degree of distractor interference. The only significant interaction was between distractor location and lag, F(2, 28) = 6.67, p = .004, ηp² = .32. Further analysis indicated that performance was better when the distractor appeared on the opposite side from the target rather than on the same side at lag 2 [t(14) = 2.48, p = .03], but we found no difference at lags –1 and 1 (ps > .30).

Fig. 4

Mean target identification accuracy as a function of target-matching cue presence, distractor location, and lag in Experiment 2. Error bars indicate standard errors of the means

Once again, practice had little effect on performance. The pattern of data in the second half of the trials was similar to the overall data pattern. For target-matching cues, accuracy was significantly worse than control in same-side lag 2 and different-side lag 1 trials (ts > 4.38, ps < .004) and approached significance at same-side lag –1 [t(14) = 2.52, p = .02]. Accuracy in the no-cue condition was worse than control at same-side lags –1 and 2 and at different-side lag 1 (ts > 3.72, ps < .003), and it approached significance at same-side lag 1 and different-side lag –1 (ts > 2.19, ps < .05). Overall, performance was worse with cues than without, F(1, 14) = 6.40, p = .02, but cue presence did not interact with distractor location or lag (ps > .21).

In Experiment 2, we tried to promote the use of conjunction ACSs by previewing the conjunction targets early in the trial. Previewing target properties has been demonstrated to improve target selection in a number of paradigms (Belopolsky et al., 2010; Folk et al., 2009; Kristjánsson et al., 2002; Maljkovic & Nakayama, 1994; Wolfe et al., 2003; Yashar & Lamy, 2010). In the present study, however, adding target-matching cues to the beginning of the trial had no effect on the degree of distractor interference, suggesting that target preview did not improve selection when the targets were defined by conjunctions of color and location. Folk et al. (2009) showed that previewing a single target feature at a single location can help to engage the attentional set at the correct location, eliminating capture by distractors at incorrect locations. In contrast, the present results suggest that it may not be possible to engage two different feature sets independently at two different locations. Even with the bottom-up input provided by the cues, the spatial scope of the two color sets appeared to overlap.

In fact, the addition of the target-matching cues impaired target selection. It is not clear why this was the case. One possibility is that the presence of the cue led to early attentional engagement with the RSVP streams. Successful performance in RSVP tasks requires that attention remain disengaged until a target is detected; becoming engaged too early results in filler items being processed and reduces the resources available to process targets (Folk et al., 2009). An alternative possibility is that the targets in the cue condition were inadvertently being suppressed. Attention may have been initially attracted to the cues by their target-relevant colors. However, because no response was required to the cues, further processing of similar items might have been actively suppressed, and this suppression may have carried over to the matching targets. That is, the past trial history of inhibiting a response to the initial presentation of the target color could have carried over to affect target detection.

This second hypothesis is particularly interesting, because any suppression produced by the cue did not appear to affect processing of the distractors, despite the fact that the distractors shared the same colors as the cues. Distractor interference was the same both with and without the cues; thus, any effect of suppression appeared to be isolated to only those items that were the same color and appeared in the same location as the cue (i.e., the target), suggesting that the distractor suppression may have been color–location specific. This is consistent with previous evidence indicating that different policies for excluding distractors can operate in parallel at independent locations (Awh, Sgarlata, & Kliestik, 2005; Crump, Gong, & Milliken, 2006).

The possibility that distractor suppression can be specific to a color–location conjunction, even if target facilitation cannot, opens up a new avenue for exploring attentional selection on the basis of conjunctions. If participants can search for red and green targets, and at the same time actively suppress red and green items that do not match the target conjunctions (i.e., distractors), then attention may be more successfully directed toward only those items that match the target conjunctions. This possibility was explored further in Experiment 3.

Experiment 3

In Experiment 3, we explored whether bottom-up input can help to generate the suppression of distractors that do not match the target conjunctions. The methodology was similar to that of Experiment 2, except that we presented distractor-matching cues rather than target-matching cues at the beginning of some trials. The cues were presented in only one stream on each trial and always matched the upcoming distractor, in order to provide as much incentive to suppress distractors as possible. If the cues were to generate suppression of further similar items, the processing of the subsequent distractor should be suppressed and its effect on target identification reduced. Furthermore, if this suppression is limited to items matching the color–location conjunction of the cues, the presence of cues should not impair target identification. On the other hand, if the cues do not produce suppression, and their effect in the previous study was simply due to increased attentional engagement, then distractor-matching cues should have the same effect as target-matching cues. That is, the cues will capture attention and increase engagement on the cued stream, producing a general impairment of performance.

Method

Participants

A group of 19 participants (11 female, eight male) with a mean age of 21.11 years (range: 17–25) participated in return for a small monetary reimbursement.

Stimuli and procedure

The stimuli and procedure were similar to those of Experiment 2. The nonletter characters in the beginning of each stream were replaced with six letters randomly selected from the set of A, B, D, E, G, H, M, R, S, T, U, X, and Z. In the distractor-matching cue condition, three of the first six letters shared the same color–location conjunction as the distractor. The cues always predicted the presence, color, and location of the distractor (i.e., trials with red cues in the left stream always contained a red distractor in the left stream). Targets appeared at any position in the stream from 12 to 15. A total of 32 trials were presented for each of the 12 Cue Presence (distractor-matching cues or no cue) × Distractor Location (same or different side) × Lag (–1, 1, or 2) conditions, as well as an additional 64 control trials without distractors or cues.

Results and discussion

The data from one participant whose accuracy in the control condition was 37.50 % (2.71 SDs below the mean) were removed from analyses. The overall accuracy for the remaining 18 participants was 74.63 %. Performance was significantly worse in both the no-cue [t(17) = 10.10, p < .001] and distractor-matching cue [t(17) = 7.73, p < .001] conditions than in the control condition. In the distractor-matching cue condition, performance for same-side distractors was significantly poorer than control at both lags 1 and 2 (ts > 4.18, ps = .001), but only marginally so at lag –1 [t(17) = 2.36, p = .03; see Fig. 5]. Performance was significantly impaired at all three lags for different-side distractors (ts > 4.04, ps < .002). In the no-cue condition, accuracy with a same-side distractor was significantly lower than control at all lags (ts > 3.42, ps < .004). Accuracy with a different-side distractor was also lower than control at lags –1 and 1 (ts > 4.88, ps < .001), and marginally so at lag 2 using a Bonferroni correction [t(17) = 3.23, p = .005].
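The distinction between "significant" and "marginal" comparisons above depends on the Bonferroni-adjusted threshold. The sketch below shows the arithmetic only. The number of comparisons is not reported in this section, so the value of 12 (one test per Cue Presence × Distractor Location × Lag cell) is an assumption made purely for illustration, as are the function and constant names.

```python
def bonferroni_threshold(family_alpha, n_comparisons):
    """Per-comparison alpha under a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return family_alpha / n_comparisons


# Assumption: 12 comparisons; the source does not state the family size.
ASSUMED_COMPARISONS = 12
threshold = bonferroni_threshold(0.05, ASSUMED_COMPARISONS)

for p in (0.001, 0.005, 0.03):
    label = "significant" if p < threshold else "not significant after correction"
    print(f"p = {p}: {label}")
```

Under this assumed family size, p values of .005 and .03 fall short of the corrected threshold, which matches the way such values are described as marginal in the text.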

Fig. 5

Mean target identification accuracy as a function of distractor-matching cue presence, distractor location, and lag in Experiment 3. Error bars indicate standard errors of the means

A 2 (cue presence) × 2 (distractor location) × 3 (lag) within-subjects ANOVA was conducted to compare accuracies across distractor conditions. As predicted, accuracy was significantly higher in the distractor-matching cue condition than the no-cue condition, F(1, 17) = 27.82, p < .001, ηp² = .62. The main effect of distractor location did not reach significance (p = .06); however, it did interact significantly with cue presence, F(1, 17) = 38.45, p < .001, ηp² = .69. Further analysis revealed that although accuracy was higher with different-side than with same-side distractors in the no-cue condition [t(17) = 4.65, p < .001], no effect of distractor location was apparent in the distractor-matching cue condition (p = .65). A significant main effect of lag indicated that performance was better at lag –1 than at lags 1 and 2, F(2, 34) = 27.77, p < .001, ηp² = .62. Lag interacted significantly with the other variables (all ps < .004), and simple-effects tests were conducted to explore these effects. For distractors appearing on the same side as the target, performance was significantly better with a distractor-matching cue than with no cue at both lag 1 [t(17) = 6.64, p < .001] and lag 2 [t(17) = 6.24, p < .001], but not at lag –1 (p = .38). In the different-side condition, accuracy was only marginally higher in the distractor-matching cue than the no-cue condition at lag 1 [t(17) = 2.89, p = .01], and no difference appeared at the other two lags (ps > .51).

As in the previous experiments, practice appeared to have little effect on the pattern of results. In the second half of trials, performance on trials with distractor-matching cues was significantly less accurate than control in the same-side lag 1 and 2 conditions (ts > 4.16, ps < .001) and approached significance at all lags with different-side distractors (ts > 2.79, ps < .02). Accuracy on trials without cues differed from control at same-side lags 1 and 2 (ts > 4.68, ps < .001) and at different-side lag 2 [t(17) = 4.12, p < .001], and approached significance at different-side lag –1 [t(17) = 2.78, p = .01]. Accuracy was marginally better with distractor-matching cues than with no cues in the same-side condition at lags 1 and 2 (ts > 2.84, ps = .01).

In Experiment 3, distractors interfered significantly less with target detection when they were preceded by identically colored cues presented in the same stream. This was particularly evident in the same-side condition, where the attentional blink seen in the no-cue condition was greatly reduced in the distractor-matching cue condition. These findings cannot be explained by increased attentional engagement in response to the cues, as this would have produced an overall decrease in target identification performance (Folk et al., 2009). Instead, the findings are more consistent with the view that experience of withholding a response to the distractor-colored cues generated suppression, which then carried over to the distractors and reduced their effect on target identification.

Crucially, any effect of suppression was limited to those items with the same color and location as the cue. If participants had only inhibited the location of the cues, the identification of targets appearing at the same location would have been considerably poorer than of those appearing at the opposite location. For example, if red cues appeared on the left, and as a result the left location was inhibited, identification would be worse for a target presented on the left than on the right. Conversely, inhibiting only the distractor color would lead to poorer performance at the opposite location. That is, if all red objects were inhibited, performance for red targets on the right would be worse than for green targets on the left. However, performance was equally good in both the same-side and different-side conditions, suggesting that distractors were suppressed on the basis of their color–location conjunction, allowing for more effective selection of objects matching the correct target conjunctions.
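The three accounts contrasted in this paragraph make different predictions about which items a cue should suppress. The following sketch spells out that logic for the example of a red cue on the left, for a participant searching for green on the left and red on the right. It is a schematic of the argument, not a model fitted to the data; the `Item` type, the account labels, and the function name are hypothetical.

```python
from typing import NamedTuple


class Item(NamedTuple):
    color: str
    side: str


def is_suppressed(item, cue, account):
    """Predict whether `item` is suppressed after `cue` under a given account."""
    if account == "location":
        return item.side == cue.side
    if account == "color":
        return item.color == cue.color
    if account == "conjunction":
        return item.color == cue.color and item.side == cue.side
    raise ValueError(f"unknown account: {account}")


cue = Item("red", "left")                # distractor-matching cue
distractor = Item("red", "left")         # target-colored item in the wrong stream
same_side_target = Item("green", "left")
other_side_target = Item("red", "right")

for account in ("location", "color", "conjunction"):
    print(
        account,
        is_suppressed(distractor, cue, account),
        is_suppressed(same_side_target, cue, account),
        is_suppressed(other_side_target, cue, account),
    )
```

Only the conjunction account suppresses the distractor while sparing both targets, which is the pattern the equal same-side and different-side performance is taken to support.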

Experiment 4

The goal of Experiment 4 was to examine the degree to which the results of Experiment 3 reflected top-down inhibition. Our hypothesis that distractor-matching cues are attended, found to be irrelevant, and consequently suppressed implies the use of top-down or strategic mechanisms. However, one could argue that the results were actually due to passive, bottom-up processes. It is clear that bottom-up input is a necessary component of the findings, as distractor interference was only reduced on trials in which the distractor was preceded by the cue. It is possible that repeating the distractor features within each stream decreases the salience or potency of the distractor, thereby reducing its impact on target selection. This may occur without ever allocating attention to the cue, and without recruiting top-down inhibitory mechanisms.

In the previous experiment, we tried to maximize the opportunity to apply inhibition in a top-down manner by presenting cues in only one stream at a time and making them predictive of the upcoming distractor. Because the cues were predictive, inhibition toward distractor-matching cues would always be effective for distractors, and limited attentional resources would not need to be divided between two different distractor-matching cues. Thus, participants should have been highly motivated to use the cues to develop top-down inhibition for distractors. In the present experiment, we endeavored to limit the effectiveness of top-down control by making the distractor-matching cues nonpredictive. As in Experiment 2, distractor-matching cues appeared simultaneously in both streams (e.g., a red character on the left with a green character on the right), and provided no information about the presence, color, or location of distractors. If the Experiment 3 findings were, as we suggest, the result of top-down inhibition of distractor-matching items, the effect of cues on distractors should be greatly weakened when the cues are no longer predictive. On the other hand, varying the predictiveness of the cue should have little effect on an obligatory bottom-up process. Thus, if repetition of the distractor features impairs distractor processing in a bottom-up manner, the distractor-matching cues should continue to reduce distractor interference in a similar manner as in Experiment 3.

Method

Participants

A group of 16 participants (ten female, six male) with a mean age of 22.81 years (range: 19–32) took part in return for a monetary reimbursement.

Stimuli and procedure

The stimuli were identical to those of Experiment 2, except that the cues were distractor-matching rather than target-matching. Nonletter characters were presented at the beginning of each stream. On distractor-matching trials, three of the characters in both streams appeared in the same color and same location as the distractors (e.g., if participants were searching for green on the left and red on the right, the cues were composed of red characters on the left presented simultaneously with green characters on the right). A total of 32 trials per Cue Presence × Distractor Location × Lag condition were presented, plus 64 control trials with distractor-matching cues and 64 control trials without cues.

Results and discussion

The total accuracy was 82.33 %. The data were first analyzed in a 2 (cue presence: distractor-matching cues or no cue) × 2 (distractor presence: distractors present or control) within-subjects ANOVA. As in previous experiments, distractors significantly impaired performance, F(1, 15) = 40.99, p < .001, ηp² = .73. However, the presence of distractor-matching cues had no effect on accuracy (p = .21), and cue presence did not interact with distractor presence (p = .76). Tests of simple effects found that, both with and without cues, performance was significantly worse than control for same-side distractor trials at lags 1 and 2 (ts > 5.06, ps < .001). Accuracy on different-side distractor trials at lag 1 was significantly poorer than control with distractor-matching cues [t(15) = 4.71, p < .001], and the difference approached significance in the no-cue condition [t(15) = 3.23, p = .006] (see Fig. 6).

Fig. 6

Mean target identification accuracy as a function of distractor-matching cue presence, distractor location, and lag in Experiment 4. Error bars indicate standard errors of the means

To compare distractor interference across cue conditions, the data were analyzed in a 2 (cue presence) × 2 (distractor location: same or different side) × 3 (lag: –1, 1, or 2) within-subjects ANOVA. The pattern of distractor location and lag was similar to that in previous experiments: The main effect of lag was significant, F(2, 30) = 19.83, p < .001, ηp² = .57, with performance being highest at lag –1 and lowest at lag 1. Performance was also poorer with same-side than with different-side distractors, F(1, 15) = 21.68, p < .001, ηp² = .59, and distractor location interacted with lag, F(2, 30) = 6.23, p = .005, ηp² = .29. Of most importance to the present experiment, the main effect of cue condition was not significant (p = .19), nor did it interact with distractor location or lag (ps > .11). Tests of simple effects showed no difference between cue conditions at any level of distractor location and lag (all ps > .08).

The results of Experiment 4 show that presenting two simultaneous, nonpredictive distractor-matching cues has essentially no effect on performance. Therefore, the reduction in distractor interference in Experiment 3 occurred either because the cues were predictive, or because they appeared in only one stream at a time. For the predictiveness of a cue to have an effect, the cues must have been processed to a level at which information about the cue–distractor relationship could be extracted. This top-down information could then be used as a basis for initiating inhibition. Setting aside the role of predictiveness, the finding of a reduction in distractor interference for one cue but not for two simultaneous cues suggests the involvement of a capacity-limited mechanism rather than low-level perceptual mechanisms. If the underlying process were not capacity-limited and did not require attention, it should have been as strong with two cues as with one. The findings are consistent with the hypothesis that cues must be attended and activate postattentional mechanisms before they can influence the processing of distractors. Either way, the results suggest that the selection of targets based on color–location conjunctions can be made more efficient with the aid of limited-capacity top-down processes, rather than passive bottom-up or perceptual mechanisms.

It is surprising that distractor-matching cues had no effect whatsoever on performance, given that target-matching cues in the same design did have an effect on target identification (Exp. 2). Although the effect was fairly weak, it does suggest that target-matching cues were occasionally attended and then inhibited. Note that in Experiment 2, the relationship between the cues and targets was direct: Inhibition of cues directly impaired target identification. In contrast, the relationship between the cues and targets in Experiment 3 was more indirect: Inhibition of distractor-matching cues impaired the processing of distractors, which then influenced target identification. It is possible that when cues are nonpredictive, this indirect effect is too weak to have much influence on overall accuracy.

General discussion

In the present study, we explored whether attention could be set for conjunctions of color and location if there were increased incentives to adopt complex attentional control settings. The findings were twofold. Experiment 1 showed that even when participants had to adopt feature search mode and had significant penalties for attending to distractors, an attentional blink was elicited by target-colored distractors that did not match the color–location target conjunctions. Similarly, in Experiment 2, when targets were previewed at their correct location, participants were still unable to avoid capture by target-colored distractors appearing on the wrong side. In both of these experiments, the conjunctive task goals were not applied until after the stage at which the attentional blink had its effect, considered by many theories to be the point at which items are selected for response selection and encoding into working memory (Chun & Potter, 1995; Dux & Marois, 2009). Thus, the influence of multiple conjunction task goals in Experiments 1 and 2 appeared to be postattentional, consistent with Adamo, Pun, and Ferber (2010) and Parrott et al. (2010).

However, Experiment 3 suggests that participants may be better able to select targets defined by color–location conjunctions if the distractors are viewed in advance. Previewing a distractor early in the trial significantly reduced its impact on performance, an effect that was specific to the distractor’s color and location. Experiment 4 demonstrated that the effect relied on limited-capacity top-down mechanisms, rather than bottom-up or low-level perceptual processes. On the basis of these findings, we suggest that attending to the distractor-matching cues allowed participants to engage top-down inhibitory processes for distractor-matching objects. This conclusion is consistent with visual search studies showing that distractor preview enables the construction of an inhibitory set for distractors (Braithwaite, Humphreys, & Hulleman, 2005; Olivers & Humphreys, 2003), as well as evidence that attentional selectivity is enhanced by prior experience (Awh et al., 2012; Belopolsky et al., 2010). Importantly, this inhibition specifically targeted objects sharing both the color and location of the distractors. As a result, distractors were less likely to capture attention, while target detection remained unharmed. This is preliminary evidence that under some circumstances, early attentional selection can occur on the basis of color–location conjunctions.

The effect of preview in the present study accords well with studies showing that different strategies for excluding distractors can operate at different spatial locations (Awh et al., 2005; Crump et al., 2006). For example, Awh et al. (2005) found that distractors interfered with performance less when a target appeared at a location frequently associated with the presence of distractors, as compared with a location rarely associated with the presence of distractors. The authors suggested that different expectations about the likelihood of distractors led to different degrees of “distractor exclusion” at the two locations. Importantly, they achieved the same results even when targets appeared in both locations simultaneously, suggesting that the two distractor-exclusion settings were maintained in parallel. The present study extends these findings to show that distractors may be differentially excluded from specific locations on the basis of specific features.

These results also suggest that inhibition may be an important or necessary process in the implementation of complex ACSs. Previous studies exploring the use of ACSs in complex search tasks have suggested an association between complex ACSs and inhibitory processes (e.g., Eimer & Kiss, 2008; Eimer, Kiss, Press, & Sauter, 2009; Lamy & Egeth, 2003; Lamy, Leber, & Egeth, 2004). However, while the inhibition in these studies most likely operated independently of attention allocation (Anderson & Folk, 2012), the present experiment suggests that distractors defined by a conjunction of two features need to be selectively attended before they can be effectively inhibited. Once the first distractor is attended and subsequently suppressed, later distractors are less likely to be selected, while targets continue to be processed. Note that the present study focused on the effect of conjunctive attentional sets on nonspatial attentional allocation, which may rely on mechanisms different from those of spatial attention. Future research will be required to explore whether these findings extend to spatial allocation.

The effect of distractor preview bears some similarity to previous RSVP findings. For example, in the distractor repetition effect, presenting the same distractor before and after a target in an RSVP task yields better performance than when different distractors are used (e.g., Dux, Coltheart, & Harris, 2006), suggesting that repeated distractors may be more easily rejected. However, the distractor repetition effect relies on the distractors also being identical in terms of shape, while the distractor-matching cues and the distractor in our study were usually different letters. Repetition blindness, on the other hand, occurs when the second of two repeated items in an RSVP stream is missed, even if those items are not identical (e.g., a word presented in upper- and lowercase; Kanwisher, 1987). However, repetition blindness requires that the repeated items be presented within a short time period of each other, and does not usually occur if the interstimulus interval exceeds approximately 350 ms or three lags (Chun, 1997; Park & Kanwisher, 1994). In our study, at least 600 ms (or four lags) always separated the final cue and the distractor. Therefore, we think it is unlikely that either repetition blindness or the distractor repetition effect alone could be responsible for the effect of preview in the present study.

In summary, the findings of the present study suggest that attention cannot be tuned toward conjunctions of color and location using facilitative ACSs alone. Nevertheless, attention may be capable of selecting items on the basis of color–location conjunctions, given conditions that support the concurrent inhibition of irrelevant distractors. These findings highlight the interaction between top-down goals and bottom-up input in guiding attention through complex visual search tasks.