Rapid-serial-visual-presentation (RSVP) tasks have been used extensively to study the temporal limits of selective attention. In one variant, items are presented sequentially at a known location, and observers must identify two target items among nontarget distractors. Most observers show near-perfect first-target (T1) accuracy, whereas second-target (T2) accuracy is degraded when the inter-target interval (lag) is short. The degradation tends to peak 200–300 ms after T1 and gradually decreases until lags of 500–600 ms, at which point performance asymptotes. This robust phenomenon, known as the attentional blink (AB), has been studied extensively over the past 25 years (Dux & Marois, 2009).

Theoretical accounts of the AB have implicated a variety of neural (e.g., Hommel et al., 2006; Shapiro, Hillstrom, & Husain, 2002; Williams, Visser, Cunnington, & Mattingley, 2008) and cognitive mechanisms. For example, classic bottleneck theories (e.g., Chun & Potter, 1995; Jolicœur & Dell’Acqua, 1998) have focused on resource limitations in late-stage visual processing, whereas interference accounts (e.g., Raymond, Shapiro, & Arnell, 1992, 1995) posit that the AB is caused by inter-item competition in the visual short-term memory store. More recent theoretical explanations have shifted away from a focus on resource limitations, to emphasize the deleterious role of involuntary task switches caused by distractors (e.g., Di Lollo, Kawahara, Ghorashi, & Enns, 2005), the inhibition of target processing triggered by the appearance of inter-target distractors (Olivers & Meeter, 2008), the role of cognitive-control rules in perceptual processing (Taatgen, Juvina, Schipper, Borst, & Martens, 2009), and the suppression of transient attention to perceptual inputs (Bowman & Wyble, 2007; Wyble, Bowman, & Nieuwenstein, 2009). Critically, however, although each of these theories posits a different principal mechanism, all share the viewpoint that the AB arises from fundamental limitations in sequential object processing (Dell’Acqua, Dux, Wyble, & Jolicœur, 2012).

In contrast with this body of theoretical work, and with previous failures to eliminate the AB with training (Braun, 1998; Maki & Padmanabhan, 1994; Taatgen et al., 2009), Choi, Chang, Shibata, Sasaki, and Watanabe (2012) recently reported that the AB could be completely eliminated after 1 h of practice on an RSVP task in which T2 was made salient with color. In addition, in a functional-imaging experiment, they found that training decreased the hemodynamic response in frontal executive areas, particularly in the dorsolateral prefrontal cortex. These authors argued that training directly enhanced attentional control, thereby eliminating object-processing limitations.

Although it is possible that training could reduce processing limitations directly, as was suggested by Choi et al. (2012), closer inspection of their paradigm suggests that limitations may have instead been bypassed indirectly, through the establishment of temporal expectations about when targets would appear. A notable feature of Choi et al.’s work is their use of consistent temporal intervals in training and assessment (see the left panel of Fig. 1), with T1 always being presented as the second item and T2 presented after a short (lag 2) or a long (lag 6) interval. This consistency was highlighted during training by repeated presentations of a uniquely colored T2 at short lags.

Fig. 1

Schematic representation of the rapid-serial-visual-presentation (RSVP) tasks used in the three conditions for training and assessment. In each panel, the RSVP stream used during assessment is shown on the left, and that used during training is shown on the right. In the assessment tasks, the second target (T2) was presented after one or five intervening distractors (lag 2 or 6) following the first target (T1). The training task used the same RSVP task as the lag 2 condition, but with T2 made salient with color (indicated here by the star). The constant condition replicates Choi et al. (2012). In all conditions, the assessment task was completed both pre- and posttraining.

Extensive research suggests that temporal expectations modulate selective attention. For example, cueing an interval between targets improves performance (Correa, Lupiáñez, Milliken, & Tudela, 2004; Coull & Nobre, 1998), enhancing the brain activity associated with low-level visual processing (Correa, Lupiáñez, Madrid, & Tudela, 2006) and stimulus discriminability (Correa, Lupiáñez, & Tudela, 2005). More relevantly, the AB in both auditory and visual modalities is reduced when target onset is cued (Badcock, Badcock, Fletcher, & Hogben, 2013; Martens & Johnson, 2005; Shen & Alain, 2012). Studies have suggested that temporal cueing reduces the AB by increasing the discriminability of cued stimuli (Rolke & Hofmann, 2007). There is also evidence that temporal expectations speed the latencies of the electrophysiological components associated with the AB (Shen & Alain, 2011; Vangkilde, Coull, & Bundesen, 2012), implying that this allows targets faster access to processing resources.

The conjecture that Choi et al.’s (2012) training paradigm indirectly reduced the AB by creating temporal expectations is consistent with this large body of literature, and it can also account for a curious reduction in T2 performance seen at long lags after training (see Supplementary Fig. 1 from Choi et al., 2012). This decrement is inconsistent with a reduction in processing limitations, which presumably would only serve to improve performance. However, it is easily explained if the violation of established temporal expectations on long-lag trials reduced target accuracy.

To test our account, we compared performance across three groups of trainees. One group received a training-and-assessment regimen identical to the procedure of Choi et al. (2012). A second group received the same training, but an assessment task in which a random number of distractors was presented prior to the first target, in order to disrupt the predictability of target onset. A third group received the same assessment task as Choi et al.’s participants, but a training task that included additional trials at varying lags, in order to reduce the saliency of short-lag trials. To preview the results, reducing the predictability of target onset in either the training or assessment tasks significantly reduced the training benefits to AB performance. This supports our hypothesis that Choi et al.’s training indirectly ameliorated the AB by creating temporal expectations, rather than directly circumventing processing limitations.

Method

Participants

A group of 48 adult volunteers (22 male, 26 female; 17–53 years, median = 18 years) with normal or corrected-to-normal visual acuity participated in the study. All were first-year psychology students who received course credit and provided informed consent prior to testing. The procedure was approved by the Human Research Ethics Committee at The University of Western Australia and was in accordance with the Declaration of Helsinki.

Apparatus, stimuli, and procedure

Stimuli were presented in a dimly lit room on CRT monitors (refresh rate = 100 Hz) driven by Pentium PCs running Presentation software (Version 12.4, Neurobehavioral Systems). The experimental design followed that of Choi et al. (2012), with identical assessment tasks presented before and after a training task. Except where noted, trials on the assessment tasks consisted of ten sequentially presented white items (approximately 1º; C.I.E. x = .29 [±95% C.I. = .01], y = .30 [±95% C.I. = .02]): two target digits and eight distractor letters presented on a black background. The targets were the digits 2–9, and the distractors were all uppercase letters except B, I, O, and Q. Items were selected randomly without replacement on each trial. Trials on the training task were identical, except that T2 was presented in red (x = .62 [±95% C.I. = .01], y = .34 [±95% C.I. = .01]). Each trial began with a white fixation cross presented at the center of the display. Participants initiated the RSVP stream by pressing the spacebar; each item in the stream was presented for 100 ms and followed immediately by the next. Following the final item, participants were prompted to report the targets by pressing the corresponding keys on the keyboard.

Participants completed one of three conditions, depicted in Fig. 1. The constant condition directly replicated the procedures of Choi et al. (2012). During the assessment task, T1 was always presented as the second item in the RSVP stream, whereas T2 was randomly presented at lag 2 (200 ms after T1), at lag 6 (600 ms after T1), or omitted (30 trials each). The training task consisted of 450 trials that were identical to the lag 2 trials in the assessment task, except that T2 was presented in red. The variable-assessment condition was identical to the constant condition, except that one to four items (chosen randomly) were presented prior to T1 during the assessment trials. The variable-training condition was identical to the constant condition, except that additional training trials were included, such that 450 trials were presented at lag 2, 234 trials at lag 4 (400 ms after T1), and 234 trials at lag 6 (600 ms after T1).
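The trial structure described above can be sketched in code. The following is an illustrative reconstruction only (function and variable names are our assumptions, not the authors' implementation); it uses the timing and lag values given in the text (100-ms items, T1 as the second item, T2 at lag 2 or 6):

```python
import random

ITEM_MS = 100  # each RSVP item is presented for 100 ms

DIGITS = [str(d) for d in range(2, 10)]   # targets: digits 2-9
LETTERS = list("ACDEFGHJKLMNPRSTUVWXYZ")  # distractors: all letters except B, I, O, Q

def make_stream(lag, t1_pos=2, length=10):
    """Build one assessment trial: T1 at position t1_pos (1-indexed),
    T2 at position t1_pos + lag; lag=None omits T2."""
    t1, t2 = random.sample(DIGITS, 2)          # two distinct target digits
    stream = random.sample(LETTERS, length)    # distractors without replacement
    stream[t1_pos - 1] = t1
    if lag is not None:
        stream[t1_pos + lag - 1] = t2
    return stream, t1, t2

# Constant condition: T1 always second, so lag 2 places T2 200 ms after T1.
# The variable-assessment condition would instead draw t1_pos randomly from 2-5.
stream, t1, t2 = make_stream(lag=2)
print((stream.index(t2) - stream.index(t1)) * ITEM_MS)  # 200 (ms between targets)
```

On this sketch, the training task is simply the lag 2 stream with T2 rendered in red, and the variable-training condition adds lag 4 and lag 6 streams to the training set.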

Results

Six participants (four from the constant condition, two from the variable-training condition) were excluded from the analysis because their target accuracies were more than three standard deviations below the group mean at one or more lags in the assessment or training tasks. This left 14 participants in each group. Including these outliers did not change the reported pattern of results.
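The exclusion rule above (accuracy more than three standard deviations below the group mean) can be sketched as follows. This is illustrative only, with hypothetical accuracy values; it is not the authors' analysis code:

```python
import statistics

def outliers(accuracies, cutoff_sd=3):
    """Return indices of participants whose accuracy falls more than
    cutoff_sd sample standard deviations below the group mean."""
    mean = statistics.mean(accuracies)
    sd = statistics.stdev(accuracies)
    return [i for i, a in enumerate(accuracies) if a < mean - cutoff_sd * sd]

# Hypothetical accuracies (%): one participant far below the rest
accs = [90] * 20 + [40]
print(outliers(accs))  # [20]
```

Note that the criterion is one-sided: only participants performing below the group are flagged, so unusually accurate participants are retained.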

A one-way between-subjects analysis of variance (ANOVA) showed that T1 accuracy during training did not differ between the constant (M = 88.32, SD = 8.84), variable-assessment (M = 86.63, SD = 15.64), and variable-training (M = 83.54, SD = 19.61) conditions, F(2, 39) = 0.34, p > .05, ηp² = .02. T2|T1 accuracy during training also did not differ between the constant (M = 90.68, SD = 14.95), variable-assessment (M = 94.01, SD = 4.54), and variable-training (M = 94.52, SD = 5.22) conditions, F(2, 39) = 0.67, p = .52, ηp² = .03.

Mean T1 accuracy (see Table 1) was examined in a 3 (Condition: constant, variable-assessment, variable-training) × 2 (Training: pre vs. post) × 2 (Lag: 2 vs. 6) mixed-design ANOVA. No main effects or interactions were significant (all ps > .25, ηp²s < .06), apart from Lag, F(1, 39) = 16.16, p < .001, ηp² = .29, indicating that T1 accuracy increased with lag. T2|T1 accuracy, shown in Fig. 2, suggests that a robust AB occurred in all conditions prior to training. Following training, the AB was eliminated in the constant condition but remained present in the other conditions. Indeed, a 3 (Condition) × 2 (Training) × 2 (Lag) mixed-design ANOVA revealed significant effects of Training, F(1, 39) = 89.88, p < .001, ηp² = .70, and Lag, F(1, 39) = 10.86, p = .002, ηp² = .22, as well as Training × Lag, F(1, 39) = 47.68, p < .001, ηp² = .55; Condition × Training, F(2, 39) = 5.33, p = .009, ηp² = .22; and, most importantly, Condition × Training × Lag, F(2, 39) = 7.10, p = .002, ηp² = .28, interactions. No other effects were significant (all Fs < 1.2, ps > .3, ηp²s < .25).

Table 1 Pre- and posttraining Target 1 accuracy (with ±95% confidence intervals in parentheses) for the three between-subjects conditions
Fig. 2

Left panel: T2|T1 accuracy for the three between-subjects conditions at lags 2 (open circles) and 6 (filled circles), before and after training. Error bars indicate within-subjects 95% confidence intervals (Cousineau, 2005). Data points are offset on the abscissa for clarity. Right panel: Learning index comparing post- and pretraining performance at lags 2 and 6 in all three conditions. Error bars indicate ±95% confidence intervals

To investigate these interactions, separate 2 (Training) × 2 (Lag) repeated-measures ANOVAs were conducted for each condition. In the constant condition, the main effects of Training, F(1, 13) = 9.43, p = .009, ηp² = .42, and Lag, F(1, 13) = 31.48, p < .001, ηp² = .71, as well as the Training × Lag interaction, F(1, 13) = 54.36, p < .001, ηp² = .81, were significant. Replicating Choi et al. (2012), performance after training increased at lag 2, t(13) = 8.22, p < .001, d = 1.59, and decreased marginally at lag 6, t(13) = 2.03, p = .06, d = 0.65, yielding no significant difference between these lags after training, t(13) = 0.36, p = .97, d = 0.01. In the variable-assessment condition, we observed a main effect of Lag, F(1, 13) = 40.19, p < .001, ηp² = .76, and a Training × Lag interaction, F(1, 13) = 5.60, p = .034, ηp² = .30, indicating that training affected only lag 2 performance. Moreover, a significant difference still emerged between lags 2 and 6 after training, t(13) = 4.70, p < .001, d = 1.74. In the variable-training condition, we found a main effect of Lag, F(1, 13) = 20.82, p = .001, ηp² = .62, and a Training × Lag interaction, F(1, 13) = 6.14, p = .03, ηp² = .32, indicating a greater effect of training at lag 2. Again, a significant difference between lags 2 and 6 remained after training, t(13) = 2.33, p = .04, d = 0.70.

To further clarify the impact of training across the three conditions, we computed difference scores between T2|T1 accuracy before and after training (shown at the right of Fig. 2). This learning index revealed that training led to greater improvement at lag 2 in the constant condition, relative to both the variable-training, t(26) = 3.39, p = .002, d = 1.28, and variable-assessment, t(26) = 2.59, p = .015, d = 0.98, conditions, which were themselves indistinguishable, t(26) = 0.12, p = .91, d = 0.05. No significant differences emerged between conditions at lag 6 (all ps > .05, ds < .4).
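The learning index described above is a simple difference score. A minimal sketch, using hypothetical accuracy values (the data structure and numbers are our assumptions, chosen to mirror the constant-condition pattern of a large lag 2 gain with a small lag 6 cost):

```python
def learning_index(pre, post):
    """Difference score: posttraining minus pretraining T2|T1 accuracy, per lag."""
    return {lag: post[lag] - pre[lag] for lag in pre}

# Hypothetical T2|T1 accuracies (%), keyed by lag
pre = {2: 55.0, 6: 85.0}
post = {2: 90.0, 6: 78.0}
print(learning_index(pre, post))  # {2: 35.0, 6: -7.0}
```

A positive value indicates improvement with training at that lag; the negative lag 6 value illustrates the posttraining decline that the temporal-expectation account predicts.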

Discussion

We examined whether the training program used by Choi et al. (2012) indirectly ameliorated the AB by establishing temporal expectations about the onset of T2, rather than directly affecting processing limitations. When we replicated their design exactly, we found that training eliminated the AB, as in their original results. However, introducing variability in the target’s temporal position in either the assessment or the training period reduced improvements at lag 2, leaving a significant AB in evidence after training. This pattern is consistent with the temporal-expectation account.

How does Choi et al.’s (2012) training regimen establish temporal expectations? We suggest that training highlighted T2 relative to distractors by making it a different color (Raymond et al., 1995; Visser, Bischof, & Di Lollo, 2004; Ward, Duncan, & Shapiro, 1997), thereby emphasizing its temporal position in the RSVP stream. In turn, because only lag 2 trials were used in training, this created a strong expectation for such trials in the assessment task, and thus improved performance. Four pieces of evidence support this explanation. First, we received unprompted feedback from one participant, who specifically reported noticing that all items in the color-saliency training occurred at the same interval. Second, performance after training declined at lag 6 both for Choi et al. and in our constant condition. This is easily explained if performance suffered because observers’ temporal expectations were violated, but not by the reduced processing limits proposed by Choi et al. Third, the posttraining decline at lag 6 was attenuated in the variable-assessment and variable-training conditions, in which temporal expectancies were reduced. Finally, the temporal-expectation account also explains why the training benefits largely disappeared when training did not include a color-salient T2 (Choi et al., 2012, Exp. 3). We suggest that a salient color is much more effective at highlighting the temporal position of T2 than is an undifferentiated target.

In addition to their behavioral results, Choi et al. (2012) showed that training reduced frontal lobe (particularly dorsolateral prefrontal cortex) activation, but not activation in primary visual cortex, which they interpreted as reflecting training-based improvements to top-down attentional control. Our results suggest that these changes are instead more likely to reflect the establishment of temporal expectations. The literature examining the neurophysiological basis of temporally orienting attention shows that the amplitude of the N2 component is enhanced when temporal expectations are violated (Miniussi, Wilding, Coull, & Nobre, 1999), whereas N2 latencies decrease at cued temporal intervals (Seibold, Fiedler, & Rolke, 2011). Importantly, these components originate in frontal areas similar to those modulated by color-saliency training (Doherty, Rao, Mesulam, & Nobre, 2005). Functional-imaging studies have also shown that orienting attention to a temporal interval modulates activity in these frontal regions, without leading to corresponding changes in primary visual cortical areas (Coull, 2004; Coull, Walsh, Frith, & Nobre, 2003). Taken together, this evidence points compellingly to changes in temporal expectations as being the source of the neural activity patterns observed by Choi et al.

Although our results suggest that training effects are mediated by temporal expectations, rather than direct changes to the processing limitations underlying the AB, it is important to consider other potential explanations. One option is that color-saliency training effects are only found when T1 is presented near the beginning of the RSVP stream, so as to fall within the “attentional awakening” period (Ambinder & Lleras, 2009). This would explain why training benefits sharply declined in the variable-assessment condition, during which T1 appeared outside the attentional awakening window on some trials. However, this account cannot as easily explain the reduced benefits in the variable-training condition, unless one assumes that color-saliency training on inter-target lags that are not used in the assessment task is sufficient to interfere with training-based improvements during the attentional awakening period. This is clearly an issue for additional study.

Although the present findings clearly suggest that Choi et al.’s (2012) training effects were mediated by temporal expectations, it remains possible that additional training might impact central limitations directly. Indeed, the AB improved in all three training conditions, clearly indicating that broad-based practice effects on performance do exist. That said, it is notable that both the variable-assessment and variable-training conditions produced similar changes in the AB, despite having considerably different training durations. This pattern is also consistent with earlier failures to eliminate the AB using prolonged training (e.g., Maki & Padmanabhan, 1994). These results suggest that existing training methods have not yet been conclusively shown to overcome the central limitations underlying the AB.

In addition to training duration, other factors, such as the possible impact of sleep on training consolidation (Stickgold, James, & Hobson, 2000), were not systematically investigated here, as we included a single day of training instead of the three days employed by Choi et al. (2012). However, a close examination of Choi et al.’s results suggests that they found no significant improvement in the AB (i.e., at lag 2) beyond their initial training session. This implies that, at least in the context of the present regimen, sleep might play a relatively minor role in mediating the training benefits. Nevertheless, additional study is needed to establish whether training can directly modulate the processing limitations underlying the AB.

In conclusion, we have shown that the recently reported elimination of the AB with color-saliency training did not, as claimed, reflect a direct amelioration of processing limitations. Instead, the reduction of the AB was largely due to the use of constant presentation intervals, combined with a salient T2, which fostered strong temporal expectations for the short-interval targets for which the AB is most noticeable. This reconciles the effects of color-saliency training with an unequivocal body of research showing that the AB can be reduced, but not eliminated, when attention is oriented in time (Badcock et al., 2013; Martens & Johnson, 2005; Shen & Alain, 2011, 2012). It also provides additional evidence that the AB remains largely immune to elimination with training (Braun, 1998; Maki & Padmanabhan, 1994; Taatgen et al., 2009), and is consistent with the majority of theoretical accounts, which suggest that the AB arises from fundamental limits in object processing.