Associative learning between visual stimuli and rewards gives rise to persistent attentional biases. When a target feature reliably predicts a reward, stimuli possessing this feature will come to automatically capture attention (Anderson et al., 2011; Della Libera & Chelazzi, 2009; see Anderson, 2013, for a review). Such value-driven attentional capture can be observed even when the previously reward-associated feature is physically inconspicuous and known to be task-irrelevant and in situations in which reward is no longer expected (Anderson et al., 2011; Anderson & Yantis, 2012). These learned attention biases can develop rapidly (Sali, Anderson, & Yantis, 2014) yet are robust enough to persist for extended periods of time (Anderson & Yantis, 2013).

Extrinsic reward is not necessary for attentional biases towards former target-defining features to develop, however. Following substantial training in localizing a predictable target feature, phenotypically similar attentional biases can be observed without the use of explicit reward feedback (Kyllingsbaek et al., 2001, 2014; Qu, Hillyard, & Ding, in press; Shiffrin & Schneider, 1977). The training required to observe such attentional biases typically spans several thousand trials over multiple days, much longer than the brief single-session learning that is sufficient to generate value-driven attentional biases following rewarded training.

A natural question that arises is whether these biases towards former targets, with and without extrinsic reward feedback during training, share a common underlying mechanism. The dominant hypothesis is that they do, reflecting a unitary construct defined by selection history (Awh, Belopolsky, & Theeuwes, 2012; Lin, Lu, & He, 2016; Sha & Jiang, 2016; Stankevich & Geng, 2014). Theories of perceptual learning lend further insight into the potential shared mechanism. These theories posit an internal reward signal that is generated by correct identification of the target, which facilitates plasticity in sensory systems (Herzog & Fahle, 1999; Roelfsema & van Ooyen, 2005; Roelfsema, van Ooyen, & Watanabe, 2010; Sasaki, Nanez, & Watanabe, 2010; Seitz et al., 2005; Seitz & Watanabe, 2005). To the degree that attentional learning and perceptual learning are analogous in this regard, this hypothesis provides an elegant mechanistic explanation linking attentional biases for former targets with and without extrinsic reward during training. Under these assumptions, these two attentional biases are fundamentally driven by the reinforcement of selection history and differ only in the degree of reinforcement, with value-driven attentional biases developing more rapidly as a result of the more potent reward signals generated by the receipt of actual/physical, rather than conceptual/internal, reward feedback.

Of course, attentional learning and perceptual learning could rely, at least in part, on different underlying processes. More broadly, a direct test of the role of internal reward signals in shaping the attention system is lacking, due to the complexities involved in exerting experimental control over an internal construct that is under the endogenous control of the participant. As such, it remains to be demonstrated whether value-driven attention and attentional biases following repeated selection of an unrewarded target share the same learning mechanism or whether associative learning between stimuli and extrinsic rewards leads to fundamentally different learning than unrewarded selection history. To gain insight into this issue, rather than try to isolate and manipulate internal reward signals arising from correct task performance, we examined attentional biases for former targets in individuals who differ in how they process reward.

Specifically, we looked for evidence of attentional biases following repeated visual search for a consistent target, without the use of explicit rewards, in individuals experiencing depressive symptoms. Depression is associated with blunted response to rewards (Eshel & Roiser, 2010; Foti & Hajcak, 2009; Henriques & Davidson, 2000; Shankman et al., 2007). Most directly relevant to our question, recent research demonstrates that individuals with depressive symptoms show markedly blunted value-driven attentional capture (Anderson et al., 2014b). That is, individuals with depressive symptoms are, as a group, unaffected by the presence of a previously reward-predictive distractor and differ significantly from controls in this regard. Based on the common mechanisms hypothesis, we predicted the same pattern of blunted attentional capture associated with depressive symptoms in an unrewarded attentional learning task.

Experiment 1

Methods

Participants

Seventeen participants experiencing symptoms of depression (mean age = 22.1 yr, 8 females) and 15 control participants (mean age = 20.6 yr, 10 females) were recruited from the Johns Hopkins University community. Using the effect size and correlation among repeated measures from our prior study of value-driven attentional capture in depressed and nondepressed participants (Anderson et al., 2014b), the current sample size yields power (1 − β) > 0.90 to detect a significant interaction between distractor condition and depressed status at α = 0.05 (G*Power; http://www.gpower.hhu.de/). The depressed participants were recruited through electronic announcements as well as flyers posted at the counseling center that were specifically targeted toward individuals who were feeling depressed (BDI-II cutoff: ≥ 16 during prescreening). Participants in the control group were obtained through general recruitment methods targeted toward all undergraduate students (BDI-II cutoff: ≤ 12). Exclusion criteria included treatment with psychotropic medications and treatment for or diagnosis of any other psychiatric or neurological condition beyond depression (assessed via self-report during prescreening). All participants reported normal or corrected-to-normal visual acuity and normal color vision. The two samples did not differ in either age (p = 0.362) or sex (p = 0.534).

Apparatus

A Mac Mini equipped with Matlab software and Psychophysics Toolbox extensions (Brainard, 1997) was used to present the stimuli on an Asus VE247 monitor. The participants viewed the monitor from a distance of approximately 50 cm in a dimly lit room. Manual responses were entered using a standard keyboard.

Beck depression inventory (BDI-II)

All participants completed the BDI-II (Beck, Steer, & Brown, 1996) on the first day of participation before completing the experimental task.

Experimental protocol

Participants completed four sessions of the training phase, each of which was completed on a different day. No more than 2 days elapsed between training sessions. Participants were provided detailed task instructions with example stimuli on the first day of training and were asked to reiterate these instructions on each subsequent day of training. If participants could not accurately recount the task instructions, the instructions were readministered. The test phase was conducted either the day after or two days after the final training session. As on the first day of training, participants were given full instructions with example stimuli.

Training phase

Stimuli

Each trial consisted of a fixation display, a search array (Fig. 1a), and, in the event of an incorrect response, a feedback display. The fixation display contained a white fixation cross (0.5° x 0.5° visual angle) presented in the center of the screen against a black background, and the search array consisted of the fixation cross surrounded by six colored circles (each 2.3° x 2.3°) placed at equal intervals on an imaginary circle with a radius of 5°. The target was defined as the green circle, exactly one of which was presented on each trial; the color of each nontarget circle was drawn from the set {blue, cyan, pink, orange, yellow, white} without replacement. Inside the target circle, a white bar was oriented either vertically or horizontally, and inside each of the nontargets, a white bar was tilted at 45° to the left or to the right (randomly determined for each nontarget).
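The layout described above can be sketched in a few lines. This is a minimal illustration rather than the authors' Psychophysics Toolbox code: the function names and the starting angle of the first item are assumptions (the text does not say where on the circle the first item sits), and the 50-cm viewing distance is taken from the Apparatus section.

```python
import math

def search_array_positions(n_items=6, radius_deg=5.0, start_angle_deg=90.0):
    """Return (x, y) offsets from fixation, in degrees of visual angle,
    for n_items placed at equal angular intervals on an imaginary circle.
    start_angle_deg is an assumed value, not stated in the text."""
    positions = []
    for i in range(n_items):
        theta = math.radians(start_angle_deg + i * 360.0 / n_items)
        positions.append((radius_deg * math.cos(theta),
                          radius_deg * math.sin(theta)))
    return positions

def deg_to_cm(size_deg, viewing_distance_cm=50.0):
    """On-screen extent (cm) that subtends size_deg at the viewing distance."""
    return 2.0 * viewing_distance_cm * math.tan(math.radians(size_deg) / 2.0)

positions = search_array_positions()   # six item centers, 5 deg from fixation
circle_diameter_cm = deg_to_cm(2.3)    # physical size of one 2.3 deg circle
```

Under these assumptions, a 2.3° circle viewed from 50 cm is roughly 2 cm across on screen; converting to pixels would additionally require the monitor's pixel density, which the text does not report.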

Fig. 1

Example visual search displays. (a) During each of 4 days of training, participants searched for a green color-defined target. (b) During the test phase, conducted on a separate day, participants searched for the unique shape (diamond among circles or circle among diamonds). The trained color (green) randomly coincided with the target shape (1/6 of trials)

Design and procedure

The training phase consisted of 1,008 trials for each of four sessions. Each trial began with the presentation of the fixation display for a randomly varying interval of 400, 500, or 600 ms. The search array then appeared and remained on screen until a response was made or 1,200 ms had elapsed, after which the trial timed out. If participants responded incorrectly, a white "X" appeared at the center of the screen for 1,000 ms, and if the trial timed out, the computer emitted a 500-ms, 1,000-Hz tone. Each trial was followed by a 1,000-ms blank intertrial interval (ITI).

Participants made a forced-choice target identification by pressing the "z" and the "m" keys for the vertically and horizontally oriented bars within the targets, respectively. They were instructed to respond both quickly and accurately. The target appeared in each position equally often, with each position being equally often paired with each orientation/response. Trials were presented in a random order. The first eight trials were considered warm-up/practice, after which participants were provided a 30-sec break after every 100 trials.

Test phase

Stimuli

Each trial consisted of a fixation display, a search array (Fig. 1b), and, in the event of an incorrect response, a feedback display. The six shapes now consisted of either a diamond among circles or a circle among diamonds, and the target was defined as the unique shape. On every trial, one of the shapes was rendered in the color of a former target from the training phase (referred to as the trained color); the colors of the remaining shapes were drawn from the same set used during training. As during training, a feedback display informed participants if their prior response was incorrect or too slow.

Design

The target was presented in the trained color on one of six trials (valid trials), and the trained color was used for a nontarget on the remaining five of six (invalid trials). The trained color appeared in each position equally often and was valid in each position equally often. Thus, the trained color was entirely nonpredictive of the target. Trials were presented in a random order.

Procedure

Participants were instructed to ignore the color of the shapes and to focus on identifying the unique shape both quickly and accurately, using the same orientation-to-response mapping. The test phase consisted of 432 trials, separated into three blocks of 144 trials with a 30-sec break between blocks. Each trial began with the presentation of the fixation display for a randomly varying interval of 400, 500, or 600 ms. The search array then appeared and remained on screen until a response was made or 1,500 ms had elapsed, after which the trial timed out. As during training, if participants responded incorrectly, a white "X" appeared at the center of the screen for 1,000 ms. If the trial timed out, the computer emitted a 500-ms, 1,000-Hz tone. Each trial was followed by a 500-ms blank ITI.
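One way to satisfy the counterbalancing constraints of the test phase is sketched below. It assumes that target position, trained-color position, and bar orientation were fully crossed; the text states only the marginal constraints (trained color in each position equally often, valid in each position equally often, valid on 1/6 of trials), so this crossing and the function name `build_test_trials` are illustrative assumptions. The identity of the unique shape (diamond or circle) is not modeled.

```python
import itertools
import random

def build_test_trials(n_positions=6, n_trials=432, seed=None):
    """Build a shuffled trial list in which the trained color is valid
    (coincides with the target) on exactly 1/n_positions of trials."""
    orientations = ("vertical", "horizontal")
    # Assumed design: every target position x trained-color position x orientation.
    cells = list(itertools.product(range(n_positions),
                                   range(n_positions),
                                   orientations))
    reps, remainder = divmod(n_trials, len(cells))
    if remainder:
        raise ValueError("n_trials must be a multiple of the number of cells")
    trials = [
        {"target_pos": t, "trained_color_pos": c,
         "orientation": o, "valid": t == c}
        for _ in range(reps)
        for (t, c, o) in cells
    ]
    random.Random(seed).shuffle(trials)  # random trial order
    return trials

trials = build_test_trials(seed=1)
n_valid = sum(trial["valid"] for trial in trials)
```

With 6 positions and 432 trials, this crossing yields 72 valid and 360 invalid trials, so the trained color carries no information about the target location.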

Results

Descriptive measures

Mean BDI-II score was 25.1 ± 1.8 SEM (range: 13-41) for the depressed group and 2.3 ± 0.6 SEM (range: 0-7) for the control group, t(30) = 11.22, p < 0.001, d = 4.08. The mean BDI-II score for the depressed group fell within the range of moderate depression, with four participants falling in the severe range (≥ 29). In contrast, each of the control participants scored in the bottom/normal range defined as minimal depression.

Training phase

An analysis of variance (ANOVA) on mean correct RTs with day (1-4) as a within-subjects factor and depressed status (depressed vs control) as a between-subjects factor revealed a main effect of day, F(3,90) = 18.52, p < 0.001, ηp² = 0.577, but no main effect of depressed status, F(1,30) = 0.05, p = 0.822, or interaction, F(3,90) = 0.73, p = 0.536 (Fig. 2a). The same ANOVA on accuracy revealed a main effect of day, F(3,90) = 3.04, p = 0.033, ηp² = 0.092, and a marginal effect of depressed status, F(1,30) = 3.83, p = 0.060, ηp² = 0.113, but no interaction, F(3,90) < 0.01, p = 0.994 (Fig. 2b). Thus, a robust practice effect was evident across the training phase for both depressed and control participants, with a marked improvement in performance between the first and second day of training.

Fig. 2

Mean response time (a) and accuracy (b) for each day of training in Experiment 1, separately for depressed and control participants. Error bars reflect the SEM

Test phase

An ANOVA on mean correct RTs with the validity of the trained color (valid vs. invalid) as a within-subjects factor and depressed status (depressed vs. control) as a between-subjects factor revealed a highly robust validity effect, F(1,30) = 56.32, p < 0.001, ηp² = 0.652, but no main effect of depressed status, F(1,30) = 1.53, p = 0.226, or interaction, F(1,30) = 3.14, p = 0.087 (Fig. 3a). If anything, the validity effect tended to be larger in the depressed participants and was highly reliable when considering only the depressed group, t(16) = 6.55, p < 0.001, d = 1.59, JZS Bayes Factor = 3,179.44 in favor of the alternative hypothesis (Rouder et al., 2009).
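For readers who want to relate the reported effect sizes to the test statistics, the standard identities can be sketched as follows. The partial eta squared conversion is a general algebraic identity for an F test; the one-sample convention for Cohen's d (t divided by the square root of n) is an assumption, since the text does not state how d was computed, although it does match the value reported for the depressed group above.

```python
import math

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def cohens_d_one_sample(t_value, n):
    """Cohen's d for a one-sample or paired t test, assuming d = t / sqrt(n)."""
    return t_value / math.sqrt(n)

# Validity effect on RT: F(1,30) = 56.32
eta_validity = round(partial_eta_squared(56.32, 1, 30), 3)   # 0.652
# Depressed group alone: t(16) = 6.55, n = 17
d_depressed = round(cohens_d_one_sample(6.55, 17), 2)        # 1.59
```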

Fig. 3

Mean response time (a), accuracy (b), and inverse efficiency (c) for valid and invalid trials in the test phase of Experiment 1, separately for depressed and control participants. (d–f) The same measures for Experiment 2. Error bars reflect the SEM. *p < 0.05 **p < 0.01 ***p < 0.001

The same ANOVA on accuracy revealed a significant validity effect, F(1,30) = 13.99, p = 0.001, ηp² = 0.318, and a main effect of depressed status, F(1,30) = 5.79, p = 0.023, ηp² = 0.162, but no interaction, F(1,30) = 0.03, p = 0.856 (Fig. 3b). Participants with depressive symptoms were generally less accurate, but the validity effect did not differ across depressed and control groups; considering only the depressed group, a reliable validity effect was still evident, t(16) = 2.39, p = 0.030, d = 0.57.

Combining RT and accuracy into a single measure, inverse efficiency (IE; Townsend & Ashby, 1978), also revealed no interaction, F(1,30) = 0.03, p = 0.856, with a significant validity effect in both depressed and control participants, ts > 4.77, ps < 0.001, JZS Bayes Factor > 109 in favor of the alternative hypothesis (Fig. 3c).
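Inverse efficiency is conventionally computed as mean correct RT divided by the proportion of correct responses, so that slower or less accurate performance both raise the score. The sketch below assumes that convention and defines the validity effect as invalid minus valid IE (an assumed sign convention in which positive values indicate a benefit on valid trials); the numeric inputs are hypothetical and are not data from this study.

```python
def inverse_efficiency(mean_correct_rt_ms, proportion_correct):
    """Inverse efficiency: mean correct RT (ms) divided by proportion correct."""
    if not 0.0 < proportion_correct <= 1.0:
        raise ValueError("proportion_correct must be in (0, 1]")
    return mean_correct_rt_ms / proportion_correct

def validity_effect(ie_invalid, ie_valid):
    """Assumed convention: positive values mean better performance on valid trials."""
    return ie_invalid - ie_valid

# Hypothetical participant: values chosen only for illustration.
ie_valid = inverse_efficiency(700.0, 0.90)
ie_invalid = inverse_efficiency(740.0, 0.88)
effect = validity_effect(ie_invalid, ie_valid)
```

Because IE scales RT by accuracy, it penalizes speed-accuracy tradeoffs that a pure RT analysis would miss, which is presumably why it serves as the combined measure here.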

Experiment 2

The findings of Experiment 1 suggest that selection history effects on attention are largely unaffected by depressive symptoms. However, caution is warranted when relating these results to prior work showing markedly blunted value-driven attentional capture in a depressed sample (Anderson et al., 2014b). In particular, the test phase task differed in that the trained color coincided with the target at chance in Experiment 1 but was always used for a nontarget in this prior study. If the distractor is not entirely task-irrelevant, the attention mechanisms at play may be fundamentally different from when it is, with only purely involuntary mechanisms being related to depression. Therefore, in Experiment 2, depressed and nondepressed participants completed the same test phase but following a training phase in which green targets were associated with high monetary reward, thus allowing for direct comparison of attentional capture following a training procedure with and without reward.

Methods

Participants

A new set of 17 participants experiencing symptoms of depression (mean age = 18.2 yr, 13 females) and 15 control participants (mean age = 18.7 yr, 11 females) were recruited from Texas A&M University. Participants reflected a convenience sample (BDI-II cutoff ≥ 14 for depressed and ≤ 12 for control). All participants reported normal or corrected-to-normal visual acuity and normal color vision. The two samples did not differ in either age (p = 0.118) or sex (p = 0.838).

Apparatus and experimental task

A Dell OptiPlex equipped with Matlab software and Psychophysics Toolbox extensions (Brainard, 1997) was used to present the stimuli on a Dell P2717H monitor. The training phase involved a single 240-trial session. The target was green on half of the trials and red otherwise (as in Anderson et al., 2014b). The search array was presented for 800 ms or until a response, and correct responses were followed by monetary reward feedback in which a small amount of money was added to a bank total that participants were paid at the end of the experiment (in addition to course credit). Green targets were followed by a high reward of 5¢ on 80% of correct trials and a low reward of 1¢ on the remaining 20%; for red targets, the percentages were reversed. Otherwise, the training phase was identical to Experiment 1. The test phase was identical to that of Experiment 1.
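The reward contingencies can be illustrated with a short simulation. This sketch assumes that the reward on each correct trial was drawn independently with the stated probabilities; the text does not say whether rewards were sampled per trial or assigned in fixed proportions, and the names used here are hypothetical.

```python
import random

HIGH_REWARD_CENTS = 5
LOW_REWARD_CENTS = 1
# Probability of the high reward following a correct response, by target color.
P_HIGH = {"green": 0.8, "red": 0.2}

def reward_for_correct_trial(target_color, rng):
    """Sample the reward (in cents) for one correct trial.
    Assumes an independent draw on every trial."""
    if rng.random() < P_HIGH[target_color]:
        return HIGH_REWARD_CENTS
    return LOW_REWARD_CENTS

rng = random.Random(0)
green_rewards = [reward_for_correct_trial("green", rng) for _ in range(10000)]
proportion_high = green_rewards.count(HIGH_REWARD_CENTS) / len(green_rewards)
```

Under this schedule the expected reward per correct trial is 4.2¢ for green targets (0.8 × 5 + 0.2 × 1) and 1.8¢ for red targets (0.2 × 5 + 0.8 × 1), which is the value difference that the trained color carries into the test phase.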

Results

Descriptive measures

Mean BDI-II score was 17.7 ± 0.8 SEM (range: 14-26) for the depressed group and 4.3 ± 1.0 SEM (range: 0-11) for the control group, t(30) = 10.52, p < 0.001, d = 3.70. The mean BDI-II score for the depressed group fell within the range of mild depression; five participants fell in the moderate range (20-28) and no participant fell in the severe range. In contrast, each of the control participants scored in the bottom/normal range defined as minimal depression.

Training phase

Mean RT was 551 ms for depressed participants and 539 ms for control participants, which did not significantly differ, t(30) = 0.91, p = 0.375. Mean accuracy was 84.5% for depressed participants and 83.7% for control participants, which did not significantly differ, t(30) = 0.23, p = 0.817.

Test phase

Given that both depressed and control participants showed significant validity effects in both RT and accuracy in Experiment 1, subsequent analyses focused on the combined measure of IE (Fig. 3d–f). An ANOVA with experiment (unrewarded vs rewarded training) and depressed status as between-subjects factors, and the validity of the trained color as a within-subjects factor, revealed the critical three-way interaction, F(1,60) = 5.15, p = 0.027, ηp² = 0.079, indicating that the relationship between the validity effect and depressed status differed by experiment. For depressed participants, the validity effect was not significant in Experiment 2, t(16) = 1.02, p = 0.322, JZS Bayes Factor = 2.56 in favor of the null hypothesis, and was significantly reduced compared to Experiment 1, t(32) = −3.14, p = 0.004, d = 1.08, JZS Bayes Factor = 10.71 in favor of the alternative hypothesis. In contrast, control participants exhibited a significant validity effect in Experiment 2, t(14) = 3.78, p = 0.002, d = 0.98, JZS Bayes Factor = 21.15 in favor of the alternative hypothesis, that did not differ significantly from Experiment 1, t(28) = −0.17, p = 0.870, JZS Bayes Factor = 2.94 in favor of the null hypothesis.

Discussion

Contrary to the shared mechanisms hypothesis, individuals with depressive symptoms showed robust attentional biases as a group following extended unrewarded training, and these biases were comparable to those of controls. This pattern stands in clear contrast to the markedly blunted value-driven attentional biases evident in this same population in prior research (Anderson et al., 2014b), which was confirmed by Experiment 2 using the same test phase. If value-driven attention and attentional biases arising from target history reflect the same underlying mechanism, they should be similarly affected by depressive symptoms. Our results provide evidence to the contrary.

Rather, our findings suggest a qualitative distinction between value-driven attention and selection history effects on attention and cast doubt on the idea that such selection history effects critically depend on normal reward processing. It seems not to be the case that selection history effects on attention simply reflect a less potent form of the same associative learning processes that give rise to value-driven attentional capture. At some stage—either in the learning process or in the expression of that learning—value-driven attention and selection history diverge mechanistically in terms of how experience is translated into an enduring bias. If the internal reward signal hypothesis is to be maintained in the context of selection history effects on attention, at a minimum it becomes necessary to hypothesize two distinct reward signals that independently modulate attention, only one of which is affected by depression.

Participants experiencing depressive symptoms were overall less accurate in Experiment 1, significantly so in the test phase and marginally in the training phase. This is consistent with an impact of the hypothesized blunted internal reward signals on task motivation. Such a reduction in motivated task performance makes the preservation of the learned priorities in the test phase of this experiment, and its independence from motivational/internal reward processes, all the more striking in the depressed group.

It could be argued that selection history effects on attention are in fact blunted in depressed individuals, but the duration of training allows them sufficient time to "catch up" with saturated learning in nondepressed individuals. In this regard, it is noteworthy that there was not even a trend towards blunted attentional bias in depressed individuals in Experiment 1, nor any interaction in the learning curves as measured during training. If value-driven attention is reducible to the effects of selection history accelerated by a stronger reward signal, and this strong (extrinsic) reward signal has a blunted effect on attention in depressed individuals, it is difficult to argue that an even weaker (internal) reward signal would have built up to maximal learning in the present study.

In Experiment 2 of the present study, in which monetary rewards were provided during training, the recruited sample of depressed individuals had lower depression scores than the participants in Experiment 1. Note that this difference cannot explain the observed dissociation: if attentional biases arising from selection history and reward history were the product of the same underlying mechanism, the more severely depressed sample in Experiment 1 should have shown the more blunted attention effects, which is the opposite of what we observed. It also is worth noting that we did not assess comorbid anxiety symptoms. Therefore, although the findings of the present study speak clearly to a distinction between the influence of selection history and reward history on attention, they cannot be attributed uniquely to depression and may reflect other related symptoms and characteristics.

We hypothesize two distinct processes that support attentional biases to former targets, only one of which is dependent on reward-related processes. When extrinsic rewards are received, reward prediction-error-related signals serve as teaching signals to the attention system that potentiate the associated stimulus representation (Anderson, in press; Sali et al., 2014), thereby facilitating selection of the predictive stimulus through the biasing of competition in sensory areas (Anderson et al., 2014a; Anderson et al., 2016; Hickey & Peelen, 2015; see Anderson, 2016, for a review). The strength of such modulation to some degree (although not necessarily linearly; Anderson, 2016) scales with the value of the reward received. This learning occurs via associative/Pavlovian mechanisms linking sensory experiences to outcomes (Le Pelley et al., 2015). On the other hand, with or without extrinsic rewards, repeated selection of a stimulus potentiates orienting responses in a stimulus–response manner. Although the development of such orienting biases likely requires some amount of task-specific motivation and can be modulated by reinforcement provided by extrinsic rewards, these biases do not critically depend on reward feedback processing and are fundamentally the result of repetition of a motivated behavior. Value-driven attention likely reflects a combination of these learning processes, only the former of which is affected by depressive symptoms. Future research is needed to explore more precisely the manner in which reward history and selection history differ in terms of how they influence the attention system.

Conclusions

We provide evidence that dissociates value-driven attention from effects of selection history on attention. Our findings provide support for the idea that multiple distinct learning mechanisms contribute to the development of attentional biases for former targets, which are differently dependent on normal reward processing. In this regard, value-driven attention is not reducible to the same learning mechanisms that facilitate attention to consistent targets via selection history.