Introduction

In an enriched environment containing various stimuli, attention is a vital cognitive function that supports the selective processing of valuable input by prioritizing it and rejecting meaningless or distracting stimuli. This attentional selection toward specific objects or locations operates based on voluntary factors, such as intentions or task goals, and involuntary factors, such as the physical salience of stimuli (Posner, 1980). Regarding involuntary attentional deployment, it has been suggested that attentional capture occurs according to the perceptual salience of stimuli (e.g., Theeuwes, 1992) or a top-down attentional control setting (e.g., Folk, Remington, & Johnston, 1992).

Recently, a considerable amount of research has found that involuntary attentional capture is also modulated by reward history. A reward-associated stimulus captures attention even when it is neither salient nor task-relevant (e.g., Anderson, Laurent, & Yantis, 2011, 2012, 2013; for review, see Anderson, 2013; Awh, Belopolsky, & Theeuwes, 2012; Failing & Theeuwes, 2017). For example, in Anderson et al.’s (2011) experiment, participants were asked to perform different visual search tasks in two separate phases – training and testing. In the training phase, in which the target was defined as two colors among heterogeneously colored non-targets, one color was paired with a high-value reward more often than a low-value reward and vice versa for the other color. In the test phase, in which the target was defined as a shape singleton (e.g., a diamond among circles), the reward-associated stimuli were presented as a distractor in half of the trials. As a result, significant attentional interference occurred during trials in which the distractor associated with high reward was presented compared to those in which either the distractor associated with low reward or no reward-associated distractor was presented. Importantly, in the test phase, the distractor color was non-salient in the search array consisting of stimuli with heterogeneous colors and it was task-irrelevant because the target was defined as a shape singleton. This attentional bias toward value-associated stimuli is called value-driven attentional capture (VDAC).

This attentional modulation by reward depends on associative learning between a conditioned signal, such as a stimulus feature like color, and a subsequent outcome, such as a monetary reward (see Bucker & Theeuwes, 2017; for a review, see Anderson, 2013). Critically, previous studies have shown that VDAC is strongly related to the strength of reward associations: the stronger the association formed between a stimulus feature and reward, the greater the VDAC to that feature. Specifically, the magnitude of VDAC has been found to correlate with the amount of reward per trial (Bucker & Theeuwes, 2017; Chelazzi, Perlato, Santandrea, & Della Libera, 2013; Feldmann-Wüstefeld, Brandhofer, & Schubö, 2016; Hickey, Chelazzi, & Theeuwes, 2010; Kiss, Driver, & Eimer, 2009; Le Pelley, Pearson, Griffiths, & Beesley, 2015; Munneke, Belopolsky, & Theeuwes, 2016; but cf. Sha & Jiang, 2016) or the probability of reward in overall learning (Anderson et al., 2011; Failing & Theeuwes, 2014; Laurent, Hall, Anderson, & Yantis, 2015; Lee & Shomstein, 2014; MacLean, Diaz, & Giesbrecht, 2016; Roper & Vecera, 2016). In other words, the strength of VDAC to a stimulus feature is directly related to the strength of the reward association, which is represented as the expected value (EV), calculated by multiplying the amount of each possible reward by its probability.
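
As a minimal formalization of this definition (the symbols \( r_i \) and \( p_i \) are introduced here only for illustration and are not notation from the studies cited above), the EV of a stimulus feature whose possible rewards \( r_i \) occur with probabilities \( p_i \) can be written as:

\[ \mathrm{EV} = \sum_{i} p_i \, r_i , \qquad \sum_{i} p_i = 1 \]

For instance, a hypothetical feature that pays 10 cents with 50% probability and nothing otherwise would have EV = 0.5 × 10 + 0.5 × 0 = 5 cents.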

Factors other than the strength of reward association have been shown to influence attentional modulation by reward. For example, Pearce and Hall (1980) proposed the uncertainty-based attention theory, according to which people deploy their attention more toward a stimulus feature presenting uncertainty about its association with an outcome than features without such uncertainty (Beesley, Nguyen, Pearson, & Le Pelley, 2015; Easdale, Le Pelley, & Beesley, 2019; Hogarth, Dickinson, Austin, Brown, & Duka, 2008; Le Pelley, Pearson, Porter, Yee, & Luque, 2019; Luque, Vadillo, Le Pelley, & Beesley, 2016; Walker, Luque, Le Pelley, & Beesley, 2019). In other words, observers are likely to deploy more attention to a stimulus associated with uncertainty for further processing, which can be expressed as the variance of outcomes (Fiorillo, Tobler, & Schultz, 2003; Rushworth & Behrens, 2008; Tobler, O'Doherty, Dolan, & Schultz, 2006). Indeed, many studies have demonstrated that when participants infer what outcome is expected for a given stimulus in associative learning, the attentional allocation to the stimulus is modulated by the probabilistic relationship between the stimulus and its outcome (Beesley et al., 2015; Easdale et al., 2019; Hogarth et al., 2008; Luque et al., 2016; Koenig, Kadel, Uengoer, Schubö, & Lachnit, 2017a; Koenig, Uengoer, & Lachnit, 2017b). For example, in Beesley et al.’s (2015) experiment, participants were asked to predict what type of outcome would result from two artificial images. Importantly, each image contained information about the upcoming outcome but at differing levels of probability. Some images resulted in a specific outcome with 100% probability, while others resulted in a particular outcome with either 70% or 30% probability. 
As in other studies (Easdale et al., 2019; Hogarth et al., 2008; Luque et al., 2016), the amount of attentional bias to stimuli depending on their predictability was measured to investigate attentional modulation based on the stimulus-outcome probabilistic relationship. Studies have consistently reported prolonged dwell time on stimuli with uncertainty about an upcoming outcome (De Tommaso, Mastropasqua, & Turatto, 2019; Koenig et al., 2017a).

Recently, some studies have examined whether uncertainty modulates attentional deployment to task-irrelevant stimulus features associated with either positive or aversive stimuli (De Tommaso et al., 2019; Koenig et al., 2017a, 2017b; Le Pelley et al., 2019). The results showed that participants allocated more attention to stimuli when the reward or punishment for their responses to the stimuli was delivered with some degree of uncertainty compared to those delivered without uncertainty, consistent with the uncertainty-based attention theory. Importantly, although these studies extended the examination of the uncertainty effect on attention to valenced outcomes, the impact of uncertainty was not clearly dissociated from the effect of reward expectancy on attentional allocation. In the learning phase of Koenig et al.’s (2017a) eye-tracking experiment, one of three target colors was associated with an uncertain reward by giving participants a large reward but with intermediate probability (e.g., 10 cents with 50% probability). Other target colors were associated with certain rewards by giving participants a large reward (e.g., 10 cents) for one color and a small reward (e.g., 1 cent) for the other color with 100% probability. They found that while the duration of the first fixation varied as a function of uncertainty, the first fixation landed on the color associated with the large reward more frequently than the colors associated with the small reward in testing trials, regardless of whether the EV of the uncertainty-related color was greater or smaller than the EV of the certainty-related color. Similarly, in De Tommaso et al.’s (2019) experiments, greater attentional bias was observed towards the stimuli associated with a high level of reward expectancy (reward probability, p = .8) than those associated with the highest level of uncertainty (p = .5).

In addition, the uncertainty effect on attentional capture was tested during reward associative learning rather than after the learning, as in previous VDAC studies (e.g., Anderson et al., 2011; Awh et al., 2012; Failing & Theeuwes, 2017; Le Pelley et al., 2019). In Koenig et al.’s study (2017a), in which the uncertainty effect was examined during value associative learning, there were two types of trials: learning trials and search trials. A specific target color was associated with reward with some degree of uncertainty in the learning trials. The reward-associated color distractor was presented when performers were required to search for a shape singleton target in the search trials. Importantly, these two types of trials were intermixed randomly within each block rather than isolated into separate blocks. Thus, it is possible that attentional bias toward the uncertainty-related features on the search trials was due to the transfer of the strategic attentional control adopted for the learning trials. Since participants had to learn the relationship between stimuli features and rewards on the learning trials, they were likely to strategically allocate more attention to the features associated with an uncertain outcome. Consequently, this top-down control might have been transferred from the learning to the search trials.

The goal of the present study was to test whether uncertainty associated with previously learned value modulates attentional capture on the basis of reward-associated features that are task-irrelevant and non-salient in a given task. Participants performed different visual search tasks in the Training and Test Phases. By separating the Test Phase from the Training Phase, it becomes possible to examine whether involuntary attentional capture is modulated by uncertainty even when the reward is no longer available. Ample evidence has shown that attentional capture by previous value-associated stimuli is continuously maintained even in unrewarded contexts (e.g., Anderson et al., 2011).

To examine the uncertainty effect on attention without the effect of the strength of reward association, the degree of reward uncertainty for reward-associated features was manipulated while the EV was held constant. More specifically, there were two types of reward-associated target features in the Training Phase: uncertain and certain reward-associated target features. The former was associated with an uncertain reward, which could be relatively large (e.g., 75 or 100 points), relatively small (e.g., 10 or 25 points), or absent (e.g., 0 points). The latter was associated with a fixed number of reward points (e.g., 50 points). Thus, the EVs for the two types of reward-associated features were mathematically identical, while the uncertainty for each type was different.

In the Test Phase, while participants searched for a shape singleton target among heterogeneously colored stimuli, a distractor rendered in one of the reward-associated colors was presented in half of the trials and no reward-associated distractor was presented in the other half. If uncertainty modulates VDAC without the influence of EV, the pattern and/or amount of attentional interference resulting from the uncertainty-paired color distractor should differ from that resulting from the certainty-paired color distractor. Specifically, considering the findings from the literature reviewed above, we expected that distractors associated with uncertainty would capture attention more strongly than those associated with certainty.

Experiment 1

Experiment 1 investigated the modulatory effect of uncertainty on attentional capture for value-associated stimuli, when EV is held constant. In the Training Phase, participants were instructed to respond to the orientation of a bar inside a red or green circle among heterogeneously colored circles. One of the two target colors was associated with an uncertain reward, which was 100 points on 25% of the trials and 0 points on 75% of trials, referred to as the “uncertainty target.” The other target color was associated with a certain reward, which was 25 points with 100% probability, referred to as the “certainty target.” The EVs for both target colors were identical (25 points), whereas they differed in terms of the uncertainty of whether reward was delivered or omitted. This relationship between probability and uncertainty indicates variance, which is calculated by the following formula: \( P \times (\lambda - V)^2 + (1 - P) \times (0 - V)^2 \). In the formula, \( \lambda \) indicates the magnitude of reward, \( P \) is reward probability, and \( V \) is learned value, which is calculated as \( \lambda \times P \) (see Table 1).
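
The variance formula above can be checked with a short calculation. The following is a minimal sketch, not the authors' analysis code: the function name `bernoulli_reward_stats` is hypothetical, and the printed values are computed from the formula rather than copied from Table 1.

```python
def bernoulli_reward_stats(magnitude, probability):
    """Return (learned value V, variance) for a reward of size `magnitude`
    that is delivered with `probability` and omitted (0 points) otherwise."""
    value = magnitude * probability  # V = lambda * P
    variance = (probability * (magnitude - value) ** 2
                + (1 - probability) * (0 - value) ** 2)
    return value, variance


# Reward schedules of Experiment 1 as described in the text.
uncertainty_target = bernoulli_reward_stats(magnitude=100, probability=0.25)
certainty_target = bernoulli_reward_stats(magnitude=25, probability=1.0)

print(uncertainty_target)  # (25.0, 1875.0)
print(certainty_target)    # (25.0, 0.0)
```

Both schedules yield the same learned value (25 points), while only the uncertainty target has non-zero variance.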

Table 1 The levels of reward probability, magnitude, and variance of reward-associated targets (distractor) as a function of target (distractor) type in Experiments 1 and 2

In the Test Phase, the target was defined as a bar stimulus in a diamond shape among heterogeneously colored circles, and the color of the target diamond was randomly selected from a set of colors other than red or green. Half the trials included a distractor that was red or green, and the other half of trials did not. As described above, attentional interference effects were measured to examine attentional capture by reward-associated distractors with differing levels of uncertainty but with an identical size of EV.

Methods

Participants

We used the G*Power 3.1 software (Faul, Erdfelder, Lang, & Buchner, 2007) to determine a proper sample size to estimate the difference of the distractor type effect between uncertainty and certainty distractors. Based on the reported \( {\eta}_p^2 \) of the effects in previous studies (Anderson et al., 2013; Koenig et al., 2017b), which ranged between 0.14 and 0.16, power analyses for a within-sample analysis of variance (ANOVA) using a power of .95 and an alpha level of .05 resulted in a minimum sample size (n) of 22. Considering the results of the power analyses and the necessity of counterbalancing (see Design section below), 24 participants (mean age = 23.9, 14 females) were recruited from Korea University. All participants had self-reported normal or corrected-to-normal visual acuity and normal color vision. The participants gave informed consent and received KRW 7,000 (about US$6) for their participation. All experiments were approved by the Institutional Review Board at Korea University (KU-IRB-16-138-A-1).

Apparatus

All experiments were programmed and presented using E-Prime software Version 2.0 (Psychology Software Tools, Inc.). Stimuli were presented on a 17-in. CRT monitor of a personal computer. Participants viewed the monitor from a distance of approximately 60 cm in a dimly lit room. Responses were collected using a standard computer keyboard.

Training phase

Stimuli

All stimuli were presented on a black background. Each trial consisted of a fixation display, a search display, a feedback display, and a reward information display. The fixation display consisted of a white fixation cross (0.9° × 0.9° visual angle, RGB: 255, 255, 255; CIE: x = .270, y = .297) located at the center of the display. In the search display, the fixation cross and six colored circles (each 1.9° × 1.9°), which were equally spaced on an imaginary circle with a 4.2° radius, were presented. The color of the target circle was red (RGB: 255, 0, 0; CIE: x = .581, y = .346) or green (RGB: 0, 255, 0; CIE: x = .285, y = .599) and that of non-target circles was randomly selected from a set of blue (RGB: 0, 0, 255; CIE: x = .152, y = .080), yellow (RGB: 255, 255, 0; CIE: x = .388, y = .513), cyan (RGB: 0, 255, 255; CIE: x = .205, y = .286), magenta (RGB: 255, 0, 255; CIE: x = .262, y = .148), orange (RGB: 255, 127, 0; CIE: x = .498, y = .418), and gray (RGB: 127, 127, 127; CIE: x = .274, y = .297) colors without replacement. A white line segment (0.9° visual angle) was presented inside each circle, which was oriented either vertically or horizontally in the target circle and tilted 45° to the left or to the right (as randomly selected) in the non-target circles. The feedback display informed participants when their response was correct by providing written feedback, specifically, the Korean word for "Correct". For an incorrect response, a 1,000-Hz tone sounded for 500 ms. The reward information display informed participants of the amount of reward earned on the current trial as well as the total amount of their accumulated reward.

Procedure

The Training Phase consisted of 24 practice trials followed by three blocks of 192 main-task trials each. Each trial started with the fixation display for a random interval of 400, 500, or 600 ms (Fig. 1a). After the fixation display, the search array was presented for 500 ms, followed by a blank display until a response was made within a time interval of 1,000 ms or for 1,000 ms when no response was made within the time interval. The feedback display was presented for 1,000 ms. The reward information display was presented for 750 ms.

Fig. 1 Examples of a trial sequence in the training (a) and test (b) phases in Experiments 1 and 2

Participants were instructed to respond to the orientation of the line segment in the target color circle (i.e., red or green) among the heterogeneously colored non-target circles. Half the participants were instructed to press the “F” key of a standard computer keyboard in response to the vertically oriented target line with their left index finger and the “J” key to the horizontally oriented target line with their right index finger, and vice versa for the other participants. The reward was provided as a score (e.g., 25 points or 100 points) at the end of each trial. Participants were instructed to earn as many points as possible in order to exceed a score threshold that was not disclosed to them but that had to be met to maximize their monetary reward for participation. However, regardless of the actual points obtained by each participant, when accuracy exceeded 80%, the monetary reward was provided in full on completion of the experiment. When a response was correct, a reward was given depending on the target type. One of the two target colors was associated with a high reward (e.g., 100 points) with 25% probability but no reward (e.g., 0 points) with 75% probability (uncertainty target; prediction error-present target). The other color was associated with a low reward (e.g., 25 points) with 100% probability (certainty target; prediction error-absent target). Thus, the two target colors were associated with an identical EV of 25 points but differed in terms of reward uncertainty (Fiorillo et al., 2003; Preuschoff, Bossaerts, & Quartz, 2006; Schultz et al., 2008). The color of the target and the reward uncertainty type were counterbalanced across participants. The debriefing included information about the relationship between performance and reward and was given at the end of the experiment.
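
The reward rule described above can be sketched in a few lines. This is an illustrative sketch only: the experiment itself was run in E-Prime, the function name `draw_reward` is hypothetical, and two details are assumptions not stated in the text, namely that incorrect responses earned 0 points and that the 25% schedule was realized as an independent random draw on each trial rather than as a fixed proportion of trials.

```python
import random


def draw_reward(target_type, correct, rng):
    """Points awarded on one Training Phase trial of Experiment 1 (sketch)."""
    if not correct:
        return 0  # assumption: errors earn no points
    if target_type == "certainty":
        return 25  # 25 points with 100% probability
    # Uncertainty target: 100 points with 25% probability, otherwise 0.
    # Assumption: an independent draw on every correct trial.
    return 100 if rng.random() < 0.25 else 0


rng = random.Random(0)  # arbitrary seed so the simulation is repeatable
n_trials = 100_000
mean_uncertain = sum(draw_reward("uncertainty", True, rng) for _ in range(n_trials)) / n_trials
mean_certain = sum(draw_reward("certainty", True, rng) for _ in range(n_trials)) / n_trials

print(mean_uncertain, mean_certain)
```

Over many simulated correct trials, the average payoff of the two target types converges on the same value of about 25 points, even though individual outcomes differ sharply for the uncertainty target.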

Design

The target location, the target bar orientation, and the target color were fully crossed and counterbalanced. Trials were presented randomly, and thus the target color, target location, and line orientation varied unpredictably. Target line orientation-response mappings were counterbalanced across participants.

Test phase

Stimuli

Each trial consisted of a fixation display, a search display, and a feedback display. The search display consisted of a fixation cross and six shapes, in which the target was defined as a diamond shape (2.1° × 2.1°) among five circles. The color of the diamond was randomly selected from a set of blue, yellow, cyan, magenta, orange, and gray colors, but never red or green, which were the colors of the targets in the Training Phase. The feedback display only informed participants whether their response was correct or not.

Procedure

In the Test Phase, participants performed eight practice trials and two blocks of 144 main-task trials each. Each trial began with the fixation display for a random interval of 400, 500, or 600 ms. After the fixation display, the target search display was presented for 500 ms. A blank display was presented until a response was executed within a time limit of 1,000 ms. When no response was made, the blank display was maintained for 1,000 ms. The feedback display was presented for 1,000 ms. The procedure was identical to that of the Training Phase, with the exception that the target was a diamond shape among circles (Fig. 1b). Unlike the Training Phase, no reward points were provided. Participants were instructed to ignore the color of the shapes and to make a response to the orientation of the line inside the diamond. Critically, a red or green circle, which had been associated with reward in the Training Phase, was presented as a distractor on 50% of the trials. One of these two colors served as the uncertainty distractor (prediction error present) and the other as the certainty distractor (prediction error absent), according to the target type with which it had been paired in the Training Phase. The remaining 50% of trials did not contain any reward-associated distractor.

Design

The target location, the target bar orientation, the target color, the distractor presence, and the distractor type were counterbalanced. Trials were presented randomly so that the distractor type and target identity varied unpredictably. The line orientation-response mapping was identical to that in the Training phase.

Results

Trials were excluded from the analyses if the response time (RT) was shorter than 150 ms or greater than three SDs above each participant’s mean for their respective condition (1.77% of trials in the Training Phase and 1.86% of trials in the Test Phase), and only correct trials were included in the RT analyses. Mean correct RT and percent errors (PEs) were calculated for each participant as a function of the block (first, second, and third block) and target type (reward uncertainty target or certainty target) in the Training Phase, and block (first and second block) and distractor type (uncertainty distractor, certainty distractor, or distractor absent) in the Test Phase. Repeated-measures ANOVAs were conducted on the mean RT and PE data, with those variables as within-subject factors for each phase.
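
The trimming rule can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' script: the function name `trim_rts` is hypothetical, the mean and the sample SD are computed over all RTs in one participant-by-condition cell before any exclusion, and the RT values are made up for the example.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, lower_ms=150, n_sd=3):
    """Keep RTs that are at least `lower_ms` and no more than `n_sd` SDs
    above the mean of one participant-by-condition cell."""
    cutoff = mean(rts_ms) + n_sd * stdev(rts_ms)  # assumption: sample SD
    return [rt for rt in rts_ms if lower_ms <= rt <= cutoff]


# Made-up RTs (ms) for one cell: 30 typical responses, one anticipation, one lapse.
example_rts = [580, 600, 620] * 10 + [100, 2500]
kept = trim_rts(example_rts)

print(len(example_rts) - len(kept))  # 2
```

Only the 100-ms and 2,500-ms responses are dropped in this example. Whether the SD is computed before or after removing fast responses is not specified in the text, so the order used here is a choice made for the illustration.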

Training Phase

The overall mean RT was 589 ms. A significant block effect was obtained, F(2, 46) = 30.33, p < .001, \( {\eta}_p^2 \) = .57, indicating that the mean RT of the first block (M = 634 ms) was greater than that of the subsequent blocks (Ms = 579 ms and 558 ms for the second and third blocks, respectively). Neither the main effect of target type, F(1, 23) = 1.79, p = .19, nor the interaction between the block and target type, F(2, 46) = 1.13, p = .332, was significant, suggesting that the mean RTs of both target types were statistically equivalent across blocks (Table 2). The overall PE was 4.58%, and only the main effect of the block was significant, F(2, 46) = 18.15, p < .001, \( {\eta}_p^2 \) = .44, indicating that PE was higher in the first block (6.3%) than the subsequent blocks (4.0% and 3.5% in the second and third blocks, respectively).
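
As a consistency check on the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity \( {\eta}_p^2 = (F \times df_{effect}) / (F \times df_{effect} + df_{error}) \). The sketch below is an illustration rather than part of the original analysis, and the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Block effects in the Training Phase of Experiment 1.
eta_rt = partial_eta_squared(30.33, 2, 46)
eta_pe = partial_eta_squared(18.15, 2, 46)

print(round(eta_rt, 2))  # 0.57
print(round(eta_pe, 2))  # 0.44
```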

Table 2 Mean response times (RTs; in milliseconds, with standard deviations in parentheses) and percent errors (PEs) in Experiment 1 as a function of target type in the training phase and distractor type in the test phase

Test Phase

The overall mean RT was 609 ms. The main effect of the block was marginally significant, F(1, 23) = 4.06, p = .056, indicating that the mean RT tended to be greater in the first block (M = 617 ms) than in the second block (M = 604 ms). Importantly, the main effect of distractor type was significant, F(2, 46) = 8.24, p < .001, \( {\eta}_p^2 \) = .26. Paired-comparison analyses showed that the mean RT was greater when an uncertainty distractor was presented (M = 623 ms) than when a certainty distractor (M = 608 ms), t(23) = 2.421, p = .024 (Cohen’s d = .494) or no distractor was presented (M = 600 ms), t(23) = 3.917, p < .001 (Cohen’s d = .799). However, the mean RT was not significantly different between trials when no distractor was presented and when a certainty distractor was presented, t(23) = 1.577, p = .128. These data indicate that only the uncertainty distractor elicited VDAC, as shown in the left panel of Fig. 2. The interaction between block and distractor type was also significant, F(2, 46) = 10.84, p < .001, \( {\eta}_p^2 \) = .32 (shown in the right panel of Fig. 2). Separate analyses on each block showed that the main effect of distractor type was significant in the first, F(2, 23) = 13.60, p < .001, \( {\eta}_p^2 \) = .54, and second, F(2, 23) = 3.70, p = .032, \( {\eta}_p^2 \) = .24, blocks. The mean RT for uncertainty distractor trials (M = 641 ms) was greater than that for certainty distractor trials (M = 603 ms), t(23) = 3.99, p < .001 (Cohen’s d = .815) or no distractor trials (M = 606 ms), t(23) = 4.97, p < .001 (Cohen’s d = 1.014) in the first block, but the difference of the mean RT between uncertainty distractors (M = 605 ms) and certainty distractors (M = 613 ms) was not significant in the second block, t(23) = 1.16, p = .257 (see Table 3). The overall PE was 3.71%. No main effect or interaction was significant for the PE data.
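
The text does not state how Cohen's d was computed. The reported values are numerically consistent with the paired-samples variant d = t / √n, and the sketch below illustrates that relation under this assumption. The function name `cohens_dz_from_t` is hypothetical, and small discrepancies in the third decimal are expected because the reported t values are rounded.

```python
import math


def cohens_dz_from_t(t_value, n_participants):
    """Paired-samples effect size, d_z = t / sqrt(n) (assumed variant)."""
    return t_value / math.sqrt(n_participants)


N = 24  # participants in Experiment 1
d_uncertainty_vs_certainty = cohens_dz_from_t(2.421, N)
d_uncertainty_vs_absent = cohens_dz_from_t(3.917, N)

print(round(d_uncertainty_vs_certainty, 3), round(d_uncertainty_vs_absent, 3))
```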

Fig. 2 Mean response times (RTs; in milliseconds) as a function of distractor type (left panel) and block and distractor type (right panel) in the test phase of Experiment 1. Error bars represent ±1 within-subject standard error of the mean (Cousineau, 2005)

Table 3 Mean response times (RTs; in milliseconds, with standard deviations in parentheses) and percent errors (PEs) in Experiment 1 as a function of block and distractor type in the test phase

Discussion

Consistent with previous studies using similar methods (Anderson et al., 2011, 2012; Miranda & Palmer, 2014; Roper & Vecera, 2016), there was no difference in performance in terms of response latency or accuracy for the targets associated with reward in the Training Phase. Importantly, however, a larger attentional capture was observed when the reward uncertainty distractors were presented than when the certainty distractors were presented in the Test Phase. Since the EVs of the two types of distractors were identical, the larger attentional interference by the uncertainty distractors was attributed to uncertainty, indicating that uncertainty modulates value-based attentional capture.

Interestingly, the VDAC for the uncertain reward-associated distractor was evident in the first block but not in the second block of trials. Indeed, additional analyses revealed that the extent of VDAC evoked by the uncertainty distractors was reliably diminished from the first block (35 ms) to the second block (10 ms), F(1, 23) = 6.82, p = .016, \( {\eta}_p^2 \) = .23, indicating the occurrence of extinction. This result implies that uncertainty in the Training Phase modulated the pattern of VDAC in the Test Phase. We discuss this possibility in the General discussion, in conjunction with the findings from the other experiments in the present study.

However, the larger attentional bias towards the uncertainty distractors than the certainty distractors could have been due to factors other than uncertainty. The first possibility is that because the uncertain reward-associated distractor was a previously partially reinforced stimulus while the certain reward-associated distractor was a previously continuously reinforced stimulus, different extinction rates for the two types of distractors could have caused a stronger attentional bias towards the uncertainty distractor than the certainty distractor. However, this possibility can be ruled out because the VDAC for the uncertain reward-associated distractor vanished rapidly in the second block of the Test Phase relative to the VDAC for the certain reward-associated distractor.

The second possibility is that the attentional biases were based on reward expectancy (De Tommaso et al., 2019; Koenig et al., 2017a). In particular, participants might have associated the uncertain target color (100 points in 25% of the trials) with a larger reward and the certain target color (25 points in 100% of the trials) with a smaller reward while ignoring the base rates associated with their outcomes (e.g., Tversky & Kahneman, 1982). However, in many studies manipulating the probabilities of large and small reward deliveries, VDAC was obtained (e.g., Anderson et al., 2011), implying that reward delivery probability, as well as reward magnitude, is a critical factor in reward-association learning.

Another possibility is that dishabituation to the uncertainty target color was caused by rewarded trials, while participants were habituated to the certainty target color because of continuous reward (e.g., Groves & Thompson, 1970). Participants could have earned 100 points on 12.5%, 25 points on 50%, and 0 points on 37.5% of the total trials. Thus, the reward of 100 points could have acted as an oddball in terms of both frequency and reward magnitude, resulting in attentional orienting to the uncertain color.
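
The overall outcome percentages quoted above follow from combining the two reward schedules. The sketch below reproduces the arithmetic under two assumptions that are labeled in the code: each target color appeared on half of the trials, and every response was correct. The variable names are illustrative.

```python
from fractions import Fraction

# Assumption: each target type appeared on half of the Training Phase trials.
p_target = {"uncertainty": Fraction(1, 2), "certainty": Fraction(1, 2)}

# Reward schedules of Experiment 1 (points -> probability given a correct response).
schedule = {
    "uncertainty": {100: Fraction(1, 4), 0: Fraction(3, 4)},
    "certainty": {25: Fraction(1)},
}

# Assumption: all responses are correct, so every trial follows its schedule.
overall = {}
for target, outcomes in schedule.items():
    for points, p in outcomes.items():
        overall[points] = overall.get(points, 0) + p_target[target] * p

for points in sorted(overall, reverse=True):
    print(points, f"{float(overall[points]) * 100}%")
```

The 100-point outcome is thus the rarest event in the session (one trial in eight), which is what makes the oddball account plausible.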

Experiment 2

The uncertainty in Experiment 1 depended mainly on the probability of reward delivery. That is, to associate a color with uncertain reward, no reward (e.g., 0 points) was delivered on some trials and a reward (e.g., 100 points or 25 points) on the other trials. However, the variation of reward magnitude across trials, which is also a critical aspect of value learning, may be related to uncertainty in addition to reward delivery probability. This possibility is supported by previous studies showing that reward probability and reward magnitude are encoded in a dissociable manner (Schutte, Heitland, & Kenemans, 2019; Yacubian et al., 2007). Moreover, in previous animal studies, uncertainty was manipulated by altering the variation in reward magnitude to induce errors in the prediction of reward magnitude. The results showed that variable magnitudes of reward were more salient than non-variable ones (Anselme, 2015; Anselme, Robinson, & Berridge, 2013; Dreher, Kohn, & Berman, 2006; Le Pelley et al., 2019; Shafir, 2000; Walker et al., 2019).

In Experiment 2, to determine the influence of uncertainty arising from the variation of reward magnitude on VDAC, the uncertainty of value during stimulus-reward association was manipulated by varying the variance of reward magnitude without including no-reward trials. For example, one target color was associated with various sizes of reward points, such as 10 points, 25 points, 75 points, and 90 points, in which it is assumed that the prediction about obtainable reward magnitude is uncertain (uncertainty distractor). In contrast, the other target color was associated with a single reward value, 50 points, so that the magnitude prediction was relatively obvious (certainty distractor). In doing so, both target colors were associated only with reward points and not with no-reward. More importantly, there were different levels of uncertainty in terms of reward magnitude prediction errors, not of reward delivery or omission, while EV was constant. In the second experiment, we were able to examine the effect of uncertainty on attentional bias while minimizing the contributions of value expectancy, the dishabituation by the high-value outcome, and different extinction rates for previously partially reinforced uncertain distractors and previously continuously reinforced certain distractors. If prediction errors in reward learning, indicating reward uncertainty, modulate VDAC, then the uncertainty distractor would induce greater but shorter-lasting VDAC compared to the certainty distractor.

Method

Participants

A new group of participants (mean age = 22.7 years, 11 females) from Korea University took part in Experiment 2 and were given the same monetary reward as in the previous experiment (KRW 7,000). All participants had normal or corrected-to-normal visual acuity and normal color vision according to self-report.

Training Phase

Apparatus and stimuli

Apparatus and stimuli were the same as those used in the previous experiment.

Procedure and design

The procedure and design of Experiment 2 were identical to those of Experiment 1 with the exception of the manipulation of reward. Critically, to elicit different reward magnitude variations with the same expected value, one of the two target colors was associated with a constant reward score, such as 50 points (certainty target). The other target color was associated with variable magnitude reward scores, such as 10, 25, 75, or 90 points (uncertainty target). Note that the proportions of variable reward scores were balanced across different scores (25% for each score) and, unlike the previous experiment, there were no no-reward trials for either type of target color. Therefore, the EV for both target colors was identical (50 points) but the variations in reward magnitude differed for the colors.
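
The equality of the EVs and the difference in magnitude variation can be verified directly. The following is a minimal sketch using the general definitions of expected value and variance for a discrete outcome distribution. The function name `reward_stats` is hypothetical, and the variance shown is computed here rather than taken from the paper's tables.

```python
def reward_stats(outcomes):
    """Return (expected value, variance) for a dict mapping
    reward points -> probability of receiving that reward."""
    ev = sum(points * p for points, p in outcomes.items())
    variance = sum(p * (points - ev) ** 2 for points, p in outcomes.items())
    return ev, variance


# Reward schedules of Experiment 2 as described in the text.
uncertainty_target = reward_stats({10: 0.25, 25: 0.25, 75: 0.25, 90: 0.25})
certainty_target = reward_stats({50: 1.0})

print(uncertainty_target)  # (50.0, 1112.5)
print(certainty_target)    # (50.0, 0.0)
```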

Test Phase

The method of the Test Phase in Experiment 2 was identical to that of the previous experiment.

Results

The same criteria as in Experiment 1 were applied to trim the RT and PE data in Experiment 2. As a result, 1.6% of trials were removed from analyses in the Training Phase and 1.6% of trials in the Test Phase. Mean correct RT and PE were calculated for each participant as a function of the block (first, second, and third block) and target type (uncertainty target and certainty target) in the Training Phase, and block (first and second block) and distractor type (uncertainty distractor, certainty distractor, and distractor absent) in the Test Phase. Repeated-measures ANOVAs were conducted on the mean RT and PE data, with those variables as within-subject factors for each phase.

Training Phase

The overall RT was 563 ms. As in the previous experiment, the main effect of block was significant, F(2, 46) = 21.96, p < .001, \( {\eta}_p^2 \) = .49, showing that the mean RT decreased across blocks (Ms = 602 ms, 550 ms, and 540 ms for the first, second, and third blocks, respectively). Neither the main effect of target type, F(2, 46) = 1.51, p = .231, nor the interaction between block and target type, F(2, 46) = 2.46, p = .096, was significant (Table 4). The overall PE was 3.85%. A significant main effect of block was obtained, F(2, 46) = 9.26, p < .001, \( {\eta}_p^2 \) = .29, which was due to there being more errors in the first block (5.3%) than in the other blocks (i.e., 3.2% and 3.0% in the second and third blocks, respectively). No other main effect or interaction was significant, Fs < 1.
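As a check on the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity \( {\eta}_p^2 = (F \cdot df_{effect}) / (F \cdot df_{effect} + df_{error}) \). The sketch below applies this identity to the two block effects reported above; the function name is illustrative, and the original analysis software is not specified in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effects of block in the Training Phase, as reported in the text.
block_rt = partial_eta_squared(21.96, 2, 46)
block_pe = partial_eta_squared(9.26, 2, 46)
print(round(block_rt, 2), round(block_pe, 2))
```

The two values round to .49 for RT and .29 for PE.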

Table 4 Mean response times (RTs; in milliseconds, with standard deviations in parentheses) and percent errors (PEs) in Experiment 2 as a function of target type in the training phase and distractor type in the test phase

Test Phase

The overall mean RT was 594 ms. The main effect of block was marginally significant, F(1, 23) = 4.21, p = .051, indicating a trend for the mean RT to decrease across blocks (Ms = 605 ms and 587 ms in the first and second blocks, respectively). The main effect of distractor type was significant, F(2, 46) = 6.73, p = .003, \( {\eta}_p^2 \) = .23. As in the previous experiment, paired comparisons showed that the mean RT was greater when either type of distractor was presented than when no distractor was presented (M = 585 ms), ts(23) > 2.971, ps < .007 (Cohen’s ds > .606), but the same amount of interference was generated by uncertainty distractors (16 ms) and certainty distractors (16 ms), t(23) = 0.064, p = .950. Interestingly, however, a significant interaction between block and distractor type was obtained, F(2, 46) = 8.13, p < .001, \( {\eta}_p^2 \) = .26 (Table 5). Separate analyses on each block demonstrated that the main effect of distractor type was significant in the first block, F(2, 23) = 13.96, p < .001, \( {\eta}_p^2 \) = .55, and in the second block, F(2, 23) = 3.30, p = .046, \( {\eta}_p^2 \) = .22. Additional analyses showed that the interference caused by the uncertainty distractor was significant in the first block (32 ms), t(23) = 5.394, p < .001 (Cohen’s d = 1.101), but not in the second block (0 ms), t(23) < 1 (Fig. 3). In contrast, the interference caused by the certainty distractor was significant in the first block (15 ms), t(23) = 2.825, p = .010 (Cohen’s d = .577), as well as in the second block (18 ms), t(23) = 2.452, p = .022 (Cohen’s d = .50). Importantly, whereas the mean RT for uncertainty distractors (M = 621 ms) was greater than the mean RT for certainty distractors (M = 604 ms) in the first block, t(23) = 2.504, p = .02 (Cohen’s d = .511), the mean RT was marginally greater for the certainty distractors (M = 598 ms) than for the uncertainty distractors (M = 581 ms) in the second block, t(23) = 1.83, p = .08.
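The Cohen's d values reported for the paired comparisons are consistent with the conversion d = t / √n, where n = df + 1 is the number of participants. This formula is inferred from the reported statistics and is not stated in the text, so the sketch below should be read as a plausibility check under that assumption; the function name is illustrative.

```python
import math


def cohens_d_from_paired_t(t_value, df):
    """Cohen's d for a paired comparison, assuming d = t / sqrt(n) with n = df + 1."""
    n = df + 1  # assumption: one observation per participant per condition
    return t_value / math.sqrt(n)


# Paired comparisons from the Test Phase, as reported in the text.
for t_value in (5.394, 2.825, 2.504):
    print(t_value, round(cohens_d_from_paired_t(t_value, 23), 3))
```

With df = 23, these t values yield d = 1.101, .577, and .511, matching the reported effect sizes.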

Table 5 Mean response times (RTs; in milliseconds, with standard deviations in parentheses) and percent errors (PEs) in Experiment 2 as a function of block and distractor type in the test phase
Fig. 3 Mean response times (RTs; in milliseconds) as a function of block and distractor type in the test phase of Experiment 2. Error bars represent ±1 within-subject standard error of the mean (Cousineau, 2005)
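Within-subject error bars of the kind cited in the caption are commonly obtained by removing between-participant variability before computing the standard error: each participant's scores are centered on that participant's own mean and then shifted by the grand mean. The sketch below illustrates this normalization on hypothetical data; the function name and the example values are assumptions for illustration and are not the data of Experiment 2.

```python
import math
import statistics


def within_subject_sem(data):
    """Per-condition SEM after participant-mean normalization.

    `data` is a list of rows, one row per participant, with one mean RT
    per condition in each row.
    """
    grand_mean = statistics.fmean(v for row in data for v in row)
    # Remove each participant's overall level, then restore the grand mean.
    normalized = [[v - statistics.fmean(row) + grand_mean for v in row]
                  for row in data]
    n = len(data)
    sems = []
    for j in range(len(data[0])):
        column = [row[j] for row in normalized]
        sems.append(statistics.stdev(column) / math.sqrt(n))
    return sems


# Hypothetical mean RTs (ms) for three participants in three conditions.
example = [[600, 620, 610], [500, 520, 510], [700, 720, 710]]
print(within_subject_sem(example))
```

In this hypothetical example the participants differ only in overall speed, so the normalized scores are identical across participants and the within-subject standard errors are zero, whereas conventional between-subject standard errors would be large.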

The overall PE was 4.16%. A significant main effect of block was found, F(1, 23) = 6.36, p = .019, \( {\eta}_p^2 \) = .22, indicating that more errors were made in the second block (4.7%) compared to the first block (3.6%). The main effect of distractor type was marginally significant, F(2, 46) = 2.77, p = .072, reflecting that more errors tended to be made on the uncertainty distractor trials (4.9%) than on the certainty distractor trials (3.4%), F(1, 23) = 3.69, p = .067. The interaction between the block and distractor type was not significant, F < 1.

Discussion

The results of Experiment 2 showed that the distractors associated with uncertain magnitude of rewards elicited a larger VDAC compared to the certain reward distractors. Specifically, in the first block, the amount of the VDAC by the uncertainty distractors (32 ms) was greater than that of the VDAC by the certainty reward distractors (15 ms). This attentional interference by the uncertainty distractors vanished in the second block (0 ms), similar to the interference effect observed as a result of uncertain distractors in Experiment 1. Consequently, these findings indicate that the distractors associated with uncertainty in reward magnitude captured attention more so than the distractors associated with certainty in reward magnitude, but the persistence of the interference elicited by the variable reward-associated distractors was short.

Unlike in Experiment 1, the uncertainty distractor was associated with rewards that were either larger (75 or 90 points) or smaller (10 or 25 points) than the reward associated with the certainty distractor (50 points). Thus, the obtained attentional biases were not based simply on reward expectancy (De Tommaso et al., 2019; Koenig et al., 2017a). Moreover, because both types of distractors had previously been continuously reinforced, the attentional bias towards the uncertainty distractor was not due to a difference in extinction rate between them. However, there were inhibitory learning trials when the uncertainty target color was presented, whereas no such trial was included when the certainty target was presented in the Training Phase. Thus, the attentional bias was possibly due to the difference in the reward association between the uncertainty and certainty colors (e.g., Miller, Barnet, & Grahame, 1995; Rescorla & Wagner, 1972).

General discussion

In the present study, attentional bias toward reward-associated features was examined when the probabilistic relationship between the feature and reward varied in the uncertainty of value, but not in the expected value. In Experiment 1, uncertainty was tuned by varying the reward provision or omission. In Experiment 2, uncertainty was manipulated by varying the magnitude of the reward. The results from Experiments 1 and 2 consistently demonstrated that the influence of reward uncertainty on attentional bias by value-associated stimuli was twofold. First, the amount of attentional capture was greater with the uncertain reward-associated distractors than with the certain reward distractors. Second, the level of associated-reward uncertainty modulated the persistence of the VDAC in both experiments. In particular, the uncertainty distractors attracted more attention than the certainty distractors in the first block of the Test Phase, but not in the later blocks of the phase in both experiments. These dynamic changes in interference were not found consistently for the VDAC obtained by the certainty distractors. In short, when EV was held constant, uncertainty was found to modulate VDAC.

The distractors associated with uncertain reward captured attention more than the distractors associated with certain reward in the first block of Experiment 2, consistent with Experiment 1. However, when both blocks of the Test Phase were considered, unlike in Experiment 1, the amount of VDAC by the distractors associated with uncertain reward was not significantly different from that by the distractors associated with certain reward. This different pattern of modulation by uncertainty was possibly observed because uncertainty was lower in Experiment 2 than in Experiment 1. As seen in Table 1, the variance was 1,875 in Experiment 1 and 1,112.5 in Experiment 2.

Uncertainty changes the size and persistence of VDAC

The findings of the two experiments showed that uncertainty in value prediction enlarges the effect of attentional capture by reward-paired stimuli to some degree in the Test Phase. Consistent with Pearce-Hall’s uncertainty-based attention theory (Pearce & Hall, 1980), we found that participants deployed their attention toward an uncertain reward-associated stimulus more than a certain reward one, even when the stimulus was irrelevant and non-salient. The theory suggests that attentional deployment towards an uncertainty-related stimulus is based on the demand for further processing to learn the relationship between the stimulus and its outcome (Gottlieb, 2012; Pearce & Hall, 1980). Since the distractors signaled an identical level of expected value in our study, the attentional bias towards uncertain distractors indicates that participants allocated their attention to the uncertainty distractors to learn the probabilistic relationship between the stimulus and reward delivery (Experiment 1) or reward magnitude (Experiment 2). Furthermore, recent studies show that stimuli predicting a valence-related outcome, such as reward (Koenig et al., 2017a; Le Pelley et al., 2019), capture attention during value learning when they are involved with outcome uncertainty. In contrast, Sali, Anderson, and Yantis (2014) found that when a target feature was associated with a constant amount of reward (no prediction error), this stimulus feature failed to induce VDAC in a subsequent test phase. Beyond such findings, our study found that attentional modulation based on uncertainty was continuously obtained when reward was no longer delivered, even when the effect of EV was controlled. Thus, attentional biases for uncertainty distractors as reported here suggest that uncertainty is a crucial modulatory factor in value-driven attentional capture.
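The idea that attention tracks outcome uncertainty can be illustrated with a simplified Pearce-Hall-style update, in which the associability of a cue follows the recent absolute prediction error. The sketch below is not a model fitted to the present data: the update rule is a simplified variant, and the parameter values (`salience`, `gamma`, the initial associability), the rescaling of points to a 0-1 range, and the fixed trial order are all assumptions chosen for illustration.

```python
def simulate_associability(rewards, n_cycles=25, salience=1.0, gamma=0.5):
    """Return the cue's associability after training on a repeating reward cycle.

    `rewards` are outcomes rescaled to the range 0-1 (points / 100).
    All parameter values are illustrative assumptions.
    """
    value = 0.0   # learned reward prediction
    alpha = 0.5   # associability (attention paid to the cue)
    for _ in range(n_cycles):
        for reward in rewards:
            error = reward - value
            # The value update is scaled by the current associability.
            value += salience * alpha * error
            # Associability tracks the recent absolute prediction error.
            alpha = gamma * abs(error) + (1.0 - gamma) * alpha
    return alpha


alpha_certainty = simulate_associability([0.50] * 4)
alpha_uncertainty = simulate_associability([0.10, 0.25, 0.75, 0.90])
print(alpha_certainty, alpha_uncertainty)
```

Under these assumptions, associability decays toward zero for the cue paired with a constant reward, because its prediction error vanishes, but remains elevated for the cue paired with variable rewards, even though both cues predict the same expected value.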

Interestingly, attentional interference by uncertainty distractors vanished rapidly relative to the interference by certainty distractors. It has been demonstrated that partially reinforced responses are more resistant to extinction than continuously reinforced ones in the context of instrumental conditioning (see Mackintosh, 1975), which is called the partial reinforcement extinction effect (PREE). This PREE has also been found in Pavlovian conditioning, indicating that a conditioned stimulus (CS) that is intermittently associated with an unconditioned stimulus (US) is more resistant to extinction than a CS that is continuously paired with the US (Haselgrove, Aydin, & Pearce, 2004; Pearce, Redhead, & Aydin, 1997; Rescorla, 1999). Since uncertainty was manipulated by reward provision and omission, the reinforcement of reward uncertainty may be understood as an intermittent or partial reinforcement schedule. If so, as the reward was withdrawn in the Test Phase, the short persistence of attentional interference by the uncertainty distractor relative to that by the certainty one seems to contradict the concept of the PREE.

However, the Training and Test Phases differed not only in whether a reward was delivered or not but also in the task context, including the task goal and target and non-target features. Taking this into account, a possibility exists that attentional processing of uncertainty-related features is more context-dependent than that for certainty. This context-dependence indicates that when a stimulus is associated with an outcome in a specific context, the effect of associative learning is more evident in this same context than in other contexts. Indeed, a considerable number of researchers have argued that context-dependence varies depending on the ambiguity of the relationship between a CS and an outcome (Bouton, 1997, 2002; Nelson, 2002; Rosas & Callejas-Aguilera, 2006; Rosas, Vila, Lugo, & López, 2001). Bouton (1997) suggests that when a CS becomes ambiguous about an upcoming outcome, the context is given attention so that the ambiguity of the CS can be resolved. Thus, the informativeness of the CS is updated within a particular or given context. In other words, context plays an important role in retrieving stimulus-outcome associations. From this perspective, the attentional interference by the uncertainty distractors disappeared rapidly, because the task context changed in the Test Phase. In short, although uncertainty strengthens VDAC, its modulatory effect might be susceptible to context switching, resulting in modulation of the persistence of VDAC.

Value representation underlying uncertainty-based attentional modulation

Note that the modulation of attention by uncertainty was obtained in the Test Phase, in which reward was no longer provided, but it was not evident in the Training Phase, in which reinforcement learning occurred. This result is inconsistent with the findings from Koenig et al.’s (2017a) study showing that the stimulus features related to reward uncertainty modulated attentional control during value learning. Regarding this inconsistency, there is one noteworthy difference between the previous and present studies in their examination of the uncertainty effect on attention. In previous studies, researchers instructed participants to infer what outcome could be expected among the possible candidates of outcomes for given target stimuli (Beesley et al., 2015; Easdale et al., 2019; Hogarth et al., 2008; Luque et al., 2016). Importantly, the specific features of target stimuli were associated with particular outcomes with some degree of probability, including uncertainty (e.g., 50%) or certainty. Thus, the uncertainty in these studies stems from the probabilistic relationship between the target stimulus and its specific response. Similarly, in the research studying the uncertainty of outcomes (Koenig et al., 2017a), the delivery and the size of reward varied based on participants’ responses to given target stimuli during the learning task. Thus, uncertainty in these studies was also directly associated with the relationship between a given stimulus and a performer’s predictive choice. Therefore, the uncertainty in these studies depends on the value representation formed based on task-relevant predictive choices in the learning task. This implies that attentional modulation depending on the level of uncertainty is strategically required to improve task performance. Thus, attentional modulation by uncertainty is more likely to be reflected in performance even during associative learning.

In contrast to this strategic attentional control over uncertainty, in the present study value was relatively independent of task performance in the Training Phase. That is, the probability and magnitude of reward were determined by the color of the target circle, regardless of participants’ responses. Specifically, in the Training Phase the targets were defined by specific colors, and participants were required to identify the orientation of a line in the target, not the color of the target. Therefore, although reward learning progresses depending on target features, performance in the Training Phase is less likely to be affected by value representation, consistent with previous findings that the modulatory effect of value on attentional performance is obtained only in the subsequent task, but not in the task for reward learning (Anderson et al., 2011, 2012; Miranda & Palmer, 2014; Roper & Vecera, 2016).

Instead, value representation in the present study is more likely to be learned automatically, independent of the visual search task in the Training Phase. Indeed, many previous studies have demonstrated that value representation is formed even when the current task is not directly relevant to evaluation performance (Anderson et al., 2011; Bucker & Theeuwes, 2017; Hickey et al., 2010; Kim, Adolphs, O’Doherty, & Shimojo, 2007; Libera & Chelazzi, 2006; Theeuwes & Belopolsky, 2012; for review, see Grueschow, Polania, Hare, & Ruff, 2015). A neural imaging study has also revealed that the posterior cingulate cortex is functionally related to automatic coding for value that possibly operates in a choice-irrelevant task, whereas the medial prefrontal cortex is relatively involved with the value represented in a choice-relevant context (Grueschow et al., 2015). Therefore, the present study demonstrates that value representation about reward uncertainty is generated automatically, so that value-driven attention is modulated by uncertainty even in the subsequent task without reward.

It is important to note that the different amounts of attentional bias towards the two colors were possibly due to different learning rates for excitatory and inhibitory learning. It has been suggested that the learning rate is smaller for inhibitory learning trials than for excitatory learning trials (e.g., Miller et al., 1995; Rescorla & Wagner, 1972). In the current experiments, inhibitory learning trials were interspersed when the uncertainty target was presented in the Training Phase, while fast and correct responses were always rewarded when the certainty target was presented, possibly resulting in a stronger reward association for the uncertainty-related color than for the certainty-related one. Thus, future studies are needed to determine whether attentional modulation by uncertainty can occur even when the difference in learning rates for excitatory and inhibitory learning is controlled.
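The learning-rate account above can be made concrete with an error-driven update in the spirit of Rescorla and Wagner (1972), in which value increases are learned faster than value decreases. The sketch below applies such a rule to the reward magnitudes of Experiment 2. Treating negative prediction errors as the analog of inhibitory trials, as well as the specific rates, trial order, and function name, are assumptions made for illustration only.

```python
def learn_value(rewards, n_cycles=200, rate_increase=0.2, rate_decrease=0.05):
    """Learned value after repeated cycles through `rewards`.

    Positive prediction errors (excitatory) use `rate_increase`; negative
    prediction errors (treated here as inhibitory) use `rate_decrease`.
    Both rates are illustrative assumptions.
    """
    value = 0.0
    for _ in range(n_cycles):
        for reward in rewards:
            error = reward - value
            rate = rate_increase if error > 0 else rate_decrease
            value += rate * error
    return value


value_certainty = learn_value((50,))
value_uncertainty = learn_value((10, 25, 75, 90))
print(value_certainty, value_uncertainty)
```

With a smaller rate for negative than for positive prediction errors, the learned value of the variable-reward color settles above 50 points, whereas the learned value of the constant-reward color converges on 50 points. This illustrates how asymmetric learning rates alone could produce a stronger reward association for the uncertainty-related color despite identical expected values.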

Conclusion

To predict an outcome as a future gain or loss, we are required to learn about the complex relationship between stimuli and outcomes – either intentionally or automatically. While previous research has mainly provided evidence of attentional exploitation based on expected values of stimuli, we suggest that attention is modulated by uncertainty in terms of the size and persistence of value-driven attentional capture. This attentional exploration, meaning that our attention is biased involuntarily on the basis of the uncertainty in value representation, implies that our cognition is not simply greedy but practical in exploring the nature of the relationship between different events.