Repeated distractor configurations can lead to reduced search times even if no repetition is noticed by the observers. This effect has been called contextual cueing—a form of implicit, incidental learning (Chun & Jiang, 1998). Recently, the question has been raised whether contextual cueing depends on working memory resources (Manginelli, Geringswald, & Pollmann, 2012; Travis, Mattingley, & Dux, 2013; Vickery, Sussman, & Jiang, 2010; see also Woodman & Chun, 2003).

What role could working memory play in contextual cueing? Contextual cueing has been observed over extensive time periods of up to one week (Chun & Jiang, 2003)—indicating that it relies on long-term memory representations. Working memory may facilitate encoding into long-term memory. Ample evidence has revealed that the maintenance (R. L. Greene, 1987; Ranganath, Cohen, & Brozinsky, 2005) and manipulation (Blumenfeld & Ranganath, 2006; Bower, 1970; Davachi & Wagner, 2002) of items in working memory contribute to long-term memory formation. However, it is less clear whether working memory is needed for more implicit forms of encoding. In the case of contextual cueing, a visual search advantage for repeated displays was observed even though the participants’ attention was distracted from the repeated display items (Jiang & Leung, 2005). In the same study, however, the visual search advantage was eliminated when attention was distracted away from the repeated items after they had been learned. This was interpreted as indicating that attention is necessary for the expression of learning, but not for learning itself. A related distinction was also observed in sequence learning (Frensch, Lin, & Buchner, 1998).

Jiang and Leung (2005) varied attentional but not working memory demands. In their experiments, participants searched in displays that consisted of black and white items. They were instructed that the target was either black or white. In this way, participants were led to select either the black or the white items. It was found that repeated configurations were learned even when they were not attended to (e.g., black items in the search for a white target). However, when the colors were reversed after an initial learning phase, contextual cueing was only observed when the repeated items had been attended to (e.g., repeated items in white when searching for a white target; see also Geringswald, Baumgartner, & Pollmann, 2012, for the dependence of contextual cueing on foveal attention). This distinction—between the dependence of context learning and expression of learning on attention—might also be of importance for the role of working memory in contextual cueing.

Recently, several researchers have investigated the role of visual working memory in contextual cueing. One series of experiments yielded no evidence for working memory load effects on contextual cueing (Vickery et al., 2010). In these experiments, visual search proceeded under working memory load during an initial learning phase, whereas searching in novel and repeated displays was tested without a secondary working memory task in a subsequent testing phase. Thus, it was concluded that repeated configurations could be learned under the working memory load of a secondary task. In a study from our lab, the loss of search facilitation for repeated displays was observed when a concurrent visuospatial working memory load was applied (Manginelli et al., 2012). This elimination of contextual cueing was selective for visuospatial working memory load; it did not occur when a nonspatial (color) working memory load was added. In contrast to the report by Vickery et al., working memory load was present during the whole experiment, and thus could interfere not only with the learning of context but also with the expression of a learned context. Together, these studies suggested that visuospatial working memory is needed for the expression of learning, but not for learning itself.

A third study revealed that contextual cueing was eliminated under visuospatial working memory load, but remained intact for items that were learned under visuospatial working memory load and subsequently tested without load (Travis et al., 2013). However, the results were confounded by differing learning procedures: The elimination of contextual cueing was observed with the interleaved learning of repeated and new displays, whereas intact contextual cueing was observed after the massed learning of repeated displays only. In a third experiment from this study, again with interleaved learning, it was found that working memory load could interfere with contextual cueing during both the learning and testing phases, depending on the amount of load. One drawback of this study was that the two experiments with interleaved learning were not entirely comparable, due to differences in the psychological expertise of the participants.

Thus, the evidence for the role of working memory has come from comparisons among studies that were not entirely comparable, including comparisons between different paradigms used by different labs, different learning regimes, and potentially different populations. Furthermore, working memory load has been imposed throughout the testing session or restricted to a learning session, but it has not been added in a testing session following a learning phase without load. Therefore, we have only indirect evidence for the hypothesis that working memory load affects the expression of learning.

In the present experiments, we added a secondary working memory task during the learning or testing phase in order to obtain direct evidence for a visual working memory contribution to either the learning or the expression of learning of contextual cues.

Moreover, it is not entirely clear whether contextual cueing relies specifically on visuospatial working memory resources or on visual working memory resources in general. Vickery et al. (2010) tested both spatial and nonspatial load, but only during learning. In our previous study, in which a specific interference between visuospatial working memory load and contextual cueing was observed, there was no separation of the learning and test phases. Therefore, we now added either visuospatial or nonspatial visual working memory load in order to investigate the specificity of visuospatial working memory demands. Furthermore, we applied two different operationalizations each of visuospatial and nonspatial working memory load, to rule out that this distinction is confounded by the use of a particular task. In addition, we applied an articulatory suppression task to rule out verbalization of the working memory items. Articulatory suppression had not been applied in some of the previous studies (Travis et al., 2013; Vickery et al., 2010), thus leaving verbalization as a potential confound.

To summarize, working memory load may affect contextual cueing by interfering with (1) implicit context learning itself or (2) the expression of learning. The available evidence has made interference with learning itself unlikely (but see Travis et al., 2013). Instead, indirect evidence suggests that visuospatial working memory load can affect the expression of learning; contextual cueing was observed after working memory load during learning (Vickery et al., 2010), but not when the working memory load continued during the test phase. This is the hypothesis that we aimed to investigate in the present study. In a series of four experiments, we tested the effect that concurrent visuospatial and nonspatial working memory loads have on contextual cueing. In two of the experiments, two different types of spatial and nonspatial working memory loads were applied during the learning phase and then removed during a subsequent test phase. In two more experiments, this was reversed: Concurrent working memory load was applied during the test phase, after repeated search contexts were learned in the absence of working memory load. Finally, a fifth experiment featuring a contextual-cueing task without working memory load served as the baseline. If visuospatial working memory is needed for retrieving and/or maintaining visuospatial long-term memory cues during search, we expected to find interference of visuospatial (but not of nonspatial visual) working memory load with contextual cueing in the test phase. In contrast, if working memory load interferes with implicit context learning itself, we expected an addition of working memory load during the learning phase to reduce contextual cueing.

Experiment 1: Spatial working memory load and context learning

In the first experiment, we combined a standard contextual-cueing paradigm with a concurrent working memory load in an initial learning phase. As we outlined above, we expected that implicit contextual learning would occur, despite the added working memory load.

Method

Participants

A group of 20 participants (14 females, six males; average age 24.9 years) took part in Experiment 1a, and another 20 participants (seven females, 13 males; average age 24.5 years) in Experiment 1b, after giving informed consent. All of the participants had normal or corrected-to-normal vision. They were paid or were compensated with course credits, and were naive as to the purpose of the experiment. This and all following experiments were approved by the Ethics Board of the Medical Faculty of the University of Magdeburg.

Experimental setup

Stimuli were presented on a 24-in. flat-screen color monitor (resolution of 1,920 × 1,200 pixels, refresh rate of 60 Hz) at a viewing distance of 55 cm. Each participant’s head was secured by a chinrest. The experiment was performed using MATLAB (version 7.4.0, R2007a; The MathWorks, Sherborn, MA) with the OpenGL extension for the Psychophysics Toolbox, Version 3.0.8 (Brainard, 1997).
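The stimulus sizes below are given in degrees of visual angle. As a rough aid for relating them to screen coordinates, the following minimal sketch converts a visual angle to pixels for the setup described above. The function name and the assumption of square pixels (so that the physical pixel density follows from the 24-in. diagonal and the 1,920 × 1,200 resolution) are illustrative and are not taken from the experiment code, which was written in MATLAB.

```python
import math


def degrees_to_pixels(angle_deg, distance_cm=55.0, diagonal_in=24.0,
                      res_x=1920, res_y=1200):
    """Convert the visual angle subtended by a centered stimulus to pixels.

    Hypothetical helper. Assumes square pixels, so that pixel density can be
    derived from the screen diagonal and the pixel resolution alone.
    """
    # Pixels along the diagonal divided by the diagonal length in cm.
    diagonal_px = math.hypot(res_x, res_y)
    px_per_cm = diagonal_px / (diagonal_in * 2.54)
    # Physical extent of a stimulus subtending angle_deg at the viewing distance.
    size_cm = 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)
    return size_cm * px_per_cm


if __name__ == "__main__":
    # Size in pixels of a 0.6 deg stimulus under the assumptions above.
    print(round(degrees_to_pixels(0.6), 1))
```

Under these assumptions, a 0.6° item spans a little more than 21 pixels; a monitor with a different physical panel size would give a different value.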

Stimuli

Working memory task

In Experiment 1a, four black squares, each subtending 0.6° × 0.6° of visual angle, were presented on a gray background. The positions of the four squares were randomly chosen among eight equidistant locations available on an imaginary circle, with a radius of about 3° of visual angle, that was centered at the fixation point (a white cross, 0.6° × 0.6° of visual angle). In Experiment 1b, two Gabor patches, each subtending about 1° × 1° of visual angle, were presented on a gray background. They were positioned on an imaginary circle with a 2° radius around the central fixation point (again a white cross of 0.6° × 0.6° of visual angle), with one on the left and one on the right side. The parameters for the Gabors were chosen as follows: phase = 0.25, wavelength (number of pixels per cycle) = 10, frequency = 7.4. The orientations of the two patches were each randomly chosen from among the following values: 0°, 45°, 90°, and 135°.

Contextual-cueing task

In both Experiments 1a and 1b, the stimuli for the contextual-cueing task were one 90°- or 270°-rotated T (the target) and multiple 0°-, 90°-, 180°-, and 270°-rotated Ls (the distractors) subtending 0.6° × 0.6° of visual angle. Each display contained one target and 15 distractors placed on four imaginary concentric circles with radii of 1.7°, 3.4°, 5.1°, and 6.8° (Fig. 1). The targets could only appear on the second and third circles, and the distribution of distractors was balanced so that four items were always presented in each quarter of the screen. The colors of both the target and distractors were selected from among yellow, red, blue, and green. The background was always gray (RGB = 128, 128, 128).

Fig. 1

Grayscale version of the contextual-cueing paradigm. (Top) “Old” configurations: The target location and color, as well as the distractor locations, colors, and orientations, are repeated throughout the experiment. (Bottom) “New” configurations: Target locations are kept constant, while the distractor locations, colors, and orientations are newly generated in each block. Shades of gray indicate different colors

Procedure

The experiment lasted approximately 90 min and was divided into five phases: training on the dual task, training on the single task, 15 blocks of learning (dual task), five blocks of test (single task), and an explicit recognition test on the contextual-cueing displays. At the beginning of each phase, participants received instructions displayed on the screen about which task they were to perform. In both training phases, 12 trials were presented using randomly generated displays. Concerning the contextual-cueing task, target locations used during the actual experiment were avoided in the training displays.

Each block of the experiment consisted of 24 trials, for a total of 360 trials in the learning and 120 trials in the test phase. Participants were unaware that four additional, randomly generated trials were presented at the beginning of the first block of each phase in order to reduce variance due to a sudden onset of the actual experiment. These trials were not included in any analysis. Between blocks, participants were allowed to rest until they pressed the down arrow key to initialize the next block.

In the contextual-cueing paradigm, two conditions were defined—“old” and “new”; both consisted of 12 configurations that were presented once in each block. In the “old” condition, the position, orientation, and color of the distractors were kept constant, as well as the position and color of the target, while the orientation (left or right) of the target was randomly varied in each trial, in order to avoid response preparation. In the “new” condition, only the set of 12 target locations was preserved (to avoid a confound of presentation frequency and repetition of distractor configuration); the distractor configuration was randomly generated in each block. Each of the 24 targets was uniquely associated with a condition and—in the case of old displays—with a specific configuration.
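The logic of the old/new manipulation can be sketched as follows. This is a simplified illustration, not the original MATLAB implementation: target locations and distractor slots are abstract indices, the number of slots (`n_slots`) is a hypothetical value, the order of trials within a block is assumed to be random, and the geometric constraints of the real displays (concentric circles, four items per quadrant) are ignored.

```python
import random

N_PER_CONDITION = 12   # 12 old and 12 new target locations
N_DISTRACTORS = 15
ORIENTATIONS = [0, 90, 180, 270]
COLORS = ["yellow", "red", "blue", "green"]


def make_layout(rng, n_slots=48):
    """Draw one distractor layout as (slot, orientation, color) triples.

    n_slots is a hypothetical number of abstract distractor positions.
    """
    slots = rng.sample(range(n_slots), N_DISTRACTORS)
    return [(s, rng.choice(ORIENTATIONS), rng.choice(COLORS)) for s in slots]


def build_blocks(n_blocks=20, seed=0):
    """Build blocks of 24 trials: 12 old layouts repeat, 12 new ones do not."""
    rng = random.Random(seed)
    targets = list(range(2 * N_PER_CONDITION))
    rng.shuffle(targets)
    old_targets = targets[:N_PER_CONDITION]
    new_targets = targets[N_PER_CONDITION:]
    # Old layouts are generated once and reused in every block.
    old_layouts = {t: make_layout(rng) for t in old_targets}
    blocks = []
    for _ in range(n_blocks):
        trials = []
        for t in old_targets:
            trials.append({"target": t, "condition": "old",
                           "layout": old_layouts[t],
                           "target_orientation": rng.choice(["left", "right"])})
        for t in new_targets:
            # New layouts are regenerated in each block; the target location
            # stays the same, so that location frequency is matched.
            trials.append({"target": t, "condition": "new",
                           "layout": make_layout(rng),
                           "target_orientation": rng.choice(["left", "right"])})
        rng.shuffle(trials)  # assumption: random trial order within a block
        blocks.append(trials)
    return blocks
```

Because the target orientation is drawn anew on every trial, a repeated layout predicts where the target is but not which response is required.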

In the first 15 blocks (learning), the dual task was performed (Fig. 2). Each trial started with the presentation of the fixation point (a white cross centered on the screen on a gray background) for 2,000 ms. During this time, participants heard two digits randomly chosen between 1 and 9. They were asked to rehearse the two digits until a memory test was provided at the end of the trial. In Experiments 1b, 2b, 3b, and 4b, participants were instructed to rehearse the numbers aloud. Rehearsal was checked by the experimenter, who sat in a room adjacent to the test chamber. The auditory stimulus was followed by the presentation of the visuospatial working memory array, together with the fixation cross, for 500 ms, which was followed by a 1,500-ms presentation of the fixation cross. Next, the search display appeared and remained until the participant responded or until a maximum of 3,500 ms had elapsed. Participants were instructed to make a forced choice buttonpress with their right hand, in accordance with the pointing direction of the stem of the rotated T. They were instructed to be as fast and as accurate as possible. Following the response, they received auditory feedback on the performed task: a 1,500-Hz high-pitch tone for a correct answer, or a 300-Hz low-pitch tone for an incorrect answer. When the search display disappeared, the fixation cross was displayed again for a duration ranging from 500 to 4,000 ms. The duration depended on the response time of the participant in the preceding visual search task, so that in each trial, a constant retention period of 4,000 ms for the spatial working memory array was given.

Participants then proceeded by performing a memory test on the spatial working memory task: In Experiment 1a, a black square (the memory probe) was presented in one of the eight available positions defined on the imaginary circle, together with the fixation cross, for a maximum of 3,000 ms. Within this time, participants had to indicate by a left-hand forced choice response whether the position of the square matched one of the four squares previously presented in the working memory display. In Experiment 1b, the memory probe consisted of one Gabor patch centered at the fixation point, and participants had to indicate within a maximum time of 3,000 ms by a left-hand forced choice response whether the probe matched one of the two patches previously presented. This task was chosen because tests of stimulus rotation have been shown to draw on visuospatial working memory resources (Miyake, Friedman, Rettinger, Shah, & Hegarty, 2001). In both Experiments 1a and 1b, participants received auditory feedback on the correctness of their answers by means of the same pitch-tones used in the visual search task. On half of the trials, the probe matched one of the locations or orientations in the memory array. After 1,000 ms of fixation, the memory test concerning the articulatory suppression task was performed. Two white digits were displayed in the center of the screen on a gray background for a maximum of 3,000 ms, and participants had to indicate by a left-hand forced choice response whether or not they matched the two digits that had been rehearsed during the trial. The same auditory feedback was provided again. Finally, each trial ended with a 500-ms presentation of a central fixation mark, after which a new trial began.
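The variable fixation interval after the search display can be illustrated with a small sketch. It rests on one assumption that is not stated explicitly in the text: that the constant 4,000-ms period is counted from the onset of the search display. That reading is the one consistent with the reported range of 500 to 4,000 ms, given the 3,500-ms search timeout. The function and constant names are hypothetical.

```python
SEARCH_TIMEOUT_MS = 3500
# Assumption: the constant period runs from search-display onset to probe onset.
PERIOD_FROM_SEARCH_ONSET_MS = 4000


def post_search_fixation_ms(response_time_ms):
    """Duration of the fixation interval that follows the search display."""
    # The search display is removed at the response or at the timeout.
    search_duration = min(max(response_time_ms, 0), SEARCH_TIMEOUT_MS)
    return PERIOD_FROM_SEARCH_ONSET_MS - search_duration
```

For example, a response after 1,200 ms would be followed by 2,800 ms of fixation, and a timeout by the minimum of 500 ms.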

Fig. 2

One representative trial for each dual task. (Top rows) Visual search combined with visuospatial working memory (for location or rotation). (Bottom rows) Visual search combined with nonspatial working memory (for color or Klingon letters). Each pattern in the color working memory task squares indicates a color

In the test phase (i.e., the last five blocks), participants performed only the visual search task, which was identical to that in the learning phase, except that the trials included neither the visuospatial working memory task nor the articulatory suppression task. Each block contained the same 12 repeated configurations used in the learning phase, which were intermixed with 12 new configurations constructed in the same way as in the learning phase (i.e., the same target locations were used). Trials were separated by 1,000 ms of fixation.

In the last phase of the experiment, participants performed a recognition test. All of the 12 repeated search displays were presented again and were randomly intermixed with 12 newly generated displays. Participants had to indicate by a two-alternative forced choice buttonpress whether the displays had already been presented in the experiment. They were not informed at the beginning of the experiment about the upcoming memory test.

Data analysis

In each experiment, we excluded from the analysis any participants who performed the working memory task at chance level. Chance level within the 95 % confidence interval was defined in each experiment by means of a binomial distribution with a sample size equal to the number of trials (360 for Experiments 1a, 1b, 2a, and 2b, in which the working memory task was performed in the learning phase, and 120 for Experiments 3a, 3b, 4a, and 4b, in which the working memory task was performed in the test phase). Therefore, in Experiments 1a, 1b, 2a, and 2b, participants had to reach a minimum accuracy of 55.28 %, and in Experiments 3a, 3b, 4a, and 4b, the minimum accuracy accepted was 59.26 %.
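One way to read this criterion is as the upper bound of a two-sided 95 % interval around chance (p = .5), that is, the 97.5th percentile of the corresponding binomial distribution. The sketch below computes this cutoff exactly with integer arithmetic. The function name and this particular reading of the criterion are assumptions; the sketch is only meant to make the logic concrete.

```python
from fractions import Fraction
from math import comb


def chance_cutoff(n_trials, level=Fraction(975, 1000)):
    """Smallest count k with P(X <= k) >= level for X ~ Binomial(n_trials, 0.5).

    Hypothetical helper; exact arithmetic avoids floating-point rounding.
    """
    total = 2 ** n_trials          # number of equally likely outcome sequences
    cumulative = 0
    for k in range(n_trials + 1):
        cumulative += comb(n_trials, k)
        if Fraction(cumulative, total) >= level:
            return k
    return n_trials


if __name__ == "__main__":
    k = chance_cutoff(360)
    print(k, round(100 * k / 360, 2))
```

Under this reading, the cutoff for 360 trials is 199 correct responses, or 55.28 %, which matches the value reported for Experiments 1 and 2.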

Trials in which one of the tasks was not correctly performed were discarded from all response time analyses. In order to increase statistical power, we averaged every five blocks into one epoch, yielding three epochs in the learning phase and one epoch in the test phase. We analyzed the data of the two spatial or nonspatial working memory tasks in each experiment in joint analyses of variance (ANOVAs) including Task Variant as a between-participants factor. If the Task Variant factor or the interactions involving the task variant became significant, we ran additional analyses for each subtask. One ANOVA was calculated over Epochs 1–3 in order to assess contextual cueing. Our main criterion for contextual cueing was the main effect of configuration over Epochs 1–3. Another ANOVA was calculated over Epochs 3–4 to assess the effects of working memory load on contextual cueing.
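The aggregation into epochs can be sketched as follows. The trial record format (keys `block`, `condition`, `rt_ms`, `correct`) is hypothetical, the contextual-cueing effect is taken as the search time difference between new and old displays, and the sketch pools correct trials within an epoch rather than averaging block means first; which of the two was used is not specified here, and they can differ slightly when error rates vary across blocks.

```python
from collections import defaultdict
from statistics import mean

BLOCKS_PER_EPOCH = 5


def epoch_means(trials):
    """Mean search time per (epoch, condition) cell for one participant.

    `trials` is an iterable of dicts with hypothetical keys:
    block (1-based), condition ("old" or "new"), rt_ms, and correct
    (True only if all tasks in the trial were performed correctly).
    """
    cells = defaultdict(list)
    for trial in trials:
        if not trial["correct"]:
            continue  # error trials are excluded from response time analyses
        epoch = (trial["block"] - 1) // BLOCKS_PER_EPOCH + 1
        cells[(epoch, trial["condition"])].append(trial["rt_ms"])
    return {cell: mean(rts) for cell, rts in cells.items()}


def cueing_effect(means, epoch):
    """Search time advantage for old displays (new minus old) in one epoch."""
    return means[(epoch, "new")] - means[(epoch, "old")]
```

With 20 blocks, blocks 1 to 15 map onto Epochs 1 to 3 (learning) and blocks 16 to 20 onto Epoch 4 (test).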

For the recognition test, to assess whether participants were aware of the repetition of the distractor configurations, we performed a paired-sample t test on hits and false alarms—that is, on the frequency of old displays being correctly recognized and the frequency of new displays being misidentified as old.
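A minimal sketch of this comparison, using only the Python standard library, is given below. The function name is hypothetical, and the inputs are per-participant hit and false alarm rates in matching order; the p value would be obtained from a t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(hit_rates, false_alarm_rates):
    """Paired-sample t statistic and degrees of freedom (hypothetical helper)."""
    diffs = [h - f for h, f in zip(hit_rates, false_alarm_rates)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1
```

A positive t indicates that old displays were classified as old more often than new displays were.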

Results

Accuracy

In Experiment 1a, one participant was removed because she did not reach the threshold for the visuospatial working memory task. The mean accuracy of the included participants was 83 %, ranging from 60 % to 94 %. Participants were highly accurate in performing the contextual-cueing task; the mean accuracy across the whole experiment—that is, across learning and test phases—was 99 %, with a minimum of 97 % and a maximum of 100 % correct trials. The accuracy in performing the verbal suppression task was also very high, ranging from 95 % to 100 % correct trials, with an average of 98 %.

In Experiment 1b, the data of three participants were removed from the analysis because they did not reach the minimum threshold required for the working memory task. For the remaining participants, the mean accuracy in performing that task was 76 %, ranging from a minimum of 56 % to a maximum of 97 % correct trials. Accuracy in the contextual-cueing task was also very high, with a mean of 98 %, a minimum of 95 %, and a maximum of 100 % correct trials. Participants were also very accurate in performing the verbal suppression task, with a mean of 98 %, a minimum of 97 %, and a maximum of 100 % correct trials.

For Experiment 1 overall, accuracy in the contextual-cueing task was not higher for old than for new displays [t(35) = 1.252, p = .219]. However, participants were more accurate during the learning phase than during the test phase [t(35) = 5.648, p < .001].

Statistical comparisons between experiments will be reported following the section on Experiment 4.

Search times

An ANOVA of the learning phase with Configuration (old, new) and Epoch (1, 3) as within-participants factors and Task Variant (location, rotation) as a between-participants factor yielded a significant main effect of epoch [F(1, 34) = 100.089, p < .001], indicating a general improvement over time (see Fig. 3 and Table 1). The main effect of configuration was also significant [F(1, 34) = 9.474, p = .004]. We found a trend toward significance for the interaction between epoch and configuration [F(1, 34) = 2.899, p = .098]. The interaction between epoch and the experimental working memory load variant was also significant [F(1, 34) = 4.426, p = .043], due to the high search times in Epoch 1 of Experiment 1b (working memory for rotation; Fig. 4). The remaining interactions were not significant [all Fs(1, 34) < 0.920, ps > .344].

Fig. 3

Visual search in Experiments 1–5. Search times are averaged across task versions a and b in each of Experiments 1–4. The search times are shown for old and new configurations as a function of epochs (of five blocks). Error bars represent standard errors of the means, and the boxes indicate epochs with working memory load. In Experiments 1 and 3, visuospatial working memory was loaded, whereas in Experiments 2 and 4, nonspatial visual working memory was loaded

Table 1 Search times
Fig. 4

Visual search in Experiments 1a and 1b. Search times are shown for the old and new configurations as a function of epochs (of five blocks). Error bars represent standard errors of the means, and the boxes indicate epochs with working memory load

Effects due to the removal of working memory load in Epoch 4 were tested by an ANOVA with Configuration (old, new) and Epoch (3, 4) as within-participants factors and Task Variant as a between-participants factor. The main effect of configuration was significant [F(1, 34) = 27.336, p < .001]. The only other significant result was obtained for the three-way interaction [F(1, 34) = 4.825, p = .035; all other Fs(1, 34) < 1.201, ps > .281]. This interaction prompted us to run separate ANOVAs for Experiments 1a and 1b. For Experiment 1a (working memory for location), we obtained a significant main effect of configuration [F(1, 18) = 18.458, p < .001], whereas the main effect of epoch [F(1, 18) = 0.530, p = .476] and their interaction [F(1, 18) = 0.875, p = .362] were not significant. Likewise, for Experiment 1b, the only significant effect was for configuration [F(1, 16) = 10.182, p = .006], whereas the main effect of epoch [F(1, 16) = 0.657, p = .430] and the interaction [F(1, 16) = 4.216, p = .057] were not significant. The interaction, though, showed a tendency toward significance.

Recognition test

The probability that old displays were correctly recognized was .49, and the probability that new displays were misidentified as old was .41. The difference was significant [t(35) = 2.557]. The hit rate did not correlate significantly with the contextual-cueing score in Epoch 3 (r = –.248, p = .144), but it did in Epoch 4 (r = –.348, p = .038).

Discussion

A search advantage for repeated displays was observed in the presence of concurrent visuospatial working memory load. This advantage was observed in both the learning and test phases. This pattern was expected, under the assumption that visuospatial working memory is needed for the expression of learning but not for the learning of repeated contexts.

The experimental variants of the working memory task (location and rotation) affected the data; Experiment 1b had higher search times in Epoch 1, and response times to new displays dropped from Epochs 3 to 4 in Experiment 1b but not in Experiment 1a. We can only speculate that the orientation working memory task initially interfered more heavily with the visual search task, in which Ts and Ls are also oriented lines that differed only by their conjunction. In Epoch 4, search times for new displays dropped more than those for old displays. It is possible that the random generation of new displays may have led to particularly easy search displays in Epoch 4.

Participants’ hit rate was higher than their false alarm rate. This may indicate that they were at least partly aware of the repeated displays. The hit rate did not correlate with the size of contextual cueing in Epoch 3, but it did in Epoch 4. Thus, contextual cueing in Epoch 4 may have been affected by explicit processing.

Experiment 2: Nonspatial visual working memory load and context learning

In Experiment 2, we replaced the visuospatial working memory load of Experiment 1 with two variants of nonspatial visual working memory load that were again presented during the learning phase. As in Experiment 1, we did not expect this load to eliminate contextual learning, but Experiments 1 and 2 served as a baseline for Experiments 3 and 4—where differential effects of spatial and nonspatial working memory loads were expected. However, we observed no overall change in the size of contextual cueing when visuospatial working memory load was added.

Method

Participants

A group of 20 new participants took part in Experiment 2a (15 females, five males; average age 23.05 years) and 20 others in Experiment 2b (11 females, nine males; average age 25.1 years), after giving informed consent. They were paid or were compensated with course credits. They all had normal or corrected-to-normal vision and were not aware of the purpose of the experiment.

Stimuli and procedures

Experiments 2a and 2b were structured in the same way as Experiments 1a and 1b, respectively, except where noted. In the learning phase (Epochs 1–3) of Experiment 2a, a nonspatial color working memory task was combined with the visual search task. The information to be memorized in Experiment 2a was the color of each item contained in the working memory array, while the locations were irrelevant (and kept constant). In a previous pilot study with four items, we observed that this kind of task is more demanding than the corresponding spatial version (see also Manginelli et al., 2012). Therefore, we decided to reduce the size of the memory array in order to achieve performance comparable to that with visuospatial tasks. The working memory array to be memorized consisted of three colored squares, each subtending 0.6° × 0.6° of visual angle, placed at three equidistant positions on an imaginary circle of a radius of 2° around the central fixation point. The memory test probe consisted of a colored square presented at the fixation-point position. The participants had to judge whether the color of the memory probe matched one of the colors previously presented in the memory array, with the match and mismatch trials being equally balanced. The colors of the squares, of both the memory array and the probe, were selected randomly from among red, blue, black, green, magenta, and white.

In Experiment 2b, two distinct white colored “Klingon” letters were presented; they were located equidistantly, one to the left and one to the right of the fixation point, on an imaginary circle subtending 2° of visual angle and centered at the fixation cross. These letters were randomly chosen from the alphabet (A, B, . . . Z) obtained using the Microsoft Windows font Klinzhaj with font size 45. These are artificial letters that have a letter-like appearance, but cannot easily be associated with the Latin alphabet. They have previously been tested as nonspatial visual working memory stimuli (Mecklinger, Bosch, Gruenewald, Bentin, & von Cramon, 2000).

Results

Accuracy

All participants in Experiment 2a surpassed the threshold of 55 % in the working memory task. The mean accuracy for the working memory task was 85 % correct trials, ranging from 60 % to 94 %. Accuracy in the contextual-cueing task (mean of 98 %, minimum 91 %, maximum 99 %) was again very high, as was accuracy in the articulatory suppression task (mean 99 %, ranging from 96 % to 100 %).

In Experiment 2b, also, all participants surpassed the minimum threshold of performance in the working memory task. Average accuracy in that task was 77 %, ranging from a minimum of 66 % to a maximum of 91 %. The participants were again very accurate in performing the contextual-cueing task (mean 98 %, range 94 %–100 %) and the articulatory suppression task (mean 99 %, with minimum 97 % and maximum 100 %).

Across Experiments 2a and 2b, visual search accuracy was not different for the old and new displays [t(39) = 0.208, p = .837]. Participants were, however, more accurate in the learning phase than in the test phase [t(39) = 4.145, p < .001].

Search times

Considering only the learning phase, we performed a repeated measures ANOVA with Configuration (old, new) and Epoch (1, 3) as within-participants factors and Task Variant (color, Klingon letters) as a between-participants factor. The main effect of epoch was highly significant [F(1, 38) = 57.672, p < .001] and showed a general performance improvement during the experiment (Fig. 3). The main effect of configuration was also significant [F(1, 38) = 38.151, p < .001]. The main effect of task variant [F(1, 38) = 1.292, p = .263] and all interactions [all Fs(1, 38) < 2.287, ps > .139] missed significance.

In order to assess the effect of the removal of the working memory load in Epoch 4, we ran an additional ANOVA with Configuration (old, new) and Epoch (3, 4) as within-participants factors and Task Variant (color, Klingon letters) as a between-participants factor. Only the main effect of configuration was significant in this analysis [F(1, 38) = 58.137, p < .001]. All other main effects and interactions were not significant [all Fs(1, 38) < 3.163, ps > .083].

Recognition test

Old displays were correctly recognized with a probability of .57, and new displays were judged to be old with a probability of .5. The difference was not significant [t(39) = 1.770, p = .084].

Discussion

The same pattern was observed as in Experiment 1: A search advantage for old displays was observed during the learning phase, and persisted into the test phase when the working memory task was removed. Thus, implicit context learning developed under nonspatial visual working memory load, and the search advantage persisted when the working memory load was removed.

Experiment 3: Visuospatial working memory load and expression of context learning

In Experiment 3, learning of repeated contexts could occur in the absence of a concurrent load on working memory. Only in Epoch 4 was a visuospatial working memory load added. We expected that the search advantage for old displays would be reduced by the concurrent working memory load, due to interference with the expression of previously learned contextual cues.

Method

Participants

New groups of 20 participants each took part in Experiment 3a (13 females, seven males; average age 23.25 years) and Experiment 3b (15 females, five males; average age 23.05 years) after giving informed consent. All had normal or corrected-to-normal vision and were not informed about the purpose of the experiment. They were paid or were compensated with course credits.

Stimuli and procedures

Experiments 3a and 3b were designed in exactly the same way as Experiment 1. The only difference was that the dual task (location memory in Exp. 3a and orientation memory in Exp. 3b, combined with contextual cueing) was performed in the test phase. Therefore, participants accomplished the first 15 blocks (Epochs 1–3) of visual search alone, followed by five blocks (Epoch 4) of visual search plus working memory task. The duration of the experiments was approximately 1 h.

Results

Accuracy

In Experiment 3a, the data of two participants were discarded because their performance in the working memory task did not reach the minimum threshold of 59 % correct trials. In the remaining group, the mean accuracy for this task was 87 %, ranging from 71 % to 95 %. Accuracy in the contextual-cueing task, across the learning and test phases, ranged from 95 % to 100 %, with a mean accuracy of 98 %. The accuracy in the verbal suppression task was on average 99 %, ranging from 94 % to 100 %.

In Experiment 3b, the data of three participants were removed from the analysis due to subthreshold performance in the working memory task. The mean accuracy for that task in the remaining group of participants was 77 % (range: 60 % to 89 %). In the contextual-cueing task, the mean accuracy was 98 %, with a minimum of 96 % and a maximum of 100 %. The average accuracy in the articulatory suppression task was 98 %, ranging from 95 % to 100 %.

Across Experiments 3a and 3b, accuracy in contextual cueing did not differ between the old and new displays [t(34) = 2.265, p = .030]. Participants’ search accuracy was, again, higher under concurrent working memory load (in this case, in the test phase) than without [t(34) = –5.872, p < .001].

Search times

The ANOVA with Configuration (old, new) and Epoch (1, 3) as within-participants factors and Task Variant as a between-participants factor yielded a significant main effect of epoch [F(1, 33) = 33.723, p < .001], indicating a gradual improvement in participants’ performance (Fig. 3), whereas a significant main effect of configuration [F(1, 33) = 7.494, p = .010] and a significant interaction [F(1, 33) = 11.347, p = .002] indicated the learning of repeated configurations.

The influence of added working memory load on contextual cueing was investigated with an ANOVA on Configuration (old, new) and Epoch (3, 4) as within-participants factors and Task Variant as a between-participants factor. We observed a significant main effect of configuration [F(1, 33) = 12.737, p = .001] and a significant interaction of Configuration × Epoch [F(1, 33) = 5.228, p = .029]. All other main effects and interactions were not significant [F(1, 33) < 2.450, p > .127]. The Configuration × Epoch interaction was due to the presence of contextual cueing in Epoch 3 but not Epoch 4 [Exp. 3a: in Epoch 3, t(17) = 4.094, p = .001; in Epoch 4, t(17) = 1.987, p = .063; Exp. 3b: in Epoch 3, t(16) = 2.419, p = .028; in Epoch 4, t(16) = 1.199, p = .248].

Recognition test

A significant difference was observed between the frequencies of hits (.53) and false alarms (.46) [t(34) = 2.127, p = .041]. However, the recognition (i.e., the frequency of hits) did not correlate with the facilitation in response times for old displays in either Epoch 3 (r = .134, p = .443) or Epoch 4 (r = –.152, p = .384).
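The correlation between recognition and search facilitation can be sketched as follows. This is a minimal illustration that assumes a Pearson product-moment correlation; the text does not name the coefficient that was used. The function name `pearson_r` and the example values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-participant hit rates and RT facilitation in ms (illustration only)
hit_rates = [0.4, 0.5, 0.6, 0.7]
facilitation_ms = [120.0, 80.0, 95.0, 110.0]
r = pearson_r(hit_rates, facilitation_ms)
```

A coefficient near zero, as reported here, would indicate that participants with higher hit rates did not show systematically larger search facilitation.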

Discussion

The central finding of Experiment 3 was the reduction of search facilitation for repeated displays when a visuospatial working memory task was added after learning. Thus, the addition of a concurrent load interfered with the utilization of previously learned contextual relations for the guidance of visual search.

Participants showed an indication of explicit recognition of old displays; however, recognition, again, did not correlate with search facilitation.

Experiment 4: Nonspatial working memory load and expression of context learning

Experiment 4 investigated the specificity of the visuospatial load effect on expression of learning by adding the nonspatial working memory tasks already used in Experiments 2a and 2b to the visual search task in Epoch 4, after learning over three epochs in the absence of concurrent working memory load.

Method

Participants

A new group of 20 volunteers (15 females, five males; average age 22.95 years) were paid or compensated with course credits to participate in Experiment 4a, as well as 20 new participants in Experiment 4b (15 females, five males; average age 23.75 years). The participants provided informed consent. All had normal or corrected-to-normal vision and were not aware of the purpose of the experiment.

Stimuli and procedures

Experiments 4a and 4b were designed in exactly the same way as Experiments 2a and 2b. The only difference was that the dual task (color working memory in Exp. 4a and working memory for Klingon letters in Exp. 4b, each combined with contextual cueing) was performed in the test phase. Therefore, participants accomplished the first 15 blocks (Epochs 1–3) of the contextual-cueing paradigm alone, followed by five blocks (Epoch 4) in which the working memory paradigm was added.

Results

Accuracy

In Experiment 4a, no participant was excluded from the analysis, because they all surpassed the accuracy threshold of the working memory task. The average accuracy was 86 %, with a minimum of 66 % and a maximum of 98 %. Accuracy in the contextual-cueing task was, again, very high, ranging from 91 % to 100 %, with an average of 97 % correct trials. Finally, accuracy in the articulatory suppression task ranged from 85 % to 100 % correct trials, with an average of 98 %.

Likewise, all participants of Experiment 4b reached the threshold for the working memory task (mean = 77 %, min = 66 %, max = 93 %). Visual search accuracy was, again, very high (mean = 98 %, min = 95 %, max = 100 %). In the articulatory suppression task, accuracy ranged from 94 % to 100 %, with a mean of 98 %.

Across both experiments, accuracy in contextual cueing was independent of configuration type [t(39) = 0.046, p = .964]. Participants’ search was, again, more accurate in the presence of working memory load (i.e., during the test phase) than in its absence [the learning phase, t(39) = –6.295, p < .001].

Search times

A repeated measures ANOVA on the learning phase with Configuration (old, new) and Epoch (1, 3) as within-participants factors and Task Variant as a between-participants factor yielded a significant main effect of epoch [F(1, 38) = 64.293, p < .001], indicative of improved performance over time, and a significant main effect of configuration [F(1, 38) = 5.125, p = .029], indicative of faster response times for old displays (Fig. 3). The Configuration × Epoch interaction narrowly failed to reach significance [F(1, 38) = 3.861, p = .057]. All other interactions missed significance [all Fs(1, 38) < 1.544, ps > .222].

The ANOVA on Configuration (old, new) and Epoch (3, 4) as within-participants factors and Task Variant as a between-participants factor yielded a significant main effect of configuration [F(1, 38) = 19.663, p < .001]. All other main effects and interactions did not reach significance [all Fs(1, 38) < 1.056, ps > .311].

Recognition test

As in the other experiments, we compared hits and false alarms. Here, a significant difference was obtained [t(39) = 2.652, p = .011], with hits (.60) being more frequent than false alarms (.52). However, the hit rate did not correlate with the size of contextual cueing (Epoch 3, r = .113, p = .487; Epoch 4, r = .090, p = .583).

Discussion

In the learning phase, the same pattern was observed as in Experiment 3. Starting from virtually identical search times, repeated displays developed a search advantage over new displays. This was indicated by a significant main effect of configuration and a marginally significant Configuration × Epoch interaction. However, the addition of a concurrent nonspatial working memory load in Epoch 4 did not lead to a reduction of the search advantage for old displays. This contrasts with the reduction of the search advantage in Experiment 3 and supports the view that specific visuospatial working memory resources are needed for the utilization of previously learned contextual relations for the guidance of visual search.

Comparisons between Experiments 1–4

Accuracy

A multivariate analysis with Experiment (1–4) as a factor and accuracy in the contextual-cueing, visual working memory, and articulatory suppression tasks as dependent variables yielded no significant differences between experiments [contextual cueing, F(3, 147) = 1.372, p = .254; visual working memory, F(3, 147) = 0.409, p = .747; articulatory suppression, F(3, 147) = 0.489, p = .690]. Thus, we observed no evidence of differential working memory loads or speed–accuracy trade-offs between experiments.

Search times

The main finding of Experiments 1–4 was the reduction of contextual cueing when visuospatial working memory load was added in Epoch 4 of Experiment 3. Such a reduction was not observed when nonspatial working memory load was added in Experiment 4. This difference was confirmed in an ANOVA with Configuration (old, new) and Epoch (3, 4) as within-participants factors and Experiment (3, 4) as a between-participants factor. This ANOVA yielded a significant main effect of configuration [F(1, 73) = 32.644, p < .001] and, importantly, a significant three-way interaction [F(1, 73) = 4.931, p = .029]. All other main effects and interactions were not significant [all Fs(1, 73) < 1.410, ps > .239]. The analogous ANOVA on Experiments 1 and 2, with spatial versus nonspatial working memory load being removed after Epoch 3, did not yield a three-way interaction [Configuration × Epoch × Experiment, F(1, 74) = 0.283, p = .597; main effect of configuration, F(1, 74) = 80.569, p < .001; all other Fs(1, 74) < 1.79, ps > .185]. This underlines the selective disruption of contextual cueing when visuospatial working memory load was added (as opposed to removed) in Epoch 4.

Experiment 5: Contextual cueing without working memory load

Finally, we ran an experiment in which the contextual-cueing task used in Experiments 1–4 was run without any secondary working memory load. The purpose of this experiment was to establish a contextual-cueing baseline that was uncontaminated by an additional working memory task in either the learning or the test phase.

Method

A new group of 20 volunteers (17 females, three males; all right-handed, average age 21.6 years) were paid or compensated with course credits to participate in Experiment 5. The participants provided informed consent. All had normal or corrected-to-normal vision and were naive as to the purpose of the experiment. The contextual-cueing task was carried out in the same way as in Experiments 1–4, but without any additional working memory task.

Results

Accuracy

The mean accuracy was 95 %, ranging from 83 % to 100 %. The accuracy for old displays (mean 96 %, range 88 %–100 %) showed a tendency to be higher than accuracy for new displays (mean 95 %, range 79 %–100 %) [t(19) = 2.021, p = .058].

Search times

A repeated measures ANOVA on the learning phase, with Configuration (old, new) and Epoch (1, 3) as within-participants factors, yielded a significant main effect of epoch [F(1, 19) = 67.827, p < .001]—typical of a general improvement of the participants’ performance during the experiment—and a significant main effect of configuration [F(1, 19) = 12.718, p = .002]—indicating faster response times for old displays (Fig. 3). The Configuration × Epoch interaction was also significant [F(1, 19) = 7.292, p = .014].

We then compared the sizes of contextual cueing in Epochs 3 and 4 in Experiments 1–4 with those in Experiment 5, in order to find decreased contextual-cueing effects due to the working memory load manipulations. For this purpose, we first calculated standardized contextual-cueing scores, according to the formula cc = (new RT – old RT)/new RT, where “cc” stands for contextual cueing and new (old) RT stands for the response time to new (old) displays.
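The standardized score defined above can be written as a short function. This is a minimal sketch: the function name `contextual_cueing_score` and the response times in the example are hypothetical and only illustrate the formula cc = (new RT – old RT)/new RT.

```python
def contextual_cueing_score(new_rt: float, old_rt: float) -> float:
    """Standardized contextual cueing: (new RT - old RT) / new RT.

    Positive values mean faster responses to old (repeated) displays.
    """
    if new_rt <= 0:
        raise ValueError("new_rt must be a positive response time")
    return (new_rt - old_rt) / new_rt


# Hypothetical mean response times in ms (illustration only)
example_cc = contextual_cueing_score(new_rt=1000.0, old_rt=900.0)
```

Dividing by the response time to new displays expresses the search advantage as a proportion of baseline search time, which makes scores comparable across experiments that differ in overall response speed.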

We calculated four separate multivariate ANOVAs with Experiment (Exp. 1 vs. Exp. 5, Exp. 2 vs. Exp. 5, Exp. 3 vs. Exp. 5, and Exp. 4 vs. Exp. 5) as a factor and the cc scores for Epochs 1–4 as dependent variables. The only significant difference was obtained for the comparison of Epoch 4 in Experiments 3 and 5 [F(1, 53) = 7.585, p = .008], indicative of reduced contextual cueing in Epoch 4 of Experiment 3.

Recognition

The hit rate (.55) was somewhat higher than the false alarm rate (.48), but the difference was not significant [t(19) = 1.45, p = .163].

Discussion

The baseline experiment replicated the basic pattern of a search advantage for repeated displays that increased with repetitions. The comparisons with Experiments 1–4 demonstrated a highly specific reduction of contextual cueing for visuospatial working memory added in the test phase, whereas all other working memory conditions did not produce a reduction of contextual cueing.

General discussion

We analyzed the dependence of contextual cueing on working memory resources, and found that working memory load from a concurrent task reduced contextual-cueing scores only when working memory load was added after an initial learning period without additional working memory load. This effect was specific for visuospatial working memory load; it was not observed for nonspatial working memory load. The decrease of contextual cueing when visuospatial working memory load was added during test was compared with a baseline without any secondary task, and the former turned out to be the only condition in which contextual cueing was reduced due to a concurrent working memory load. Importantly, these comparisons included working memory load during the initial learning phase that did not prevent contextual cueing. The difficulty of the present working memory tasks did not differ between experiments, eliminating task difficulty as a confounding factor.

Our findings are fully in line with a recent report stating that learning of contextual cues does not depend on working memory (Vickery et al., 2010). Vickery et al. applied various secondary tasks during learning, but always tested contextual search facilitation in the absence of concurrent working memory load. This is the same concept that we applied here in Experiments 1 and 2. In agreement with Vickery et al., we found no impact of working memory load on contextual cueing under these circumstances. However, our data extend previous findings by demonstrating that the expression of learning depended on visuospatial working memory—even if implicit learning of search contexts itself was unaffected.

Travis et al. (2013) demonstrated the importance of the amount of working memory load as a modulating factor of contextual cueing. In their difficult conditions, participants had to retain four locations in working memory. This is the same as in our experiments, although with subtle differences that may have affected task difficulty: Travis et al. presented four dots consecutively rather than simultaneously. We could show that the working memory load in the present study was high enough to reduce contextual cueing at test, whereas the same amount of working memory load did not lead to a reduction of cueing early in learning. We cannot rule out, however, that a further increased working memory load might lead to additional reductions of contextual cueing earlier in learning, as in the studies of Travis et al. With increasing working memory load, there may also be a higher likelihood that factors other than the maintenance of working memory contents (for example, task coordination) affect contextual cueing. Therefore, future experiments should evaluate whether varying the amount of visuospatial load specifically leads to differential effects on contextual cueing early in learning or in later test phases.

One further difference is that Vickery et al. (2010) and Travis et al. (2013) did not use an articulatory suppression task, so that verbalization of working memory contents could not be ruled out. However, our results show that the learning of contextual cues can proceed in the presence of concurrent visual working memory load, even if verbalization is blocked by articulatory suppression.

We found hints for explicit recognition of old displays in Experiments 1, 3, and 4. However, the absence of contextual cueing after the addition of visuospatial working memory load in Experiment 3 cannot be explained by explicit processing alone, because in this case, we should have observed a comparable reduction of contextual cueing after the addition of nonspatial working memory load in Experiment 4. Moreover, no correlation was observed between the size of the search facilitation for repeated displays and the frequency of correctly recognized old displays. This lack of a correlation is in line with previous findings (Geyer, Shi, & Müller, 2010).

The reduction of contextual cueing when visuospatial working memory load was imposed during test phases generalized over two different tasks: working memory for location and rotation. Similarly, working memory for neither colors nor artificial Klingon letters interfered with contextual cueing. Thus, we can rule out that this effect was due to a particular confounding feature of a single task.

One may argue that the reduction of contextual cueing in Experiment 3 was caused by the addition of a secondary task rather than by the working memory load per se. The more complex task structure of the combined working memory and visual search task may have somehow interfered with contextual cueing. However, this view cannot explain why the addition of a nonspatial working memory task in Experiment 4 did not lead to a reduction of contextual cueing. It is, furthermore, difficult to explain why an unspecific dual-task demand should slow down processing of repeated displays more than it slows down processing of new displays. Thus, we argue that the reduction of contextual cueing in Experiment 3 was due to the specific load on visuospatial working memory. It may, however, be that the load imposed on visuospatial working memory is particularly high when the working memory task is newly added. Over several epochs, participants may learn to handle the demands of the working memory task more efficiently—for example, by an optimized coding of the memory items or improved intertask coordination (Liepelt, Strobach, Frensch, & Schubert, 2011). This may explain why the working memory task led to a significant reduction of contextual cueing relative to the baseline in Epoch 4 of Experiment 3, but not in Epoch 3 of Experiment 1.

One unexpected finding was the higher accuracy in the contextual-cueing task under working memory load. In Experiments 1 and 2, accuracy was higher during learning, whereas in Experiments 3 and 4, it was higher under test. Thus, accuracy appeared to benefit from the concurrent working memory task. Moreover, here there was no difference between performance under the visuospatial and nonspatial tasks, in contrast to the search time data. We can only speculate whether the more demanding dual-task situation led to improved attentional focusing on the visual search task. In any case, the process that caused accuracy to be higher during working memory load appears to be unrelated to the disruption of search time facilitation caused by added working memory load in Experiment 3, because accuracy was improved with visuospatial and nonspatial loads during both the learning and testing phases.

Our findings may also shed light on the role of medial temporal lobe structures in contextual cueing. Both patient (Chun & Phelps, 1999; Manns & Squire, 2001) and brain activation (Greene, Gross, Elsinger, & Rao, 2007; Preston & Gabrieli, 2008) data suggest medial temporal contributions to contextual cueing. This was initially surprising, because the medial temporal cortex was traditionally seen as a central structure for explicit declarative memory but not implicit memory. Medial temporal structures are involved in working memory maintenance along with the ventral occipitotemporal cortex (Ranganath & D’Esposito, 2005). fMRI activation in these same structures is modulated by contextual cueing (Geyer, Baumgartner, Müller, & Pollmann, 2012; Manginelli, Baumgartner, & Pollmann, 2013). The present data suggest that the contextual-cueing-related activation in these areas may be related to the retrieval and maintenance of learned memory cues during search.

It may still appear puzzling that the incidental (and largely implicit, as the strength of evidence for explicit recognition did not correlate with the size of contextual cueing) learning of repeated contexts during visual search may depend on visuospatial working memory. One possible explanation for this puzzle is that the expression of learning does not depend on working memory, in the sense of the four items that are in the “broad focus of attention” (Oberauer & Hein, 2012)—that is, the items that can be held available for use—but rather on the activated part of long-term memory that is linked to the items within this focus of attention (Cowan, 1988; Oberauer, 2002). This activated part of long-term memory is thought to be activated insufficiently to produce awareness, but more strongly than long-term memory entries completely unrelated to the current task. This concept of an activated part of long-term memory was initially introduced in order to account for priming effects (Cowan, 1988). It would also fit the assumed implicit nature of contextual cueing (Chun & Jiang, 1998) if utilizing such a subtle activation of memory traces of repeated search contexts in long-term memory was sufficient for the guidance of visual search by learned contextual cues. Alternatively, it may also reduce the response threshold of targets in familiar contexts (Kunar, Flusberg, Horowitz, & Wolfe, 2007).

If we assume that filling the broad focus of attention with spatial or nonspatial content will activate related items in the activated part of long-term memory, visuospatial working memory contents will interfere more strongly with the activation of contextual spatial memory traces by repeated search displays than nonspatial working memory contents will. This interference could explain the deleterious effect of visuospatial working memory load on contextual cueing, without assuming that learned search configurations need to enter the broad focus of attention, thereby potentially entering awareness. The latter may happen for a fraction of repeated displays, so that contextual cueing becomes explicit (Geyer et al., 2012; Geyer et al., 2010; Smyth & Shanks, 2008), but it is not necessary for contextual cueing to occur.

Thus, visuospatial working memory may not only be vital to keep items explicitly maintained during search (Bundesen, 1990; Duncan & Humphreys, 1989; Treisman, 1988) or to guide visuospatial attention by intentional maintenance of stimuli in working memory (Awh, Jonides, & Reuter-Lorenz, 1998; Griffin & Nobre, 2003; Kuo, Rao, Lepsien, & Nobre, 2009; Lepsien, Griffin, Devlin, & Nobre, 2005), but may also interfere with implicit contextual cues in the activated part of long-term memory.