Working memory (WM) is a limited capacity system that allows for the temporary maintenance and manipulation of information. Importantly, individual differences in WM capacity have been linked to many higher-order cognitive abilities (Conway, Jarrold, Kane, Miyake & Towse, 2007). Although the capacity of WM was once thought to be relatively fixed, recent studies suggest that it is possible to expand this capacity through training. Participants in such studies typically train on one or more WM tasks over multiple sessions, and performance improvements on trained tasks are consistently observed. Of greatest interest to researchers have been transfer effects, or improvements on untrained tasks measuring higher-order abilities.

Many training studies report successful near transfer (improvement on tasks similar to the trained task; e.g., Chein & Morrison, 2010), but far transfer (improvement on relatively dissimilar tasks) has been more elusive. Although some studies have reported far transfer to higher-order abilities, including fluid intelligence (gF; e.g., Jaeggi, Buschkuehl, Jonides, & Perrig, 2008) and reading comprehension (e.g., Chein & Morrison, 2010), the literature contains many inconsistent results, and far transfer is not always observed (Morrison & Chein, 2011; Redick, Shipstead, Harrison, Hicks, Fried & Hambrick, 2012; Shipstead, Redick, & Engle, 2012). Notably, a recent meta-analysis reported reliable near transfer to WM tasks following training, but no convincing evidence of far transfer was observed (Melby-Lervåg & Hulme, 2012).

Despite the inconsistent findings regarding transfer, recent WM training studies consistently report improvement on the trained tasks, indicating that WM, as measured by those specific tasks, has improved. The basis of these training benefits has received little attention, however, and it is not known which WM components and/or processes are being improved or whether new strategies are being acquired. For example, in the adaptive dual n-back task pioneered by Jaeggi et al. (2008), participants are simultaneously presented with two items (a box on the computer screen and a spoken letter) and must decide whether either or both of the items match those presented n items back, where n changes adaptively based on performance. This task surely involves numerous verbal and spatial WM processes, but it is unclear which are responsible for the improvements in dual n-back performance following training.

Whereas previous studies have largely focused on determining the extent to which WM training improvements result in transfer to higher-order abilities, the primary goal of the present study was to identify how training on the adaptive dual n-back task affects WM. The present study focused on the following components: short-term memory, updating, the focus of attention, and executive attention.

With respect to short-term memory, it should be noted that the n-back task requires temporary storage, and training might increase short-term storage capacity in either the verbal or the spatial domain. Therefore, the present study included both verbal and spatial simple span tasks.

With respect to updating, the to-be-remembered items in the n-back task are presented in a continuous stream, and participants must constantly update the contents of their WM by replacing old items with new items. Training might increase the efficiency of this process, and so the present study included an independent measure of updating (i.e., response times [RTs] on repeat trials of Garavan’s [1998] focus-switching task).

With respect to the focus of attention, it has been reported that the capacity of the focus of attention can be expanded (Verhaeghen, Cerella, & Basak, 2004), and thus it is possible that improvements on the dual n-back task observed following training could be due to an increase in this capacity. Following Bunting, Cowan, and Saults (2006), a running span task was used to assess the capacity of the focus of attention.

In addition, Cowan’s (2001) and Oberauer’s (2002) theories propose that information moves in and out of the focus of attention as needed. To examine whether training increases the speed at which one can switch items in the focus of attention, the present study included a measure of focus switching (i.e., RT switch costs on Garavan’s [1998] focus-switching task).

With respect to executive attention, the controlled attention view of WM (Kane, Bleckley, Conway, & Engle, 2001) posits that concurrent processing during a memory task places demands on the ability to control attention. Training might improve the ability to focus attention on relevant information, and so the present study included a complex WM span task.

In order to determine how WM is affected by training, we sought to identify which of the preceding components showed improvement when pretest scores on an untrained task were compared with posttest scores. As suggested by Shipstead et al. (2012), the present study included an active control group that received nonadaptive training, as well as a no-contact control group, in order to distinguish the effects of adaptive training from those of prolonged exposure to the dual n-back task or other nonspecific effects of experimental experience (e.g., the Hawthorne effect).

Method

Participants

Fifty-two undergraduates were randomly assigned to one of three groups: adaptive training (n = 13), nonadaptive training (n = 13), and no-contact control (n = 26).

Procedure

All participants completed pre- and posttest sessions, approximately 1.5 h long. Participants in the two training groups also completed eight 30-min training sessions in which they performed either an adaptive or a nonadaptive dual n-back task. All tasks were programmed using E-Prime 1.2 (Schneider, Eschman, & Zuccolotto, 2002) and were presented on a 17-in. touch screen LCD monitor.

Dual n-back task

The procedure was identical to that described by Jaeggi et al. (2008). On each trial, participants were presented simultaneously with a visual and an auditory stimulus. The visual stimulus was a box presented in one of eight locations around the periphery of the screen, and the auditory stimulus was one of eight letters. Each trial lasted 3 s: a 500-ms item presentation interval, followed by a 2,500-ms interstimulus interval (ISI) during which participants were to respond. Participants were instructed to respond using the keyboard whenever the current stimuli matched the target stimuli presented n trials back. The current stimuli could match the target visual or auditory stimuli or both.
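As a rough illustration of the matching rule described above, the following sketch marks, for each trial, whether the visual and/or the auditory stimulus matches the one presented n trials earlier. The function name, the coding of locations as integers, and the example streams are hypothetical and are not taken from the original task materials.

```python
def nback_targets(locations, letters, n):
    """Return a (visual_match, auditory_match) pair of flags for each trial.

    A trial is a target in a modality when its stimulus equals the stimulus
    presented n trials earlier in that same modality. The first n trials
    have no comparison item and therefore can never be targets.
    """
    if len(locations) != len(letters):
        raise ValueError("both stimulus streams must have the same length")
    flags = []
    for i in range(len(locations)):
        if i < n:
            flags.append((False, False))
        else:
            flags.append((locations[i] == locations[i - n],
                          letters[i] == letters[i - n]))
    return flags


# Hypothetical 2-back example: locations coded 0-7, letters as strings.
example_locations = [3, 5, 3, 1, 3, 1]
example_letters = ["C", "H", "K", "H", "K", "Q"]
example_flags = nback_targets(example_locations, example_letters, n=2)
```

In this example, the third trial is a visual target only, the fourth an auditory target only, and the fifth a target in both modalities.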

Each session consisted of twenty 20-trial blocks. In the adaptive form of the task administered to all participants in the pretest session, the n-back level was set at one for the first block, after which the n-back level for the current block was determined by the number of errors in the previous block. If a participant made fewer than three auditory target errors and three visual target errors, the n-back level for the next block was increased by one; if a participant made more than five total errors, the n-back level for the next block was decreased by one. Otherwise, the n-back level on the next block remained the same. Participants were informed of the n-back level before the start of each block.
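The adaptive rule can be sketched as follows. This is a minimal illustration rather than the E-Prime implementation: it assumes that "fewer than three" applies separately to each modality, and the floor of n = 1 and the function names are assumptions added for the example.

```python
def next_nback_level(level, auditory_errors, visual_errors, min_level=1):
    """Apply the block-to-block adaptive rule described in the text.

    Assumption: the level never drops below min_level (not stated in the text).
    """
    if auditory_errors < 3 and visual_errors < 3:
        return level + 1                      # few errors in both modalities
    if auditory_errors + visual_errors > 5:
        return max(min_level, level - 1)      # more than five errors in total
    return level                              # otherwise unchanged


def session_levels(block_errors, start_level=1):
    """Return the n-back level used on each block of a session.

    block_errors is a list of (auditory_errors, visual_errors) per block.
    """
    levels = []
    level = start_level
    for auditory_errors, visual_errors in block_errors:
        levels.append(level)
        level = next_nback_level(level, auditory_errors, visual_errors)
    return levels


# Hypothetical error counts for five blocks.
example_levels = session_levels([(0, 1), (2, 2), (3, 3), (1, 4), (4, 4)])
```

The two conditions cannot both hold, because fewer than three errors in each modality implies at most four errors in total.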

All adaptive training and posttest sessions used the same n-back procedure as the pretest session, except that the n-back level in the first block was the average level that the participant obtained in the previous session. For nonadaptive training sessions, n-back level was held constant at the average level obtained in the pretest session. Importantly, the n-back level was set individually for each member of the nonadaptive group.

Cued recall span task

On each trial of this task, participants were presented with a series of digits. Each digit was presented in the center of the screen for 1,500 ms, followed by a 500-ms ISI. At the end of each trial, participants were asked to recall one digit. Participants were shown a row of 12 boxes, 1 of which was colored green, and the position of the green box indicated which digit was to be recalled. For example, if the third box was green, participants were to recall the third digit. Participants responded by touching the corresponding number on the screen. Participants were never asked to recall digits from positions 10–12. Series length ranged from 3 to 12, and participants completed 20 trials (2 trials at each length). One point was awarded for each correct trial, and points were summed across trials to create a score reflecting verbal short-term memory capacity.
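The trial structure and scoring of this task might be sketched as below. The text does not state how digits were sampled or how trials were ordered, so the uniform sampling of digits 0-9 with repetition, the ascending order of series lengths, and all names are assumptions made only for illustration.

```python
import random


def make_cued_recall_trials(rng, min_len=3, max_len=12, per_len=2, max_cue=9):
    """Build 2 trials at each series length from 3 to 12 (20 trials).

    The cued position never exceeds max_cue, so positions 10-12 are
    never probed. Digit sampling is an assumption, not from the text.
    """
    trials = []
    for length in range(min_len, max_len + 1):
        for _ in range(per_len):
            digits = [rng.randint(0, 9) for _ in range(length)]
            cue = rng.randint(1, min(length, max_cue))  # 1-based position
            trials.append({"digits": digits, "cue": cue})
    return trials


def score_cued_recall(trials, responses):
    """One point per trial on which the cued digit was reported."""
    return sum(
        1 for trial, response in zip(trials, responses)
        if trial["digits"][trial["cue"] - 1] == response
    )


example_trials = make_cued_recall_trials(random.Random(0))
perfect_responses = [t["digits"][t["cue"] - 1] for t in example_trials]
perfect_score = score_cued_recall(example_trials, perfect_responses)
```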

Focus-switching task

In the first part of the task, a series of small and large squares were presented one at a time in the center of the screen. With each presentation, participants were instructed to update their count of how many squares of each size they had seen so far in the series and then to press the space bar to see the next square. Thus, participants were required to keep two running counts, one for each square size, and were asked to report both counts at the end of each series. The order in which participants were asked to report their counts (i.e., report small or large squares first) was randomized. The total number of squares in each series ranged from 16 to 20, and participants saw two series of each length.

In the second part of the task, a series of squares, circles, and triangles were presented one at a time, and participants kept three running counts (i.e., one for each shape). The procedure was otherwise the same. In both parts of the task, each shape could be preceded by either the same shape (a repeat trial) or a different shape (a switch trial). RTs were measured from the presentation of a shape to when the participant pressed the space bar to see the next shape. Repeat trial RTs were used as a measure of WM updating, and switch costs (i.e., switch trial RTs minus repeat trial RTs) were used as a measure of focus switching.
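The two measures derived from this task can be illustrated with a short sketch. It assumes that the first item of each series is neither a repeat nor a switch trial and that RTs are simply averaged within each trial type; the function name and the example RTs are hypothetical.

```python
from statistics import mean


def focus_switching_measures(shapes, rts):
    """Return (mean repeat RT, switch cost) for one series.

    A repeat trial shows the same shape as the preceding trial; a switch
    trial shows a different shape. The first trial has no predecessor and
    is excluded (an assumption). mean() raises StatisticsError if a series
    contains no repeat trials or no switch trials.
    """
    repeat_rts, switch_rts = [], []
    for previous, current, rt in zip(shapes, shapes[1:], rts[1:]):
        if current == previous:
            repeat_rts.append(rt)
        else:
            switch_rts.append(rt)
    repeat_mean = mean(repeat_rts)
    switch_cost = mean(switch_rts) - repeat_mean
    return repeat_mean, switch_cost


# Hypothetical series from the first part of the task (RTs in ms).
example_shapes = ["small", "small", "large", "large", "small"]
example_rts = [900, 1000, 1600, 1200, 1800]
updating_rt, switch_cost = focus_switching_measures(example_shapes, example_rts)
```

Here the repeat trials (1,000 and 1,200 ms) give an updating measure of 1,100 ms, and the switch trials (1,600 and 1,800 ms) give a switch cost of 600 ms.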

Grid span task

On each trial of this task, participants were first presented with an empty 4 × 5 grid in which a series of red Xs appeared one at a time. Each X was presented for 1,750 ms, followed by a 750-ms ISI. At the end of each series, participants were shown an empty grid and were asked to touch all of the locations in which they had seen a red X; when a location was touched, a black X appeared in that location. Participants received one point for each trial on which they correctly recalled all locations, and points were summed across trials to create a score reflecting spatial short-term memory capacity. Series lengths ranged from 2 to 11, and participants completed 20 trials (2 trials at each length).
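The all-or-nothing scoring of this task can be sketched as follows. The sketch assumes that the order in which locations were touched was not scored and that touching an extra location counted as an error; neither point is stated explicitly in the text, and the names are hypothetical.

```python
def score_grid_trial(presented, touched):
    """Return 1 if exactly the presented locations were touched, else 0.

    Locations are (row, column) pairs in the 4 x 5 grid. Order of touching
    is ignored and extra touches count as errors (both assumptions).
    """
    return 1 if set(presented) == set(touched) else 0


def score_grid_span(trials):
    """Sum trial scores over a list of (presented, touched) pairs."""
    return sum(score_grid_trial(presented, touched) for presented, touched in trials)


# Hypothetical trials: one correct, one with a missed location, one with an extra touch.
example_trials = [
    ([(0, 1), (2, 3), (3, 4)], [(3, 4), (0, 1), (2, 3)]),
    ([(0, 1), (2, 3), (3, 4)], [(0, 1), (2, 3)]),
    ([(1, 1), (2, 0)], [(1, 1), (2, 0), (0, 0)]),
]
example_score = score_grid_span(example_trials)
```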

Operation span task

On each trial of this complex WM span task, participants saw a series of words. Before the presentation of each word, participants were shown a mathematical equation [e.g., (2 × 4) + 5 = 13] and indicated whether the equation was correct or incorrect by pressing one of two keys on the keyboard. Each word was presented in the center of the screen for 1,500 ms, followed by a 250-ms ISI. At the end of each trial, participants were asked to recall as many words as possible in the correct order. Participants were awarded one point for each word recalled in the correct serial position and half a point for each word in an incorrect serial position. Points were summed across trials to create a score reflecting executive attention. Series length ranged from 2 to 7, and participants completed 12 trials (2 trials at each length).

Running span task

On each trial of this task, participants heard a series of recorded digits presented at a rate of 250 ms per digit. After each series, participants were asked to recall as many digits as possible from the end of the series in the order of presentation. Participants entered responses using the keyboard and were instructed to use the “m” key as a placeholder as needed. One point was awarded for each digit recalled in the correct serial position, and points were averaged across trials to create a score reflecting the capacity of the focus of attention. Series length was 12, 14, 16, 18, or 20 digits, and participants completed 40 trials (8 trials at each length).
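One way to score this task is sketched below. The text does not specify how responses were aligned with the presented series, so the sketch assumes that the response is aligned with the end of the series (the last entry typed is compared with the last digit heard, and so on backward) and that placeholder entries earn no credit. The names and example data are hypothetical.

```python
from statistics import mean


def score_running_span_trial(series, response, placeholder="m"):
    """Count digits recalled in the correct serial position.

    Both arguments are strings. The response is aligned with the END of
    the series (an assumption), and placeholder entries are skipped.
    """
    points = 0
    for presented, typed in zip(reversed(series), reversed(response)):
        if typed != placeholder and typed == presented:
            points += 1
    return points


def running_span_score(trials):
    """Average the trial scores over a list of (series, response) pairs."""
    return mean(score_running_span_trial(series, response) for series, response in trials)


# Hypothetical 12-digit series; the first response uses a placeholder.
example_trials = [("583920174625", "m625"), ("583920174625", "4625")]
example_score = running_span_score(example_trials)
```

The first response earns 3 points and the second 4 points, for an average of 3.5.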

Results

As may be seen in Fig. 1, adaptive training resulted in systematic improvement in n-back performance across sessions. To examine the effects of training within each group, paired t-tests were performed on pre- and posttest scores. Significant improvements in n-back performance were revealed for each group [adaptive training, t(12) = 4.62, p = .001; nonadaptive training, t(12) = 4.22, p = .001; and no-contact control, t(25) = 2.68, p = .013]. These differences were still significant after correcting for multiple comparisons.
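For readers unfamiliar with the test, the paired t statistic is the mean of the posttest minus pretest differences divided by the standard error of those differences, with n − 1 degrees of freedom. The sketch below computes it with the Python standard library only. The scores are invented for illustration and are not data from the present study, and obtaining a p value would additionally require the t distribution (e.g., from a statistics package).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test on post minus pre scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per participant")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# Hypothetical n-back levels for four participants (not study data).
example_pre = [2, 3, 2, 4]
example_post = [3, 5, 4, 5]
example_t, example_df = paired_t(example_pre, example_post)
```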

Fig. 1 Daily n-back level performance on the middle 16 blocks for the adaptive training group

Additional analyses were conducted using multivariate analyses of variance (MANOVAs) in order to further protect against the inflated Type I error rate that can result from conducting multiple tests. An overall multivariate test revealed that the three groups did not differ on any measure at pretest, F(12, 88) = 1.26, n.s. Descriptive statistics for all measures are given in Table 1.

Table 1 Descriptive statistics

To compare the n-back improvements across groups (see Fig. 2), as well as to determine whether there were significant improvements on any untrained task, a MANOVA was performed on the difference between posttest and pretest performance for each task. The overall multivariate test was significant, F(12, 88) = 3.41, p < .001, η² = .534, providing justification for comparing pre- and posttest differences among the three groups. Further multivariate and univariate tests revealed significant group effects for only the n-back task and the running span task. For the n-back task, the adaptive training group improved significantly more than the no-contact control group, F(1, 49) = 31.40, p < .001, η² = .386, and the nonadaptive training group, F(1, 49) = 5.84, p = .019, η² = .072. Significant group differences were also observed on the running span task: As may be seen in Fig. 3, the adaptive training group again improved significantly more than the no-contact control group, F(1, 49) = 5.87, p = .019, η² = .177, and the nonadaptive training group, F(1, 49) = 10.61, p = .002, η² = .098.

Fig. 2 N-back performance difference scores (posttest − pretest) for each group

Fig. 3 Running span performance difference scores (posttest − pretest) for each group

The multivariate test comparing the nonadaptive training group and the no-contact control group was not significant, F(6, 44) = 1.35, and so no univariate tests were performed. No significant group effects were found on the cued recall span task, grid span task, or operation span task, or on the updating and switch cost measures from the focus-switching task. In addition, because analyses using difference scores are somewhat controversial, analyses of covariance were performed on the posttest scores from each task, with the pretest scores used as covariates, and the results of these analyses were no different from those of the MANOVA.

Discussion

The goal of the present study was to evaluate five components of the dual n-back task potentially responsible for the WM benefits observed following n-back training. As was expected, adaptive training resulted in substantial improvements in dual n-back task performance. Importantly, participants in the adaptive training group also showed improvement on the running span task, a measure of the capacity of the focus of attention. No significant improvements were found on the other four untrained tasks, which assessed short-term memory capacity, updating, focus switching, and executive attention. Although no task is process-pure, the observed improvement on the running span task suggests that one important reason that dual n-back performance improves during adaptive training is that such training expands the capacity of the focus of attention. Whereas the other tasks also undoubtedly involve the focus of attention, performance on the running span task would appear to be the most dependent on the capacity of that focus. It should also be noted that the present study included an active control group in addition to a no-contact control group, an important methodological improvement over many previous WM training studies, and so was able to control not only for simple practice effects, but also for exposure to the n-back task and time spent with experimenters.

The present study is not the first to show that training can expand the capacity of the focus of attention. On the basis of examination of RTs on a modified n-back task, Verhaeghen et al. (2004) reported that practice increased the number of items immediately accessible in the focus of attention. In addition, Dahlin, Nyberg, Bäckman, and Neely (2008) observed improvements in performance on a single n-back task following training on a running span task. The present study represents a systematic replication of Verhaeghen et al. and Dahlin et al., providing more evidence for the link between the running span task, the n-back task, and the focus of attention. Importantly, the present study is the first to isolate the basis for the training effect. Whereas Dahlin et al. were interested in how far the benefits of training extended, the goal of the present study was to identify the basis for those benefits. By including five untrained tasks in addition to the dual n-back task, we were able to show that increases in the capacity of the focus of attention were responsible for the improvements in dual n-back performance following adaptive training.

The present results argue against the hypothesis that WM training can improve gF (e.g., Jaeggi et al., 2008). If that were so, one should expect the benefits of dual n-back training to generalize to other WM tasks (e.g., operation span), which have been previously shown to correlate with gF (e.g., Engle, Tuholski, Laughlin, & Conway, 1999). It is possible that if participants had completed additional training sessions, transfer to other tasks besides the running span task might have been observed. It should be noted, however, that the adaptive training group’s improvement in n-back performance was considerable, suggesting that 8 days of training is sufficient to produce substantial benefits: As compared with pretest performance, participants in the adaptive training group showed an increase in the number of items (letters and locations) that could be reliably maintained in memory from 4.18 to 7.12 items, more than a 70 % increase.
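The size of this increase can be checked directly from the two reported means. The variable names below are introduced only for this illustration.

```python
# Mean number of items maintained, as reported in the text.
pretest_items = 4.18
posttest_items = 7.12

# Percentage increase relative to pretest performance.
percent_increase = (posttest_items - pretest_items) / pretest_items * 100
```

The result is slightly above 70, consistent with the figure given in the text.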

Notably, the improvement observed on the running span task, rather than being due solely to an increase in the capacity of the focus of attention, may also reflect similarities between the running span and n-back tasks. Perhaps the greatest similarity between the two tasks is that, unlike many memory tasks, they both require participants to remember only the last few items in the series, rather than the series in its entirety, meaning that formerly relevant items in the focus of attention must be constantly replaced with new items. However, we found no evidence that adaptive training improved active updating, at least as measured by RTs on the focus-switching task. Although some researchers believe that the running span task measures updating (e.g., Dahlin et al., 2008), Bunting et al. (2006) specifically argued that although the running span task involves updating when items are presented at a slow rate, the role of active updating, as opposed to passive overwriting, is greatly diminished when items are presented at a fast rate like that used in the present study.

In addition to their similarities, the n-back task and the running span task also have considerable differences. The n-back task involves the recognition of previous items, whereas the running span task involves serial recall. Moreover, the two tasks differ in when these responses are required: Participants potentially respond after every trial (i.e., after the presentation of each box/letter pair) in the n-back task, but only at the end of each series (i.e., after a minimum of 12 digits) in the running span task. The pace of the two tasks is also quite different. In the n-back task, each stimulus pair is presented for 500 ms, followed by a 2.5-s interval in which participants respond and update the contents of their memory. The pace of the running span task is much quicker because digits are each presented for only 250 ms and are followed immediately by the next digit. These differences suggest that the present results are not merely due to surface-level task similarities or strategy overlap but, rather, to n-back training improving a specific WM process that is also tapped by the running span task.

It is possible, of course, that rather than conceptualizing the shared process as the focus of attention, it might be better conceptualized in other terms, for example, as the capacity or efficiency of primary memory (Unsworth & Engle, 2007). Alternatively, it is possible that because of the similarities between the n-back and running span tasks, adaptively trained participants were able to apply a strategy developed during n-back training to the running span task in the posttest session. Although this cannot be definitively ruled out, it is unclear what that strategy would be. Moreover, because the rate of item presentation is much faster on the running span task than on the n-back task, we suspect that the ability to generalize a strategy learned during training would be quite limited. Nevertheless, it remains possible that similarities between the two tasks drive the observed transfer, and further research is needed to address this issue.

How far and to what abilities WM training can transfer is controversial, but it is clear that training does have a profound effect on WM performance. In every WM training study of which we are aware, participants show large improvements in their performance on the trained task by the final session. These training benefits are themselves important, regardless of whether transfer is also observed, but few previous studies have examined the mechanism underlying these benefits. The present study offers evidence that an increase in the capacity of the focus of attention may underlie the training benefits seen on the adaptive n-back task.