Transfer of working memory training to the inhibitory control of auditory distraction
Working memory refers to a cognitive system of limited capacity which enables temporal storage and processing (e.g., manipulation, monitoring) of information to support thought and action processes (see Baddeley
2003; Cowan
2017; Miyake and Shah
1999). It has been shown that individual differences in the capacity of working memory are related to a number of complex cognitive or verbal abilities, such as reasoning (Fry and Hale
1996; Kyllonen and Christal
1990), problem-solving and general intelligence (e.g., Conway et al.
2003; but see Harrison et al.
2013), reading comprehension (Daneman and Carpenter
1980; Engle et al.
1991), and selective listening in cocktail-party situations (Conway et al.
2001). Working memory impairment, on the other hand, has been associated with attention deficits and learning disabilities (Alloway
2009; Martinussen et al.
2005). More recently, several studies demonstrated that working memory capacity can be enhanced through extensive cognitive training, both in children and adults, leading to improvement on various cognitive tasks addressing reading comprehension, executive control, episodic memory, or fluid intelligence (Buschkuehl et al.
2008; Chein and Morrison
2010; Dahlin et al.
2008a,
b; Jaeggi et al.
2008,
2010; Klingberg et al.
2002; Salminen et al.
2012; Schmiedek et al.
2010; Thorell et al.
2009). Several well-controlled studies, however, failed to replicate these widespread transfer effects resulting from working memory training (Melby-Lervåg and Hulme
2013; Redick et al.
2013; Thompson et al.
2013). Therefore, reviews and meta-analyses on the efficacy of working memory training drew rather inconsistent conclusions (Au et al.
2015; Dougherty et al.
2016; Karbach and Verhaeghen
2014; Melby-Lervåg et al.
2016; Melby-Lervåg and Hulme
2013; Soveri et al.
2017; von Bastian and Oberauer
2013b). There is still an ongoing debate regarding the specific cognitive functions that benefit from working memory training, and to what extent the training-related improvement of these functions yields transfer to untrained tasks that require more generalized cognitive abilities, such as cognitive flexibility, problem-solving, or fluid intelligence. From the empirical data available, it can be concluded that transfer of working memory training is more likely in transfer tasks that are structurally similar to the trained tasks (near transfer) than when the transfer tasks share only a few features with the trained task (far transfer), but there is still very little understanding of the exact cognitive mechanisms and components of working memory that enable transfer (Gathercole et al.
2019; Shipstead et al.
2010; Simons et al.
2016).
Most models of working memory distinguish (a) one or multiple storage buffers or maintenance components from (b) a component for executive control which enables monitoring and manipulation of the stored information (Baddeley
1996,
2003; Baddeley and Hitch
1974; Engle
2002; Miyake and Shah
1999; Oberauer et al.
2000). Cognitive training tasks, such as the dual
n-back task, which has been shown to successfully enhance working memory capacity (i.e., the number
n of items to-be-updated in working memory; see Jaeggi et al.
2008), typically require both maintenance and executive control (e.g., updating) of the information in working memory, but it is still unclear which executive functions benefit most from cognitive training, and how the training-related improvement is related to transfer. Working memory updating and monitoring, set shifting (i.e., cognitive flexibility or task-switching), and inhibition were found to be the three major functions of executive control, which are involved in many cognitively demanding tasks (Miyake et al.
2000), but a majority of studies on working memory training seem to have used tasks that primarily require the updating and monitoring component (e.g., Dahlin et al.
2008a,
b; Jaeggi et al.
2008; Kühn et al.
2013; Lilienthal et al.
2013; Salminen et al.
2016). In a typical dual
n-back task, participants are presented with two running sequences of stimuli (auditory and visual), of which only the items of the last few (n) trials need to be memorized. The participants’ task is to indicate whether either of the two items on the current trial is identical to one of the items that were presented exactly
n trials before. Hence, it is required to continuously monitor and update the information to be maintained in working memory, but the task may also involve inhibition of currently irrelevant items and attention shifting between the two sequences or stimulus modalities. More specifically, it has been suggested that the
n-back task requires not only encoding, storage, and rehearsal of items, but also discarding (inhibition) of previously encoded items and repositioning (updating) of the to-be-remembered information in working memory (Postle et al.
2001). While the empirical results are still scarce and also inconsistent, there is some evidence suggesting that extended training on the dual
n-back task does indeed improve updating and monitoring, whereas it may not necessarily generalize to other functions of executive control, such as set shifting or inhibition (Dahlin et al.
2008a,
b; Salminen et al.
2012; von Bastian and Oberauer
2013a).
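The matching rule of the dual n-back task described above can be made concrete with a short sketch. This is only an illustration under simplifying assumptions, not the software used in the study: the function names, the integer coding of visual positions, and the letter coding of auditory items are hypothetical.

```python
from typing import List, Sequence, Tuple


def nback_targets(stream: Sequence, n: int) -> List[bool]:
    """For each trial, report whether the item equals the item shown n trials earlier."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The first n trials can never be targets because no item exists n steps back.
    return [i >= n and stream[i] == stream[i - n] for i in range(len(stream))]


def dual_nback_targets(visual: Sequence, auditory: Sequence, n: int) -> List[Tuple[bool, bool]]:
    """Pair the per-trial target status of the visual and the auditory stream."""
    if len(visual) != len(auditory):
        raise ValueError("both streams must have the same number of trials")
    return list(zip(nback_targets(visual, n), nback_targets(auditory, n)))


if __name__ == "__main__":
    visual = [3, 1, 3, 4, 3, 4]                # hypothetical screen positions
    auditory = ["k", "q", "r", "q", "r", "q"]  # hypothetical spoken letters
    for trial, (vis, aud) in enumerate(dual_nback_targets(visual, auditory, n=2), start=1):
        print(trial, "visual target" if vis else "-", "auditory target" if aud else "-")
```

Because a target on trial i is defined only relative to trial i - n, items older than n trials become irrelevant and must be discarded, which is the updating demand referred to in the text.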
The purpose of the present study is to investigate whether working memory training can be used to enhance the inhibitory control function of working memory. Individual differences in the strength of inhibitory control were shown to predict both the development and the age-related decline of cognitive abilities (Diamond and Gilbert
1989; Hasher and Zacks
1988; Salthouse and Meinz
1995). These findings suggest that inhibitory control may also benefit from cognitive training, which might have important implications, in particular for the maintenance of inhibition in older age. However, it has been argued that inhibition may not be a unitary mechanism, but may instead comprise three functionally distinct processes (see Friedman and Miyake
2004): (1) suppression of pre-potent or automatic responses (as in a Stroop task; Stroop
1935), (2) inhibitory control of the interference produced by irrelevant stimuli (as in a flanker task; Eriksen and Eriksen
1974; or in an “irrelevant sound paradigm”; Jones and Macken
1993; Salamé and Baddeley
1982), and (3) inhibition of information in memory (e.g., to avoid proactive interference). It has been found that inhibition of pre-potent responses and inhibition of irrelevant stimuli (interference control) may be closely related, whereas inhibition of proactive interference seems to be a separate process (Friedman and Miyake
2004).
While there is some indication that pre-potent response inhibition can be improved with practice (in particular when combined with transcranial direct current stimulation; Ditye et al.
2012), very little is known about the possible effects of an extended working memory training on the other forms of inhibitory control. Here, two different types of working memory training, varying in the degree of inhibitory control required, were compared with regard to their transfer effects on (a) the ability to suppress pre-potent responses (response inhibition) and (b) the ability to inhibit interference from irrelevant auditory information (resistance to auditory distraction). Specifically, one group of participants was trained on a standard dual
n-back task which is supposed to involve primarily updating and monitoring of contents in working memory (Braver et al.
1997; Jaeggi et al.
2007), and possibly, to some extent, inhibitory control processes, such as the inhibition of irrelevant stimulus information (Postle et al.
2001). To experimentally enhance the degree of inhibitory control involved in the dual
n-back, a second group was trained on an “inhibitory” version of the dual
n-back task (inhibitory
n-back) in which a response had to be given on the majority of trials, and participants occasionally had to withhold the response depending on the current information held in working memory (i.e., on “
n-back trials”). Both types of
n-back training are expected to enhance working memory updating skills, which were tested with an untrained visual updating task before and after training (adapted from Dahlin et al.
2008a). Moreover, assuming that response inhibition and resistance to distractor interference are closely related (Friedman and Miyake
2004), any training-related improvement on the inhibitory dual
n-back task may be expected to induce more transfer to performance in other tasks that require either the suppression of pre-potent responses or inhibitory control of irrelevant stimuli, compared to the standard dual
n-back task with lower demands for inhibition. Therefore, transfer of the two types of working memory training was assessed in terms of both response inhibition and the degree of interference produced by auditory distractors. In addition, far transfer was tested for unrelated executive functions (i.e., task-switching) and more generalized cognitive abilities (i.e., problem-solving skills related to fluid intelligence) for which transfer was reported previously (e.g., Jaeggi et al.
2008).
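The difference between the two training variants can be illustrated by their response rules. The sketch below assumes that a key press is required only on n-back match trials in the standard variant and on all non-match trials in the inhibitory variant; this mapping, the function names, and the example trial sequence are illustrative assumptions rather than a description of the actual training software.

```python
from typing import Sequence


def should_press(is_match: bool, inhibitory: bool) -> bool:
    """Return whether a key press is required on a trial (assumed response rule)."""
    # Standard variant (assumed): press only when the item matches the one n trials back.
    # Inhibitory variant (assumed): press by default and withhold on n-back matches.
    return (not is_match) if inhibitory else is_match


def press_rate(matches: Sequence[bool], inhibitory: bool) -> float:
    """Proportion of trials on which a key press is required."""
    presses = [should_press(m, inhibitory) for m in matches]
    return sum(presses) / len(presses)


if __name__ == "__main__":
    # Hypothetical block of eight trials with two n-back matches.
    matches = [False, False, True, False, True, False, False, False]
    print("standard:", press_rate(matches, inhibitory=False))
    print("inhibitory:", press_rate(matches, inhibitory=True))
```

Under these assumptions, pressing is the frequent and therefore pre-potent response in the inhibitory variant, so that each n-back match requires the suppression of a dominant response.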
Generalization to response inhibition was assessed with the Simon task (Hedge and Marsh
1975) in which a target is presented at a location that is either spatially compatible or incompatible with the location of the response. Specifically, on compatible trials a response needs to be made by the hand that corresponds to the location of the target (the pre-potent response), whereas on incompatible trials, the response needs to be made by the other hand and the pre-potent response needs to be inhibited. Typically, response time increments are observed on incompatible trials, as compared to compatible trials (the Simon effect). The inhibitory working memory training could be expected to affect response inhibition: If the training-related enhancement of inhibitory control enhanced the ability to suppress pre-potent, dominant, or automatic responses (Friedman and Miyake
2004), then reduced Simon effects should be observed at post-test in the inhibitory
n-back group.
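As an illustration of the dependent measure, the Simon effect can be computed as the difference in mean response time between incompatible and compatible trials. The following sketch assumes trials coded as (compatible, response time in ms) pairs for correct responses only; the function name and the example values are hypothetical and are not data from the study.

```python
from statistics import mean
from typing import Iterable, Tuple


def simon_effect(trials: Iterable[Tuple[bool, float]]) -> float:
    """Mean RT on incompatible trials minus mean RT on compatible trials (ms)."""
    trials = list(trials)
    compatible = [rt for is_compatible, rt in trials if is_compatible]
    incompatible = [rt for is_compatible, rt in trials if not is_compatible]
    if not compatible or not incompatible:
        raise ValueError("both trial types are needed to compute the Simon effect")
    return mean(incompatible) - mean(compatible)


if __name__ == "__main__":
    # Hypothetical response times (ms), not data from the study.
    pre_test = [(True, 420), (False, 470), (True, 440), (False, 490)]
    post_test = [(True, 410), (False, 440), (True, 430), (False, 460)]
    print("pre-test Simon effect:", simon_effect(pre_test))
    print("post-test Simon effect:", simon_effect(post_test))
```

A smaller value at post-test than at pre-test would correspond to the reduced Simon effect predicted for the inhibitory training group.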
In addition, transfer of the inhibitory training could be expected also with regard to inhibitory control of auditory distraction. It is well known that task-irrelevant sound, such as speech or random tone sequences, disrupts performance in serial short-term memory tasks (e.g., Colle and Welsh
1976; Jones et al.
2004; Jones and Macken
1993; LeCompte et al.
1997; Salamé and Baddeley
1982). While these disruptions were originally explained with speech-related interference-by-content in the “phonological loop” (Baddeley and Hitch
1974; Salamé and Baddeley
1982), it has been shown later that similar disruption can be produced also by non-phonological sound (e.g., changing tones; Jones and Macken
1993), and it has been suggested that the interference may be specific to the processing of serial order in short-term memory (e.g., Jones and Macken
1993,
1995). More specifically, according to the object-oriented episodic record account (Jones et al.
1996), auditory distraction is assumed to be a by-product of perceptual organization processes which enable the segregation and grouping of auditory objects (during auditory scene analysis; Bregman
1990). Any change in the state of background sound is expected to give rise to the formation of a new auditory object, which is automatically linked to the previous objects (using “pointers”), thus creating an ordered stream. In a serial recall task, articulatory rehearsal can be used (as a motor planning process) to deliberately form and refresh links between to-be-remembered items, thus enabling the maintenance and retrieval of serial information. However, automatic processes of auditory perceptual organization form additional links between task-irrelevant changing-state sounds, which then interfere with the deliberate motor planning and rehearsal processes during serial recall. In line with this interference-by-process account (Hughes and Marsh
2017; Jones et al.
2004; Jones and Macken
2018), it has been found that the degree of distraction increases with the magnitude (e.g., the distance in pitch between successive tones; Jones et al.
1999) and the number of changes between successive task-irrelevant auditory events within a given time interval (i.e., the word/token dose effect; Bridges and Jones
1996; Tremblay and Jones
1998, Exp. 5). In addition, changing-state sound (speech or varying tones) was found to disrupt performance in a serial recall task, but not in tasks that do not require serial-order processing (e.g., the “missing-item task”; Beaman and Jones
1997; Hughes et al.
2007; Jones and Macken
1993), unless participants happen to adopt a serial rehearsal strategy (Beaman and Jones
1998; Hughes and Marsh
2020b). In addition to this task-specific interference with serial-order processing, it has been proposed more recently that auditory distraction may arise also from attentional capture, with meaningful or acoustically deviating sounds diverting attention from the focal task (see the “duplex-mechanism account”; Hughes
2014; Hughes et al.
2005). In contrast to interference-by-process, this form of distraction appears to be less specific to serial-order processing (affecting performance also in non-serial short-term memory tasks; e.g., the “missing-item task”, Hughes et al.
2007; Vachon et al.
2017), and it may be more susceptible to cognitive control than interference-by-process (Hughes et al.
2013; Hughes and Marsh
2020a). Moreover, it has been reported that the degree of attentional capture elicited by auditory deviants, but not the changing-state effect (indicating interference-by-process), was related to the participants’ working memory capacity (Hughes et al. 2013; Sörqvist et al. 2010; but see Körner et al. 2017). It is not entirely clear to what extent the disruptive effect of irrelevant speech on serial recall is caused by acoustical interference with serial-order processing and attentional capture, but there is evidence suggesting that at least meaningful speech (e.g., full sentences as compared to lists of changing-state syllables or words) may disrupt performance through both mechanisms (see Bell et al.
2017; Hughes and Marsh
2020b). Moreover, the findings of reduced disruption of serial recall (1) after repeated presentation of the same stream of irrelevant speech (i.e., habituation; Banbury and Berry
1997; Bell et al.
2012), (2) in blind listeners with enhanced auditory processing abilities (Kattner and Ellermeier
2014), and (3) following a specific training of auditory attention (Kattner and Ellermeier
2020) suggest that the disruptive effect of irrelevant speech can be partially attributed to the diversion of attention.
In the present study, the transfer of cognitive training was assessed in terms of the disruptive effect of task-irrelevant speech, compared to noise, on serial recall. If the disruptive effect of speech depended on general working memory capacity, then it would be expected that both cognitive trainings with the dual
n-back task would reduce distraction. In contrast, if auditory distraction was specifically related to inhibitory control of irrelevant sound, then the inhibitory
n-back training should result in greater attenuation of auditory distraction than the standard
n-back training with lower demands for inhibitory control. Specifically, the inhibitory
n-back training might enhance the ability to resist or resolve interference from the external environment (Friedman and Miyake
2004). In line with the duplex-mechanism account of auditory distraction, it could be argued that attentional capture by irrelevant speech is likely to depend on inhibitory control, whereas the disruption due to changing-state sound (in irrelevant speech) should not depend on any form of cognitive control (Hughes
2014; Hughes et al.
2013). Therefore, enhanced inhibitory control (or resistance to interference from the external environment) could be expected to prevent the diversion of attention by irrelevant speech, whereas the presumably uncontrollable disruption due to the changing-state nature of speech should remain. A training of inhibitory control should thus lead to an attenuation, but not to a full elimination of the irrelevant speech effect. Alternatively, it could be argued also that enhanced inhibitory control of irrelevant changing-state sound (e.g., inhibiting the formation of irrelevant auditory streams) may reduce the specific interference between auditory grouping and the seriation process, which might then lead to a stronger attenuation or even an elimination of the irrelevant speech effect.
In addition to the transfer effects on performance in tasks that involve similar executive functions as the training tasks—working memory updating, suppression of pre-potent responses (Simon effect), and resistance to interference by irrelevant speech—the present study also tested the possibility of far transfer effects on (a) the response-time costs resulting from task-switching (indicating cognitive set-shifting abilities; Rogers and Monsell
1995) and (b) general problem-solving capabilities, which are related to fluid intelligence (Jaeggi et al.
2008).
Discussion
The present study showed that an extended cognitive training with two adaptive versions of the demanding dual
n-back task, varying in the degree of inhibitory control required, improved working memory capacity not only for the trained task (i.e., the value of
n increased from the first to the eighth training session), but also for a different type of updating task. However, the improvement on the untrained memory updating task differed significantly from that of the passive control group (which also showed a general improvement on the task) only for participants who were trained on the dual
n-back task with additional demands for response inhibition, but not for participants who were trained on the standard dual
n-back task. Hence, the transfer of a training on the dual
n-back task to general working memory updating abilities seems to depend on the degree of inhibitory control involved in the training task. This finding is quite consistent with a recent meta-analysis on the effects of
n-back training concluding that the magnitude of transfer from training on the standard dual
n-back task to other working memory paradigms, such as operation span or running span tasks, is very small (Soveri et al.
2017). The present findings suggest that an extensive training with cognitive tasks for which multiple executive functions are required (e.g., updating and inhibition) may enhance the likelihood of near transfer effects to other working memory tasks, as compared to a cognitive training with tasks which address only a single executive function (e.g., working memory updating in case of the
n-back task).
In addition to the near transfer effects of the present training to working memory updating, transfer was tested also with regard to two separate forms of inhibition. As the two types of
n-back training tasks differed only with regard to the demand for inhibitory control (i.e., to suppress the predominant response on
n-back trials), more transfer to other inhibition tasks was expected for participants who were trained on the inhibitory dual
n-back task, as compared to a training on the standard dual
n-back task. However, it was unclear whether to expect transfer to inhibitory control of pre-potent responses, resistance to distractor interference, or both (Friedman and Miyake
2004). The results indicate that the two training tasks did not differ with regard to transfer to pre-potent response inhibition. In fact, neither of the two
n-back trainings reduced the response-time costs for key presses that were spatially incompatible with the target location in the Simon task, as compared to the passive control group. This finding may be surprising given that the inhibitory dual
n-back task required participants to suppress the more frequent and hence pre-potent key presses throughout the eight training sessions. However, the training task did not require participants to solve a spatial compatibility conflict as in the Simon task. Hence, the present results suggest that the inhibition of a frequently occurring response (as during training) may depend on a form of inhibitory control that is functionally distinct from the inhibition that is required for the resolution of a stimulus-response compatibility conflict. Future research is required to determine whether the present training of inhibition within a dual
n-back working memory task generalizes to more similar types of pre-potent response inhibition tasks, such as the stop-signal task (Logan
1994; Verbruggen and Logan
2009).
By contrast, the two types of
n-back training yielded differential transfer effects with regard to inhibitory control of the interference produced by task-irrelevant speech distractors in a verbal short-term memory task. Specifically, participants who were trained with the newly developed inhibitory dual
n-back task, requiring continuous updating and inhibition of the contents in working memory, seem to have enhanced their resistance to irrelevant speech during serial recall. In contrast, a working-memory training with the standard dual
n-back task demanding less inhibitory control does not seem to have an effect on the magnitude of the irrelevant speech effect. This attenuation of auditory distraction after inhibitory
n-back training suggests that the training-related strengthening of inhibition enabled participants to reduce the interference from the external auditory environment (Friedman and Miyake
2004). With regard to accounts of auditory distraction, the finding indicates that inhibitory control may have prevented attentional capture by task-irrelevant speech, but it may not have reduced the disruptions due to interference-by-process (e.g., in line with the duplex-mechanism account; Hughes
2014). Specifically, the fact that the irrelevant speech effect was reduced, but not eliminated after a comprehensive inhibitory-control training indicates that enhanced inhibitory control may prevent the diversion of attention by irrelevant speech, whereas the (remaining) interference with the seriation process produced by the changing-state nature of irrelevant speech may not be susceptible to top-down control. The observation that a considerable portion of auditory distraction (presumably the changing-state effect) could not be eliminated by enhanced inhibitory control might suggest that the interference-by-process mechanism is not related to general working memory functions, and thus not susceptible to inhibitory or cognitive control (see Hughes
2014). This interpretation of the present results would be consistent also with other recent observations showing that only attentional capture, but not the changing-state effect, can be reduced through cognitive control (Hughes et al.
2013; Hughes and Marsh
2020a; Marsh et al.
2019), and is related to working memory capacity (Beaman
2004; Sörqvist
2010). The findings also fit well with the recent observation that the irrelevant speech effect was reduced (but not eliminated) after a training of auditory selective attention (using a dichotic-listening task; Kattner and Ellermeier
2020), indicating that the portion of the irrelevant speech effect which can be attributed to attentional capture may be susceptible to attentional control.
Nevertheless, since the irrelevant speech effect may comprise both attentional capture and interference-by-process mechanisms of auditory distraction, it is still possible that inhibitory control also resolves the (presumably uncontrollable) interference between changing-state sound and serial-order processing. Of course, the fact that the irrelevant speech effect was only attenuated, but not eliminated, after eight sessions of inhibitory-control training does not prove that it is the changing-state effect that remained. Moreover, it could be argued that eight sessions of inhibitory-control training may not be sufficient to eliminate the disruptive effect of irrelevant changing-state sound (an effect which appears to be very robust, having survived years of everyday mental activities requiring the inhibition of irrelevant information). For instance, there is evidence that irrelevant speech does not interfere at all with short-term memory in congenitally and early blind individuals (Kattner and Ellermeier
2014), suggesting that enhanced inhibitory control of auditory information resulting from a life-long experience with a primarily auditory environment may eliminate the interference-by-process portion of auditory distraction as well. Further research is required to determine whether the attenuation of the irrelevant speech effect in the present study was due to an attenuation of attentional capture or the task-specific interference-by-process. This could be accomplished, for instance, by contrasting the transfer effects of an inhibitory control training on the disruption produced by auditory deviants (which should be due to attentional capture alone) and changing-state sound (which should reflect interference-by-process). Alternatively, transfer effects could be investigated with regard to auditory distraction in non-serial short-term memory tasks, which are known to be immune to a changing-state effect (e.g., the missing-item task; Beaman and Jones
1997). If attentional capture depended on the strength of inhibitory control, then the often relatively small disruptive effect of irrelevant speech in the missing-item task (due to a diversion of attention) might be eliminated completely as a result of inhibitory
n-back training (compare Hughes and Marsh
2020b for a similar observation with regard to the effect of foreknowledge).
More generally, the present results indicate that an extensive cognitive training can be used not only to enhance working memory updating (Dahlin et al.
2008a,
b) and set shifting (Pereg et al.
2013), but also to strengthen the inhibitory-control function of working memory in terms of the inhibition of auditory distractor interference. The present study is the first to demonstrate that the extent of auditory distraction in short-term memory can be reduced experimentally through a working memory training with enhanced demands for inhibitory control in two stimulus modalities (i.e., inhibition of responses to visuospatial and auditory stimuli in the dual
n-back task). In contrast, the same working memory training with reduced demands for inhibitory control did not affect auditory distraction. Hence, the pattern of results indicates that the training-related decrease of interference by task-irrelevant auditory stimuli was not driven by working memory capacity in general, but rather by a specific inhibitory-control function (i.e., resistance to distractor interference; Friedman and Miyake
2004).
Finally, the present study also investigated possible far transfer effects of an extended dual
n-back training on (a) set shifting abilities and (b) fluid intelligence scores. Regardless of the degree of inhibitory control involved, the present dual
n-back training did not reduce the response-time costs resulting from switching between two different categorization tasks. This suggests that training on the
n-back task does not enhance executive set shifting abilities. Moreover, in contrast to previous findings (Jaeggi et al.
2008), the dual
n-back training did not affect fluid intelligence in the present study. Specifically, participants in all experimental groups were able to solve about 9.5–10 out of 18 problems of the short version of Raven’s Advanced Progressive Matrices test at pre-test (which is equivalent to the pre-test scores reported by Jaeggi et al.
2008). In contrast to the control group and the inhibitory dual
n-back training group, the average fluid intelligence score was slightly enhanced at post-test for participants who were trained on the standard dual
n-back task, but the group differences in gains on fluid intelligence did not turn out to be statistically significant. In line with several other recent findings of a lack of “far transfer” (Harrison et al.
2013; Melby-Lervåg et al.
2016; Redick et al.
2013), the present result seems to contradict the findings reported by Jaeggi et al. (
2008). However, the absence of a transfer effect to fluid intelligence might also be due to differences in the spacing of training times. Specifically, participants in the present study were trained for eight 80-min training sessions (breaks not included), whereas Jaeggi et al. (
2008) trained participants for either eight, twelve, seventeen or nineteen 25-min sessions. In the study by Jaeggi et al. (
2008), the training-related gain on fluid intelligence was shown to increase with the number of training sessions, and statistically significant gains of fluid intelligence relative to pre-test were found only after seventeen and nineteen 25-min sessions of training, but not after eight and twelve sessions of training. Hence, it seems that the transfer to fluid intelligence depends on the training dosage. However, in the present study, participants were trained for 640 min in total (8 × 80 min), so the total training time exceeded the nineteen training sessions in the Jaeggi et al. (
2008) study (i.e., 475 min). The fact that no reliable transfer to fluid intelligence was observed in the present study suggests that temporal spacing of training sessions (i.e., multiple short sessions) may enhance the chances of far transfer effects, as compared to massed training sessions (i.e., few long sessions).
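The dosage comparison in this paragraph rests on simple arithmetic, which can be checked as follows. The session counts and durations are taken from the text above; the variable names are illustrative.

```python
# Present study: eight sessions of 80 min each (breaks not included).
present_total_min = 8 * 80

# Jaeggi et al. (2008): 25-min sessions in groups of 8, 12, 17, or 19 sessions.
jaeggi_total_min = {sessions: sessions * 25 for sessions in (8, 12, 17, 19)}

if __name__ == "__main__":
    print("present study:", present_total_min, "min")
    for sessions, total in jaeggi_total_min.items():
        # Difference is positive when the present study involved more training time.
        print(f"{sessions} sessions: {total} min "
              f"(present study minus this group: {present_total_min - total} min)")
```

The total training time in the present study (640 min) thus exceeds even the longest schedule of the earlier study (475 min), so that the two studies differ in how the training was distributed over sessions rather than in the overall amount of training.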
Taken together, the present study shows that working memory capacity can be enhanced successfully with an extended training on two different types of dual n-back tasks varying in the degree of inhibitory control required. In general, transfer to cognitive abilities that are not directly related to the training task was very limited. However, in contrast to the standard n-back task with relatively low demands for inhibitory control, training on the newly developed inhibitory dual n-back task was found to reduce the degree of interference produced by irrelevant speech in a serial short-term memory task. This finding indicates that the inhibitory dual n-back task enhanced not only working memory updating abilities, but also inhibitory control of distractor interference, thus enabling more transfer to tasks for which these executive functions are required (e.g., inhibiting the processing of task-irrelevant speech). More research is required to disentangle the effects of enhanced inhibitory control on attentional capture and interference-by-process mechanisms of auditory distraction, and to assess possible transfer effects of an inhibitory working memory training on other forms of inhibitory control, such as pre-potent response inhibition or inhibition of proactive interference in memory.