When a novel stimulus is encountered, an orienting response is elicited that directs attention toward that stimulus. As the same stimulus is presented repeatedly, however, the orienting response habituates (Öhman, 1979). Habituation of the orienting response is a simple form of learning and acts as an attentional filtering mechanism that enables people to attend selectively to goal-relevant information and adapt to the environment (Cowan, 1995; Sokolov, 1963; Waters, McDonald, & Koresko, 1977). According to Lubow’s (1989) framework, habituation depends on a memory process whereby the organism learns to associate goal-irrelevant stimuli with a no-consequence response. When a stimulus–no-consequence mapping has been established, the stimulus no longer captures attention, and its power to cause behavioral distraction is diminished. Consistent with this view, a number of studies have shown that people are less distracted by an irrelevant sound they have been exposed to previously (Banbury & Berry, 1997; Debener, Kranczioch, Herrmann, & Engel, 2002; Elliott & Cowan, 2001; Sams, Alho, & Näätänen, 1984; Waters et al., 1977), suggesting that people learn to associate the sound with a no-consequence response. If learning processes underlie habituation, individual differences in memory abilities could modulate habituation rate. In the experiment reported here, we addressed this issue by investigating the role of individual differences in working memory capacity (WMC) in habituation to auditory distraction.

WMC is typically operationalized with complex-span tasks that require serial recall of items presented in between a series of distractor activities (Conway et al., 2005). Ample evidence suggests that those tasks measure a very general cognitive control mechanism (for a review, see Engle, 2002). For instance, high-WMC individuals are less likely to detect their own name spoken in a to-be-ignored channel (Conway, Cowan, & Bunting, 2001), less susceptible to attentional capture from irrelevant information in a visual display (Kane, Bleckley, Conway, & Engle, 2001), and superior at dividing attention across multiple channels (Colflesh & Conway, 2007), in comparison with their low-WMC counterparts. In general, WMC reflects the ability to control attention, constrain attention to relevant information, and deliberately inhibit responses to irrelevant stimuli (Heitz & Engle, 2007; Kane & Engle, 2003; Redick & Engle, 2006; Unsworth, Schrock, & Engle, 2004). Indeed, irrelevant auditory stimuli are not an exception. High-WMC individuals are generally less susceptible to auditory distraction (Beaman, 2004; Sörqvist, 2010b) and, central to the present investigation, less susceptible to attentional capture from abrupt changes in the sound environment (Sörqvist, 2010a). Moreover, working memory load manipulations influence the potency of irrelevant auditory stimuli to capture attention (Berti & Schröger, 2003; Dalton, Santangelo, & Spence, 2009; SanMiguel, Corral, & Escera, 2008). The capacity of working memory therefore appears to determine how well participants can constrain attention to focal materials in the presence of irrelevant sound and overrule attentional capture.

Whether WMC is also related to habituation to auditory distraction remains to be investigated. High-WMC individuals enjoy greater primary and secondary memory abilities (Unsworth & Engle, 2007) and greater selective attention capabilities (Kane et al., 2001) than do others, and since habituation of the orienting response appears to depend on memory abilities (Lubow, 1989) and is also associated with selective attention (Cowan, 1995), WMC may modulate habituation rate. In the experiment reported here, we used the cross-modal oddball paradigm as a vehicle to test this assumption. In this paradigm (Escera, Alho, Winkler, & Näätänen, 1998), the participants respond to visual targets that are preceded by a sound. The sound is the same on most trials (a standard), but infrequently, another sound is presented (a deviant). Response time to the targets is typically prolonged when the deviant is presented (hereinafter called a deviation effect). Evidence for habituation would be revealed if the magnitude of the deviation effect attenuates as a function of increased exposure to the deviant. We anticipated that higher WMC would be associated with greater habituation, since memory ability should influence how efficiently the participants learn to associate the deviant with a no-consequence response.

Method

Participants

A total of 54 university students (mean age = 24.98 years, SD = 3.82) participated in the experiment in exchange for a small honorarium. All reported Swedish as their native language, normal hearing, and normal or corrected-to-normal vision.

Materials and apparatus

Operation span

A computerized version of the operation span (OSPAN) task (Turner & Engle, 1989) was adopted to measure WMC. Mathematical operations [e.g., “Is (5 + 3) × 3 = 24?”] were presented on a computer screen. The participants were told to respond “yes” or “no” to the operation, as quickly as possible, by pressing a button on the keyboard. When a response was recorded, the screen went blank for 500 ms, and then a one-syllable noun (e.g., dog), which the participants were told to remember for later recall, was presented for 800 ms. Each word was presented only once during the task. When the to-be-remembered word disappeared, a new mathematical operation was presented or the list ended, depending on the length of the list. The list length varied from two to six words. A total of 10 lists were used (2 of each list length), and the length increased across the task. When a list ended, the participants were asked to recall the words in the order of presentation by typing on the computer keyboard.

Oddball task

The oddball task was modeled after Parmentier (2008). The participants were requested to categorize arrows as pointing either to the left (<<<) or to the right (>>>) by pressing the corresponding arrow key on the computer keyboard. They were told to use their dominant hand when pressing the button, to emphasize both speed and accuracy, and to ignore all sounds. At the beginning of each trial, a 200-ms sound was presented. The sound was either a 440-Hz sinewave tone (with a rise and fall time of 100 ms), used as the standard, or a burst of white noise (with a rise and fall time of 10 ms), used as the deviant. The sounds were normalized and were presented binaurally through headphones (Sennheiser HD 202) at approximately 65 dB(A). An arrow was presented at the offset of the sound. The arrow was visible for 600 ms before it was replaced by a 250-ms visual mask (###). The computer recorded the response latency between the onset of the arrow and when the participant pressed a button. A keypress later than 600 ms from the onset of the arrow was recorded as an error response. When the visual mask disappeared, the next trial was initiated. The experimental session began with a practice phase of 10 standard sound trials. Thereafter, the participants were presented with a total of 612 trials divided across six blocks of 102 trials each. There were 51 left- and 51 right-pointing arrows in each block presented pseudorandomly (i.e., no more than 3 similar arrows were presented in a row). The deviant was presented on 10 of the 102 trials in each block (separated by about 10 trials), and the standard was presented on all other trials. The six blocks were identical and were separated by a 25-s pause.
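
The trial structure of one block can be illustrated with a short sketch. This is not the software used in the experiment; it is a minimal Python illustration assuming that arrows are drawn at random under the run-length constraint and that one deviant falls near the middle of each successive window of about 10 trials. The function names (longest_run, make_arrows, make_block) and the exact placement rule for deviants are hypothetical.

```python
import random


def longest_run(items):
    """Length of the longest run of identical consecutive items."""
    best = run = 0
    previous = object()  # sentinel that equals no arrow label
    for item in items:
        run = run + 1 if item == previous else 1
        best = max(best, run)
        previous = item
    return best


def make_arrows(n_each, max_run, rng):
    """Draw n_each left and n_each right arrows with no run above max_run."""
    while True:  # restart in the rare case of a dead end
        remaining = {"left": n_each, "right": n_each}
        seq = []
        while len(seq) < 2 * n_each:
            # Block the last arrow if it already repeats max_run times.
            tail = seq[-max_run:]
            blocked = tail[0] if len(tail) == max_run and len(set(tail)) == 1 else None
            options = [a for a in remaining if remaining[a] > 0 and a != blocked]
            if not options:
                break
            weights = [remaining[a] for a in options]
            pick = rng.choices(options, weights=weights)[0]
            seq.append(pick)
            remaining[pick] -= 1
        if len(seq) == 2 * n_each:
            return seq


def make_block(n_trials=102, n_deviants=10, max_run=3, seed=None):
    """Return one block as a list of (sound, arrow) pairs."""
    rng = random.Random(seed)
    arrows = make_arrows(n_trials // 2, max_run, rng)
    sounds = ["standard"] * n_trials
    window = n_trials // n_deviants  # 10 trials with the default values
    for i in range(n_deviants):
        # ASSUMPTION: one deviant per window, at offset 3-6 inside it, so
        # successive deviants are 7-13 trials apart ("about 10 trials").
        # The offsets are tuned to the default window of 10 trials.
        sounds[2 + i * window + rng.randint(3, 6)] = "deviant"
    return list(zip(sounds, arrows))
```

Calling make_block(seed=1) six times with different seeds would give six different blocks; because the six blocks in the experiment were identical, a single generated block would instead be reused.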

Design and procedure

A within-participants quasi-experimental design was used. The participants sat alone in a quiet room in front of a computer and wore headphones during the whole experimental session. The computer controlled stimulus presentation and recording of responses. Written instructions were presented before each task. The OSPAN task was administered first, followed by the oddball task. The experiment took approximately 30 min to complete.

Results

Operation span

Alpha was set to .05 in all analyses. Recall of words was scored using a strict serial recall criterion (i.e., credit was given for each word recalled in the correct serial position), and the score for each list was multiplied by the length of the list in order to balance differences in list difficulty. For ease of presentation of the results, the participants were divided into two groups (high- and low-WMC individuals) by a median split of the OSPAN scores, but OSPAN was also used as a continuous variable in some analyses in order to give a more complete understanding of the relationship between WMC and habituation. The mean scores, expressed as a probability value, for the high- and low-WMC groups were .83 (SD = .09) and .51 (SD = .14), respectively. The mean score was .87 (SD = .15) for the operation part of the OSPAN task, and the operation scores were positively related to the recall scores, r(52) = .28, p < .05. Hence, there was no apparent trade-off between the two parts of the task.
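
The scoring rule can be made concrete with a small sketch. It rests on one reading of the description above, namely that the number of words recalled in the correct serial position is multiplied by list length and that the total is divided by the maximum attainable weighted score. That normalization, the function name ospan_score, and the example words are assumptions made for illustration.

```python
def ospan_score(lists):
    """Weighted strict-serial-recall score as a proportion (0 to 1).

    `lists` is a sequence of (presented, recalled) pairs, each a list of words.
    ASSUMPTION: per-list credit = words in the correct serial position,
    multiplied by list length; the sum is divided by the maximum possible.
    """
    earned = 0
    possible = 0
    for presented, recalled in lists:
        length = len(presented)
        correct = sum(
            1
            for position, word in enumerate(presented)
            if position < len(recalled) and recalled[position] == word
        )
        earned += correct * length
        possible += length * length
    return earned / possible


# Hypothetical example: a perfect two-word list and a three-word list
# in which only the first word is recalled in its correct position.
example = [
    (["dog", "cat"], ["dog", "cat"]),
    (["sun", "map", "pen"], ["sun", "pen", "map"]),
]
score = ospan_score(example)  # (2*2 + 1*3) / (2*2 + 3*3) = 7/13
```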

Oddball task

Mean accuracy, expressed as a probability value, was .90 (SD = .06) and .92 (SD = .04) on standard trials and .89 (SD = .10) and .94 (SD = .04) on deviant trials for low- and high-WMC individuals, respectively. Hence, differences between the two groups were small, although high-WMC individuals had higher accuracy than did low-WMC individuals on deviant trials, t(52) = 2.03, p < .05. Trials with incorrect responses were excluded from the response time analysis. As can be seen in Fig. 1, the magnitude of the deviation effect was relatively large for both groups at the beginning of the experiment, and it persisted in low-WMC individuals at the end. In contrast, in high-WMC individuals the deviation effect declined in later blocks and was abolished by the end of the experiment. These observations were supported by a 2 (group: high- vs. low-WMC individuals) × 2 (trial type: standard vs. deviant sound) × 6 (block: 1–6) analysis of variance (ANOVA) with mean response latency as the dependent variable. The analysis yielded no significant main effect of group, F(1, 52) = 2.61, MSE = 8,868.53, p = .11, ηp² = .05, no significant main effect of block, F(5, 260) = 1.41, MSE = 806.60, p = .22, ηp² = .03, and no significant interaction between block and group, F < 1, ηp² < .01. However, the main effect of trial type was significant, F(1, 52) = 47.01, MSE = 727.91, p < .01, ηp² = .48, the interaction between trial type and block was significant, F(5, 260) = 6.69, MSE = 175.60, p < .01, ηp² = .11, and, importantly, the three-factor interaction was significant, F(5, 260) = 2.91, MSE = 175.64, p < .05, ηp² = .05. To tease apart the three-way interaction, two follow-up ANOVAs were calculated, one for each WMC group.
For the low-WMC group, there was a significant main effect of trial type, F(1, 26) = 25.93, MSE = 856.98, p < .01, ηp² = .50, but no significant main effect of block, F < 1, ηp² = .02, and no significant interaction between trial type and block, F(5, 130) = 1.98, MSE = 177.73, p = .09, ηp² = .07. For the high-WMC group, there was a significant main effect of trial type, F(1, 26) = 21.16, MSE = 598.84, p < .01, ηp² = .45, no significant main effect of block, F(5, 130) = 1.51, MSE = 681.83, p = .19, ηp² = .06, and a significant interaction between trial type and block, F(5, 130) = 7.68, MSE = 173.48, p < .01, ηp² = .23.
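
As a reading aid, partial eta squared can be recovered from a reported F ratio and its degrees of freedom with the standard identity: effect df times F, divided by the sum of that product and the error df. The sketch below applies it to two effects from the omnibus ANOVA; the function name is hypothetical, and small discrepancies for other effects can arise because the reported F values are rounded.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Three-way interaction, F(5, 260) = 2.91
three_way = partial_eta_squared(2.91, 5, 260)

# Trial type x block interaction, F(5, 260) = 6.69
type_by_block = partial_eta_squared(6.69, 5, 260)

print(round(three_way, 2), round(type_by_block, 2))  # prints 0.05 0.11
```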

Fig. 1

Response times of individuals with high and low working memory capacity (WMC) to visual targets that were preceded by a frequently presented standard sound or by a rarely presented deviant sound, across six consecutive blocks of trials (only response times for correct responses are included). Error bars are standard errors of the means.

A repeated measures regression analysis on the whole range of data, using the difference score between standard and deviant trials as the dependent variable and block and OSPAN scores as independent variables, revealed a significant main effect of block, ΔR² = .06, F = 6.64, p < .01, a significant main effect of OSPAN scores, ΔR² = .01, F = 5.02, p < .05, and a significant interaction between block and OSPAN scores, ΔR² = .02, F = 2.56, p < .05. We used a residual analysis technique (Cronbach & Furby, 1970) to tease this interaction apart. For simplicity, the analysis was restricted to the beginning (block 1) and the end (block 6) of the experiment. In a first hierarchical regression analysis, we tested whether WMC was related to the deviation effect in the beginning. Mean response time on deviant trials in block 1 was selected as a dependent variable, mean response time on standard trials in block 1 was selected as an independent variable in the first step, and OSPAN scores were selected as an independent variable in the second step. A significant part of the variance was explained in the first step, R² = .48, β = .69, t = 6.90, p < .01. However, OSPAN scores were not significantly related to the residual variance left to be explained in the second step, ΔR² = .01, β = .11, t = 1.12, p = .27 (Fig. 2a). Hence, the magnitude of the deviation effect was unrelated to WMC in the beginning. We tested whether WMC was related to the deviation effect at the end in a corresponding hierarchical regression analysis with data from block 6. A significant part of the variance was explained in the first step, R² = .61, β = .78, t = 8.97, p < .01, and a significant negative relationship was found between OSPAN scores and the residual variance left to be explained in the second step, ΔR² = .03, β = −.18, t = −2.17, p < .05 (Fig. 2b).
Hence, higher WMC was associated with a lower magnitude of the deviation effect at the end, and this relationship was significantly different from the corresponding relationship at the beginning. In a final analysis, we investigated whether WMC was related to the magnitude of habituation. We first calculated the difference between standard and deviant trials in block 1 (M = −27.69, SD = 25.66) and the corresponding difference in block 6 (M = −10.46, SD = 25.05) and then selected the difference scores in block 6 as a dependent variable, the difference scores in block 1 as an independent variable in the first step, and OSPAN scores as an independent variable in the second step of a hierarchical regression analysis. Note that since more negative values represent a larger deviation effect, more positive residual values at the second step represent a higher degree of habituation. A significant part of the variance was explained at the first step, R² = .08, β = .28, t = 2.12, p < .05, and a significant positive relationship was revealed between OSPAN scores and the residual variance left to be explained in the second step, ΔR² = .12, β = .35, t = 2.77, p < .01 (Fig. 2c). Hence, higher WMC was associated with greater habituation. There was an outlier in the OSPAN data (z-value = −2.88). Control analyses with this participant removed revealed stronger relations than those reported above, and the conclusions were the same.
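
The logic of the two-step hierarchical regressions can be sketched as follows. This is not the analysis code behind the reported values; it is a minimal illustration that obtains R² for step 1 and ΔR² for step 2 from pairwise correlations, which is algebraically equivalent to fitting the two nested least-squares models when there are exactly two predictors. The function names are hypothetical, and the numbers generated at the end are simulated placeholders, not data from the experiment.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def hierarchical_r2(y, step1, step2):
    """Return (R2 of step 1, delta R2 when the step-2 predictor is added)."""
    r_y1 = pearson(y, step1)
    r_y2 = pearson(y, step2)
    r_12 = pearson(step1, step2)
    r2_step1 = r_y1 ** 2
    # R2 of the two-predictor model expressed through the correlations.
    r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_step1, r2_full - r2_step1


# SIMULATED placeholder data (not the study's data): 54 "participants".
rng = random.Random(0)
ospan_z = [rng.gauss(0, 1) for _ in range(54)]
diff_block1 = [rng.gauss(-28, 25) for _ in range(54)]
diff_block6 = [0.3 * d1 + 8 * z + rng.gauss(0, 20)
               for d1, z in zip(diff_block1, ospan_z)]

r2_first, delta_r2 = hierarchical_r2(diff_block6, diff_block1, ospan_z)
```

Here the block 6 difference score is the dependent variable, the block 1 difference score enters in the first step, and the OSPAN score enters in the second step, mirroring the final analysis described above.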

Fig. 2

Relationship between operation span scores (working memory capacity) as a continuous variable (z-values on x-axis) and (a) the magnitude of the deviation effect in block 1, (b) the magnitude of the deviation effect in block 6, and (c) the degree of change in magnitude of the deviation effect between block 1 and block 6 (habituation). Note that higher values on the y-axis in panels a and b represent a larger deviation effect, whereas higher values on the y-axis in panel c represent more habituation (i.e., how much smaller in magnitude the deviation effect is in block 6 than in block 1).

Discussion

This experiment shows that the magnitude of the deviation effect decreases with repeated exposure to the rare sound and that this decrease depends on WMC. Individual differences in WMC thus seem to modulate the rate of habituation to auditory distraction.

Implications for theories of habituation and auditory distraction

The relationship between WMC and habituation reported here is well in line with the idea that both habituation of the orienting response (Cowan, 1995) and WMC (Kane et al., 2001) are related to selective attention. Habituation does not appear to be a purely incidental and stimulus-driven phenomenon; deliberate cognitive control processes seem to contribute to habituation as well. In what way does WMC contribute to habituation? One possible interpretation is that WMC influences how efficiently stimulus–no-consequence mappings are stored in memory. Another possibility, based on the observation that high-WMC individuals have superior inhibition capabilities (Conway et al., 2001; Kane et al., 2001; Sörqvist, 2010a), is that WMC influences how efficiently people map the deviant sound to an inhibition response. A third possibility, based on the notion that deviants lose their captivating power when they are expected (Parmentier, Elsley, Andrés, & Barceló, 2011), is that WMC influences the participants’ ability to predict when deviants are presented.

Most of what is known about cognitive/behavioral effects (as opposed to neuroscientific effects) of background sound comes from the irrelevant-sound paradigm (e.g., Macken, Phelps, & Jones, 2009). In this paradigm, the participants are visually presented with sequences of items they are supposed to recall in order of presentation. When the items are presented against a background of sound, recall is typically impaired, at least when the sound stream changes from one element to the next (e.g., “k l m v r q c”), called a changing-state effect, or when there is an abrupt deviation from a repetition of a single sound element (e.g., “m m m m m m c”), called a deviation effect (Hughes, Vachon, & Jones, 2007). Recent evidence suggests that these two effects are caused by functionally distinct mechanisms (Hughes et al., 2007; Sörqvist, 2010a): the former by a conflict between the deliberate seriation processes involved in serial recall of the memory items and automatic processing of order between perceptually discrete sounds, the latter by an interruption of the focal activity due to attention being captured by the abrupt change. In this context, it is interesting to note that previous research on habituation to auditory distraction has shown diverse results. Habituation has been reported in several contexts outside the irrelevant-sound paradigm (Banbury & Berry, 1997; Elliott & Cowan, 2001; Waters et al., 1977), including neuroscientific evidence of habituation to auditory novels (Debener et al., 2002; Sams et al., 1984), but habituation to the changing-state effect does not appear to occur (Ellermeier & Zimmer, 1997; Röer, Bell, Dentale, & Buchner, 2011; Tremblay & Jones, 1998). The results reported here help explain why these studies are inconsistent, since they show that it is possible to habituate to attentional capture from irrelevant sound.
Furthermore, the experiment reported here suggests that high WMC (progressively) attenuates the deviation effect, whereas previous investigations indicate that similar capacity measures are unrelated to the magnitude of the changing-state effect (e.g., Beaman, 2004; Ellermeier & Zimmer, 1997; Macken et al., 2009; Sörqvist, 2010a). One interpretation, therefore, is that evidence of habituation concerns a type of auditory distraction that has to do with attention capture, orienting responses, and cognitive control. The absence of habituation, on the other hand, concerns a type of auditory distraction that has to do with involuntary processing of acoustic change and conflicting order processes.

Conclusion

The orienting response can be brought under top-down control. The experiment reported here shows that this control develops progressively as a function of exposure to the to-be-ignored stimuli and of the individual’s memory abilities. Individuals with high WMC seem quicker to adapt to their environment.